On the estimation of fractionally integrated processes
By Frank S. Nielsen
A dissertation submitted to
the Faculty of Social Sciences, University of Aarhus
in partial fulfilment of the requirements of
the PhD degree in
Economics and Management
To My Family Til Min Familie
Table of Contents
Preface
Summary
Danish Summary
Chapter 1: A vector autoregressive model for electricity prices subject to long memory and regime switching
Chapter 2: Local polynomial Whittle estimation covering non-stationary fractional processes
Chapter 3: Local polynomial Whittle estimation of perturbed fractional processes
Chapter 4: Long-run dependencies in return volatility and trading volume
Preface
This thesis was written in the period from February 2006 to January 2009 while I was a PhD student
at the School of Economics and Management, Aarhus University, and during my visit to Cornell
University, New York, USA. I am grateful to the school, to the Danish Social Sciences Research
Council (grant no. FSE275-05-0199) for generous financial support in connection with courses,
conferences, and the time abroad, and to the Center for Research in Econometric Analysis of Time
Series (CREATES), funded by the Danish National Research Foundation, for providing excellent
research facilities.
A number of people have contributed to the making of this thesis. First and foremost I thank my
advisor Niels Haldrup for invaluable encouragement, and for always being there with competent and
constructive comments and suggestions. I have benefited greatly from our discussions. In addition, I
am thankful to Morten Ørregaard Nielsen. Without his advice, comments, suggestions, and friendship
this PhD thesis wouldn't be the same.
From August 2007 to February 2008 I visited the Department of Economics, Cornell University, New
York, USA. I would like to thank the department for their hospitality. Additionally, thanks go to
Morten Ørregaard Nielsen for making the stay extremely pleasant, especially for the non-academic
discussions and all the hours in the gym, where he didn't beat me once on the treadmill.
At Aarhus University I would like to thank faculty and fellow students. Special thanks go to the
table football (foosball) team, i.e. Allan, Christian, Claus, Jonas, Niels, Rune, and Torben. I would
also like to send a special thanks to my office mate and good friend Niels Skipper, with whom I have
shared lots of academic and not so academic discussions and, most importantly, many visits to the
Friday Bar.
Finally, a very special thanks to my closest family for putting up with me through the years and
for their unconditional love.
Frank S. Nielsen, Aarhus, January 2009
Updated preface
The pre-defence took place on March 9, 2009 in Aarhus. I am grateful to the members of the assessment
committee, Allan Würtz, Tom Engsted (sitting in for Allan Würtz at the pre-defence), Javier Hualde,
and Jörg Breitung, for their careful reading of the thesis and their many useful and constructive
comments and suggestions. Some of the suggestions have been incorporated in the present version of
the thesis while others remain for future work on the chapters.
Frank S. Nielsen, Aarhus, March 2009
Summary
This thesis is concerned with time series modeling where the unifying theme is the treatment of long
memory and fractional integration.¹ The aim of the thesis is to further develop methods of inference
for long memory models. In particular, the focus is on estimation as well as applications of the derived
theory to economic data.
Empirical evidence of fractional integration, and more generally long memory, has been around
for a long time in various fields, such as astronomy, chemistry, agriculture, and geophysics, but it
was not until the seminal works of Granger (1980), Granger & Joyeux (1980), and Hosking (1981)
that long memory and fractional integration were introduced in economics. The past decades have
witnessed an increasing interest in fractionally integrated models as a convenient and parsimonious way
to capture the long memory properties of many time series. Long memory and fractionally integrated
processes are regarded as halfway representations between models where the correlations decay at an
exponential rate, i.e. short-memory models (e.g. autoregressive moving average models), and unit
root models, which exhibit no mean reversion. The autocorrelation of a long memory or fractionally
integrated process decays at a slower hyperbolic rate. This permits parsimonious representation of
series which exhibit non-zero autocorrelation at high lags. Empirically, there is now a broad range of
applications showing the relevance of long memory and fractional integration, for instance in finance
(e.g. Baillie, Bollerslev & Mikkelsen (1996), Breidt, Crato & de Lima (1998), Andersen, Bollerslev,
Diebold & Ebens (2001), Andersen, Bollerslev, Diebold & Labys (2001, 2003), and Bandi & Perron
(2004)), and in macroeconomics (e.g. Diebold & Rudebusch (1989, 1991), Sowell (1992), and
Gil-Alana & Robinson (1997)). See Robinson (1994), Baillie (1996), and Henry & Zaffaroni (2003) for
three excellent surveys.
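The contrast between exponential and hyperbolic decay can be made concrete with a small numerical sketch. It assumes the simplest fractionally integrated process, an ARFIMA(0, d, 0) with 0 < d < 1/2, whose autocorrelations satisfy the recursion rho(k) = rho(k-1) (k-1+d) / (k-d) with rho(0) = 1, and compares it with an AR(1) process. The function names and the parameter values d = 0.4 and phi = 0.9 are illustrative choices, not taken from the thesis.

```python
def arfima_acf(d, max_lag):
    """Autocorrelations rho(0), ..., rho(max_lag) of an ARFIMA(0, d, 0) process.

    Uses the recursion rho(k) = rho(k-1) * (k-1+d) / (k-d), valid for 0 < d < 1/2.
    """
    rho = [1.0]
    for k in range(1, max_lag + 1):
        rho.append(rho[-1] * (k - 1 + d) / (k - d))
    return rho


def ar1_acf(phi, max_lag):
    """Autocorrelations rho(k) = phi**k of a stationary AR(1) process."""
    return [phi ** k for k in range(max_lag + 1)]


if __name__ == "__main__":
    # Illustrative parameter values (assumptions, not from the thesis).
    frac = arfima_acf(0.4, 200)
    ar1 = ar1_acf(0.9, 200)
    for k in (1, 10, 50, 100, 200):
        print(f"lag {k:3d}: ARFIMA(0, 0.4, 0) = {frac[k]:.4f}   AR(1) = {ar1[k]:.2e}")
```

For large lags the fractional autocorrelations behave like a constant times k^(2d-1), so doubling the lag only scales them by about 2^(2d-1), whereas the AR(1) autocorrelations shrink geometrically.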
The thesis consists of four self-contained chapters, two single-authored and two written with
co-authors. In chapter 1, written jointly with Niels Haldrup and Morten Ørregaard Nielsen, we consider a
regime dependent vector autoregressive model, where we allow for state dependent fractional integration
as well as the possibility of state dependent fractional cointegration. The proposed model is relevant
in describing the price dynamics of electricity prices when the transmission of power is subject to
occasional congestion periods. For a system of bilateral prices, non-congestion means that electricity
prices are identical, whereas congestion makes prices diverge. Hence, the joint price dynamics implies
switching between essentially a univariate price process under non-congestion and a bivariate price
process under congestion. At the same time, it is an empirical regularity that electricity prices tend
to show a high degree of fractional integration, and thus that prices may be fractionally cointegrated.
We apply the model to an analysis of the area prices in Nord Pool. The analysis extends the approach
used in Haldrup & Nielsen (2006) as we model multiple price series jointly. Their analysis is limited
to individual price series, and the relative price series are analyzed separately as univariate models.
When the focus of analysis is the potential (fractional) cointegration amongst multiple series, a system
approach is more natural, but also more complex in the present context given the particular features
the model should allow for. Given our approach, we find that the behavior of electricity prices in
geographical price regions differs across states. The analysis shows that it is important to condition on
congestion/non-congestion, as non-switching models can generate misleading conclusions with regard
to the fractional integration orders and potential fractional cointegration. Three leading types of
misclassification of the model dynamics may arise. First, non-switching models may indicate that
the price series are fractionally cointegrated when, after conditioning on states, this is only the
case in the non-congestion state (which is cointegrated by definition). Second, the non-switching
model may indicate that there is no fractional cointegration when in fact there is cointegration in the
non-congestion state. Finally, there is the possibility of fractional cointegration in both regimes, but not
in the non-switching model. Conditioning on states is also important when looking at the adjustment
coefficients, as the non-switching models can lead to wrong conclusions about the convergence of
geographical price regions towards equilibrium.
¹ In the following we use the terms ‘long memory process’ and ‘fractionally integrated process’ synonymously,
although strictly speaking, a fractional process is just a particular form of a long memory process.
Chapter 2 is concerned with semiparametric estimation of the fractional integration parameter in
the empirically relevant scenario of short-run contamination of the process of interest and potential
non-stationarity. We propose an estimator which extends the local polynomial Whittle estimator
of Andrews & Sun (2004) to fractionally integrated processes covering both stationary and
non-stationary regions. We utilize the extended discrete Fourier transform and periodogram to extend
the local polynomial Whittle estimator to the non-stationary region. By approximating the short-run
component of the spectrum, say $\varphi(\lambda)$, by a polynomial $P_r(d, \lambda)$ (even and finite) instead of a
constant in a shrinking neighborhood of frequency zero, we alleviate some of the bias that we usually
see in the classical local Whittle setting. The bias reduction comes at a cost, as the variance is inflated
by a multiplicative constant. Additionally, given that the generating process is linear, the same central
limit theorem argument as in the stationary case $|d| < 1/2$ derived by Robinson (1995) holds, although
not for $d_0 = 1/2, 3/2, \ldots$. We establish consistency and asymptotic normality for $d_0 \in (-1/2, \infty)$.
Furthermore, if $\varphi(\lambda)$ is infinitely smooth near frequency zero, the rate of convergence can become
arbitrarily close to the parametric rate. The simulations reveal that our proposed estimator is superior
when considering possible short-run contamination and non-stationary values of $d$. Finally, an analysis
of credit spreads demonstrates the usefulness of the estimator. We cannot reject that the log yields
of Aaa, Baa, and Treasury bonds contain a unit root. However, the results are more mixed when
looking at spreads, depending on the semiparametric estimator and bandwidth choice. Therefore, as
in Ratta & Urga (2005), we can, for our given data, reject the reduced-form modeling of Das & Tufano
(1996), Jarrow, Lando & Turnbull (1997), and Duffie & Singleton (1999), which explicitly implies that
the data generating process of the risk-free process, and hence also of credit spreads, follows a
short-memory process.
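As a point of reference for the estimators discussed in this chapter, the following is a minimal sketch of the classical local Whittle estimator of Robinson (1995), which approximates the short-run component by a constant. It is not the local polynomial estimator proposed in the chapter and it does not use the extended discrete Fourier transform. The objective minimized is R(d) = log( (1/m) sum_j lambda_j^(2d) I(lambda_j) ) - (2d/m) sum_j log(lambda_j) over the first m Fourier frequencies. The function names, the grid bounds, and the bandwidth used in the demonstration are assumptions made for illustration.

```python
import cmath
import math


def periodogram(x, m):
    """Periodogram I(lambda_j) at the first m Fourier frequencies lambda_j = 2*pi*j/n."""
    n = len(x)
    freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    values = []
    for lam in freqs:
        dft = sum(x_t * cmath.exp(1j * lam * t) for t, x_t in enumerate(x))
        values.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return freqs, values


def local_whittle_objective(d, freqs, pgram):
    """R(d) = log(mean_j lambda_j^(2d) I_j) - 2d * mean_j log(lambda_j)."""
    m = len(freqs)
    g_hat = sum(lam ** (2.0 * d) * i_j for lam, i_j in zip(freqs, pgram)) / m
    return math.log(g_hat) - 2.0 * d * sum(math.log(lam) for lam in freqs) / m


def local_whittle_estimate(freqs, pgram, lower=-0.49, upper=0.99, step=0.001):
    """Grid-search minimizer of R(d); the grid bounds are illustrative assumptions."""
    n_grid = int(round((upper - lower) / step)) + 1
    grid = [lower + step * k for k in range(n_grid)]
    return min(grid, key=lambda d: local_whittle_objective(d, freqs, pgram))


if __name__ == "__main__":
    # Deterministic check: a periodogram that is an exact power law
    # lambda^(-2 * d0) should return d0 up to the grid resolution.
    d0, n, m = 0.3, 1024, 64
    freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    pgram = [lam ** (-2.0 * d0) for lam in freqs]
    print(local_whittle_estimate(freqs, pgram))
```

With real data one would call `periodogram(x, m)` first and pass its output to `local_whittle_estimate`. The choice of the bandwidth m trades bias from short-run dynamics against variance, which is the trade-off the polynomial approximation in this chapter is designed to improve.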
In chapter 3, co-authored with Per Frederiksen and Morten Ørregaard Nielsen, we propose a
semiparametric local polynomial Whittle with noise (LPWN) estimator of the memory parameter in long
memory time series perturbed by a noise term which may be serially correlated. The proposed
estimator allows both the spectrum of the perturbation and the spectrum of the short-memory component of
the signal, i.e. $\phi_w(\lambda)$ and $\phi_y(\lambda)$, to be approximated by polynomials $h_w(\theta_w, \lambda)$ and $h_y(\theta_y, \lambda)$ of
(finite and even) orders $2R_w$ and $2R_y$ near the zero frequency, instead of constants, thereby obtaining a
bias reduction depending on the smoothness of $\phi_w(\lambda)$ and $\phi_y(\lambda)$ near the origin. The approach taken
here in modeling the short-run dynamics by a polynomial was introduced by Andrews & Sun (2004) for
non-perturbed processes, but is novel in the context of perturbed fractional processes. Our results
show that introducing the polynomials $h_y(\theta_y, \lambda)$ and $h_w(\theta_w, \lambda)$ inflates the asymptotic variance of the
estimator of the long memory parameter, $d$, by a multiplicative constant which depends on the true
value of $d$. However, the inflation decreases when $d$ increases, and we obtain a reduction in the order of
magnitude of the bias if $\phi(\lambda)$ is sufficiently smooth near frequency zero. We show that the estimator
is consistent for $d \in (0, 1)$ and asymptotically normal for $d \in (0, 3/4)$, and that, if $\phi(\lambda)$ is infinitely smooth
near frequency zero, the rate of convergence can become arbitrarily close to the parametric rate. The
Monte Carlo study shows the usefulness of the proposed LPWN estimator. Compared to standard
estimators, such as the local Whittle with noise estimator of Hurvich & Ray (2003), the LPWN estimator
can produce considerable bias reductions in practice, especially in cases with short-run dynamics in
both the signal and noise components. We also include an empirical application to the 30 DJIA stocks,
where the LPWN estimator indicates stronger persistence in volatility than the standard estimators,
and for most of the stocks produces estimates of $d$ in the non-stationary region.
In chapter 4, we are interested in characterizing the long-run joint volatility-volume relationship in
the context of the Mixture of Distributions Hypothesis (MDH), put forward by Clark (1973), Epps &
Epps (1976), and Tauchen & Pitts (1983). The MDH asserts that returns and trading volume are jointly
dependent on the same underlying latent information arrival process. By using the log-periodogram
estimator of Geweke & Porter-Hudak (1983), Bollerslev & Jubinski (1999) show that volatility and
trading volume of the S&P100 common stocks have a similar degree of fractional integration. This
evidence of pairwise correspondence between estimates of long memory across the volatility-volume
series supports a long-run view of the MDH, i.e. both processes are driven by a slowly mean-reverting
fractionally integrated latent information process. Instead of using the log-periodogram estimator,
which is downward biased in the context of perturbed fractional processes, we use a semiparametric
estimator that is robust to time series perturbed by a noise term which may be serially correlated.
Our results show volatility and trading volume to be more persistent, in terms of memory, than
reported in other studies. We see this when introducing semiparametric estimators that are robust to
potential short-run contamination in both the signal and noise. Additionally, volatility displays a higher
degree of long memory than trading volume for the S&P100 common stocks, and shows evidence of being
governed by a perturbed fractional process, whereas this is not generally the case for trading volume.
Furthermore, we find weak evidence of there being a cointegrating relation between volatility and
trading volume. This is in line with other studies showing that although volatility and volume might
share a common fractional integration order, they do not move together over time.
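The downward bias of the log-periodogram estimator under additive noise can be illustrated with a deterministic sketch. The log-periodogram regression of Geweke & Porter-Hudak (1983) regresses log I(lambda_j) on -log(4 sin^2(lambda_j / 2)) and takes the slope as the estimate of d. In the sketch below the periodogram is replaced by a stylized spectral shape, (4 sin^2(lambda/2))^(-d) plus a flat noise floor, so there is no sampling variation. The function names, the noise level, and the bandwidth are assumptions made for illustration and are not taken from the chapter.

```python
import math


def gph_estimate(freqs, pgram):
    """OLS slope from regressing log I_j on -log(4 sin^2(lambda_j / 2))."""
    u = [-math.log(4.0 * math.sin(lam / 2.0) ** 2) for lam in freqs]
    y = [math.log(i_j) for i_j in pgram]
    u_bar = sum(u) / len(u)
    y_bar = sum(y) / len(y)
    sxy = sum((a - u_bar) * (b - y_bar) for a, b in zip(u, y))
    sxx = sum((a - u_bar) ** 2 for a in u)
    return sxy / sxx


def stylized_spectrum(freqs, d, noise_level):
    """Signal shape (4 sin^2(lambda/2))^(-d) plus a flat noise floor (illustrative)."""
    return [(4.0 * math.sin(lam / 2.0) ** 2) ** (-d) + noise_level for lam in freqs]


if __name__ == "__main__":
    # Illustrative values: sample size, bandwidth, memory parameter, noise level.
    n, m, d0 = 2048, 128, 0.4
    freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    clean = gph_estimate(freqs, stylized_spectrum(freqs, d0, 0.0))
    noisy = gph_estimate(freqs, stylized_spectrum(freqs, d0, 5.0))
    print(f"without noise: {clean:.3f}   with noise: {noisy:.3f}")
```

Without the noise floor the regression recovers d exactly. With it, the local slope of the log spectrum is below d at every frequency, so the fitted slope is pulled towards zero, which is the direction of bias that motivates the noise-robust estimators used in this chapter.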
Danish summary
This thesis deals with time series modeling, where the overarching theme is the treatment of long
memory and fractional integration. The aim of the thesis is to further develop methods of inference
for long memory models. In particular, the focus is on estimation and on the application of the
theory to economic data.
Empirical evidence of long memory processes has existed for many years in various fields, such as
astronomy, chemistry, agriculture, and geophysics, but it was not until the seminal works of Granger
(1980), Granger & Joyeux (1980), and Hosking (1981) that the concepts of long memory and fractional
integration were introduced in the economic literature. Recent decades have seen a growing interest
in fractionally integrated time series as a convenient and parsimonious way to capture the long
memory properties that many time series possess. Long memory and fractionally integrated processes
are regarded as halfway representations between models where the correlation between observations
over time decays at an exponential rate, i.e. short-memory models (e.g. autoregressive moving
average models), and the unit root case, which exhibits no reversion towards its mean. The
autocorrelation of a long memory or fractionally integrated process, by contrast, decays at a hyperbolic
rate. This permits a parsimonious and flexible representation of time series that exhibit non-zero
autocorrelation between observations that are far apart in time. Recent empirical research shows
the relevance of long memory and fractional integration. Examples include finance (e.g. Baillie et al.
(1996), Breidt et al. (1998), Andersen, Bollerslev, Diebold & Ebens (2001), Andersen, Bollerslev,
Diebold & Labys (2001, 2003), and Bandi & Perron (2004)) and macroeconomics (e.g. Diebold &
Rudebusch (1989, 1991), Sowell (1992), and Gil-Alana & Robinson (1997)). Three excellent survey
articles are Robinson (1994), Baillie (1996), and Henry & Zaffaroni (2003).
The thesis contains four independent articles, two single-authored and two written with co-authors.
In chapter 1, written together with Niels Haldrup and Morten Ørregaard Nielsen, we consider a
regime dependent vector autoregressive model in which we allow for regime dependent fractional
integration as well as the possibility of regime dependent fractional cointegration. The proposed
model is relevant for describing the price dynamics of electricity prices when the transmission of
power is subject to occasional capacity constraints. For a system of bilateral prices, the absence of
capacity constraints means that electricity prices are identical, whereas capacity constraints cause
the prices to deviate from each other. The joint price dynamics therefore implies switching between
a univariate price process when there is no capacity constraint and a bivariate price process when
there is one. At the same time, it is an empirical fact that electricity prices exhibit a high degree of
fractional integration, which opens up the possibility that prices are fractionally cointegrated. We
apply the model to examine electricity prices across the geographical regions of the Nord Pool
cooperation. The analysis extends the approach of Haldrup & Nielsen (2006) in that we model several
time series at the same time. Their analysis is limited in the sense that the individual price series
and the relative price series are analyzed separately as univariate models. When the focus is on
potential (fractional) cointegration among multiple price series, a system approach is more natural,
but also more complex given the features that the model must be able to handle. In general, we find
that the price dynamics of the different geographical regions differ across regimes. The analysis
shows that it is important to condition on whether or not there is a capacity constraint. Three
leading types of misclassification of the price dynamics can arise. First, models that do not allow for
switching between regimes may indicate that the price series are fractionally cointegrated, whereas
if one conditions on the regimes, this is only the case in the regime with no capacity constraint
(which is cointegrated by definition, since the two price series are identical). Second, a model without
regime switching may indicate that there is no fractional cointegration when this is in fact present
in the regime with no capacity constraint. Finally, there is the case where there is fractional
cointegration in both regimes but not in the model without regime switching. It is also important to
condition on whether or not there is a capacity constraint when looking at the adjustment
coefficients in a vector autoregressive model, since one can misinterpret the convergence of the
geographical price areas towards equilibrium if one focuses exclusively on the model without regime
switching.
Chapter 2 deals with semiparametric estimation of fractional integration in the empirically relevant
case where there are potentially short-run dynamics and non-stationarity. We propose an estimator
that extends the local polynomial Whittle estimator of Andrews & Sun (2004) to fractional processes
that may be non-stationary. We use the extended Fourier transform and periodogram to modify the
local polynomial Whittle estimator. By approximating the short-run component of the spectrum,
$\varphi(\lambda)$, by a polynomial $P_r(d, \lambda)$ (even and finite) instead of a constant near the zero frequency, we
reduce the bias that is present in the classical local Whittle estimator. The bias reduction comes at
the cost of inflating the variance by a multiplicative constant. Furthermore, given that the generating
process is linear, the central limit theorem argument derived by Robinson (1995) holds as in the
stationary case $|d| < 1/2$, although not for $d_0 = 1/2, 3/2, \ldots$. We establish consistency and
asymptotic normality for $d_0 \in (-1/2, \infty)$. Moreover, if $\varphi(\lambda)$ is infinitely smooth near the zero
frequency, the rate of convergence comes arbitrarily close to the parametric rate. Simulations show
how well our estimator performs in the presence of short-run dynamics and non-stationarity. Finally,
we apply the proposed estimator in an analysis of credit spreads. In general, we cannot rule out that
the logarithm of the yields on Aaa, Baa, and Treasury bonds contains a unit root. The results are,
however, rather mixed when we look at the spreads instead. This leads us, like Ratta & Urga (2005),
to conclude that, given our specific setup, we can reject the reduced-form modeling (Das & Tufano
(1996), Jarrow et al. (1997), and Duffie & Singleton (1999)), which explicitly implies that the data
generating process of the risk-free process, and thereby of the credit spreads, follows a short-memory
process.
In chapter 3, together with Per Frederiksen and Morten Ørregaard Nielsen, we propose a
semiparametric local polynomial Whittle with noise (LPWN) estimator of the memory parameter in long
memory time series to which noise has been added, where the noise may be serially correlated. The
proposed estimator allows both the spectrum of the added noise term and the spectrum of the signal,
i.e. $\phi_w(\lambda)$ and $\phi_y(\lambda)$, to be approximated by polynomials $h_w(\theta_w, \lambda)$ and $h_y(\theta_y, \lambda)$ of (even and
finite) orders $2R_w$ and $2R_y$ near the zero frequency, instead of constants, thereby obtaining a bias
reduction that depends on the smoothness of $\phi_w(\lambda)$ and $\phi_y(\lambda)$. This approach to modeling the
short-run dynamics is the same as that used by Andrews & Sun (2004) for fractional processes
without added noise, but it is new in the present context. Our results show that introducing the
polynomials inflates the asymptotic variance of the estimator of the long memory parameter, $d$, by a
multiplicative constant that depends on the true value of $d$. However, this inflation decreases as $d$
increases, and we obtain a reduction in the order of the bias term if $\phi(\lambda)$ is sufficiently smooth
near the zero frequency. We show that the estimator is consistent for $d \in (0, 1)$ and asymptotically
normal for $d \in (0, 3/4)$, and that, if $\phi(\lambda)$ is infinitely smooth near the zero frequency, the rate of
convergence comes arbitrarily close to the parametric rate. The Monte Carlo study shows the
usefulness of the proposed LPWN estimator. Compared to standard estimators, such as the local
Whittle with noise estimator of Hurvich & Ray (2003), the LPWN estimator can achieve considerable
bias reductions, especially in cases where there are short-run dynamics in both the signal and the
added noise term. We have also included an empirical study of the 30 DJIA stocks, where the LPWN
estimator indicates stronger persistence in volatility than the standard estimators do. For most of
the stocks, $d$ is estimated to be in the non-stationary region.
In chapter 4, we are interested in characterizing the long-run relationship between volatility and
volume in terms of the Mixture of Distributions Hypothesis (MDH), put forward by Clark (1973),
Epps & Epps (1976), and Tauchen & Pitts (1983). The MDH asserts that returns and trading volume
are governed by the same latent information arrival process. Using the log-periodogram estimator of
Geweke & Porter-Hudak (1983), Bollerslev & Jubinski (1999) showed that volatility and trading
volume for the S&P100 have a common degree of fractional integration. This evidence of pairwise
correspondence between long memory estimates across the volatility-volume series supports a
long-run interpretation of the MDH, i.e. that both processes are driven by a latent information
process that exhibits long memory. Instead of using the log-periodogram estimator, which is known
to be negatively biased for fractionally integrated processes with added noise, we use a
semiparametric estimator that is robust to time series with added noise that may be serially correlated.
Using these semiparametric estimators, there is evidence that volatility and volume are more
persistent than shown elsewhere in the literature. Furthermore, volatility exhibits a higher degree of
long memory than volume for the stocks in the S&P100. In addition, volatility shows signs of being
a fractionally integrated process with added noise, whereas this does not appear to be the case for
trading volume. Moreover, we find weak signs of a cointegrating relation between volatility and
trading volume. This is in line with other studies showing that although volatility and trading volume
exhibit the same degree of fractional integration, they do not move together over time.
References
Andersen, T. G., Bollerslev, T., Diebold, F. X. & Ebens, H. (2001), ‘The distribution of realized stock
return volatility’, Journal of Financial Economics 61, 43–76.
Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2001), ‘The distribution of realized
exchange rate volatility’, Journal of the American Statistical Association 96, 42–55.
Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2003), ‘Modelling and forecasting realized
volatility’, Econometrica 71, 579–625.
Andrews, D. W. K. & Sun, Y. (2004), ‘Adaptive local polynomial Whittle estimation of long-range
dependence’, Econometrica 72, 569–614.
Baillie, R. T. (1996), ‘Long memory processes and fractional integration in econometrics’, Journal of
Econometrics 73, 5–59.
Baillie, R. T., Bollerslev, T. & Mikkelsen, H. O. (1996), ‘Fractionally integrated generalized
autoregressive conditional heteroscedasticity’, Journal of Econometrics 74, 3–30.
Bandi, F. M. & Perron, B. (2004), ‘Long memory and the relation between implied and realized
volatility’, Manuscript, Université de Montréal.
Bollerslev, T. & Jubinski, D. (1999), ‘Equity trading volume and volatility: Latent information arrivals
and common long-run dependencies’, Journal of Business and Economic Statistics 17, 9–21.
Breidt, F. J., Crato, N. & de Lima, P. (1998), ‘The detection and estimation of long memory in
stochastic volatility’, Journal of Econometrics 83, 325–348.
Clark, P. K. (1973), ‘A subordinated stochastic process model with finite variance for speculative
prices’, Econometrica 41, 135–155.
Das, S. & Tufano, P. (1996), ‘Pricing credit-sensitive debt when interest rates, credit ratings and credit
spreads are stochastic’, Journal of Financial Engineering 5, 161–198.
Diebold, F. X. & Rudebusch, G. D. (1989), ‘Long memory and persistence in aggregate output’,
Journal of Monetary Economics 24, 189–209.
Diebold, F. X. & Rudebusch, G. D. (1991), ‘Is consumption too smooth?’, Review of Economics and
Statistics 73, 1–9.
Duffie, D. & Singleton, K. J. (1999), ‘Modeling term structures of defaultable bonds’, Review of
Financial Studies 12, 687–720.
Epps, T. & Epps, M. (1976), ‘The stochastic dependence in security price changes and transaction
volumes: implications for the mixture-of-distribution hypothesis’, Econometrica 44, 305–321.
Geweke, J. & Porter-Hudak, S. (1983), ‘The estimation and application of long-memory time series
models’, Journal of Time Series Analysis 4, 221–238.
Gil-Alana, L. A. & Robinson, P. M. (1997), ‘Testing of unit root and other non-stationary hypotheses
in macroeconomic time series’, Journal of Econometrics 80, 241–268.
Granger, C. W. J. (1980), ‘Long memory relationships and the aggregation of dynamic models’, Journal
of Econometrics 14, 227–238.
Granger, C. W. J. & Joyeux, R. (1980), ‘An introduction to long-memory time series models and
fractional differencing’, Journal of Time Series Analysis 1, 15–29.
Haldrup, N. & Nielsen, M. Ø. (2006), ‘A regime switching long memory model for electricity prices’,
Journal of Econometrics 135, 349–376.
Henry, M. & Zaffaroni, P. (2003), The long range paradigm for macroeconomics and finance, in
P. Doukhan, G. Oppenheim & M. S. Taqqu, eds, ‘Theory and Applications of Long-Range
Dependence’, Birkhäuser, Boston, pp. 417–438.
Hosking, J. R. M. (1981), ‘Fractional differencing’, Biometrika 68, 165–176.
Hurvich, C. M. & Ray, B. K. (2003), ‘The local Whittle estimator of long-memory stochastic volatility’,
Journal of Financial Econometrics 1, 445–470.
Jarrow, R. A., Lando, D. & Turnbull, S. M. (1997), ‘A Markov model for the term structure of credit
risk spreads’, Review of Financial Studies 10, 481–523.
Ratta, L. D. & Urga, G. (2005), Modelling credit spread: A fractional integration approach, Working
Paper CEA-07-2005, Centre for Econometric Analysis, Cass Business School.
Robinson, P. M. (1994), Time series with strong dependence, in C. Sims, ed., ‘Advances in
Econometrics’, Cambridge University Press, Cambridge, pp. 47–95.
Robinson, P. M. (1995), ‘Gaussian semiparametric estimation of long range dependence’, The Annals
of Statistics 23, 1630–1661.
Sowell, F. B. (1992), ‘Modeling long-run behavior with the fractional ARIMA model’, Journal of
Monetary Economics 29, 277–302.
Tauchen, G. E. & Pitts, M. (1983), ‘The price variability-volume relationship on speculative markets’,
Econometrica 51, 485–505.
Chapter 1
A vector autoregressive model for electricity prices subject to long memory and regime switching*
Niels Haldrup†
Aarhus University and CREATES
Frank S. Nielsen‡
Aarhus University and CREATES
Morten Ørregaard Nielsen§
Queen's University and CREATES
March 20, 2009
Abstract
A regime dependent VAR model is suggested that allows for long memory (fractional integration) in
each of the observed regime states as well as the possibility of fractional cointegration. The model
is relevant in describing the price dynamics of electricity prices where the transmission of power
is subject to occasional congestion periods. For a system of bilateral prices, non-congestion means
that electricity prices are identical, whereas congestion makes prices diverge. Hence, the joint price
dynamics implies switching between essentially a univariate price process under non-congestion
and a bivariate price process under congestion. At the same time, it is an empirical regularity that
electricity prices tend to show a high degree of fractional integration, and thus that prices may be
fractionally cointegrated. An empirical analysis using Nord Pool data shows that even though the
prices strongly co-move under non-congestion, the prices are not, in general, fractionally cointegrated
in the congestion state.
Keywords: Cointegration, electricity prices, fractional integration, long memory, Markov switching.
JEL Classification: C32.
* We are grateful to Javier Hualde for valuable suggestions and comments. Previous versions of this paper have been presented at the DGPE workshop in Sandbjerg, October 2006, the CREATES long memory symposium in Aarhus, July 2007, the ESRC econometrics study group meeting in Bristol, July 2007, the NBER-NSF time series conference in Iowa, September 2007, and seminars at the University of Aarhus. We are grateful to the participants for comments and suggestions. This work was partly done while F. S. Nielsen was visiting Cornell University; their hospitality is gratefully acknowledged. We are also grateful for financial support from the Danish Social Sciences Research Council (grant no. FSE 275-05-220) and from CREATES, funded by the Danish National Research Foundation.
† CREATES, School of Economics and Management, Aarhus University, Building 1322, DK-8000 Aarhus C, Denmark. Email: [email protected].
‡ CREATES, School of Economics and Management, Aarhus University, Building 1322, DK-8000 Aarhus C, Denmark. Email: [email protected].
§ Department of Economics, Dunning Hall Room 307, 94 University Avenue, Queen's University, Kingston, Ontario K7L 3N6, Canada. Email: [email protected]
1 Introduction
Over the past decade or so electricity markets have been strongly liberalized throughout the world.
In particular, the Nordic power market consisting of Norway, Sweden, Finland, and Denmark has
developed remarkably towards liberalization and the establishment of competitive market conditions,
and today this market serves as a model for the restructuring of other power markets. The Nordic
power market is characterized by a grid of physical exchanges of power across geographical regions
where the actual exchange is constrained by the flow capacity. Naturally, this has implications for
the way prices are formed: When there are no bilateral capacity restrictions then there is a free flow
of power, and prices will be identical. On the other hand, when there is congestion prices tend to
depart to meet the supply and demand conditions subject to restricted access to power from other
regions. In order to model electricity prices it is thus natural to consider regime dependent price
processes reflecting the presence or absence of flow congestion. This particular feature of the market
has been addressed in recent work by Haldrup & Nielsen (2006a, 2006b). Another important property
of electricity prices modeled in these works is the presence of long memory. Statistical tests strongly
reject that the price series are I(0) or I(1), whereas I(d) processes with fractional d (see Granger
(1980), Granger & Joyeux (1980) and Hosking (1981)) provide a good characterization of the data.
The combination of fractional integration and regime switching gives rise to some challenges. Ding
& Granger (1996), Diebold & Inoue (2001), and Granger & Hyung (2004) argue that under certain
conditions time series variables can spuriously have long memory when measured in terms of their
fractional order of integration, when in fact the series exhibit non-linear features such as regime
switching. In the model framework of Haldrup & Nielsen (2006a, 2006b) separate long memory price
dynamics is allowed in adjacent power regions depending upon whether the power exchange is subject
to congestion or non-congestion. The model is of the Markov switching type originally defined by
Hamilton (1989). However, because the defining property of e.g. a non-congestion state is that prices
are identical, the state variable is observable as opposed to being a latent variable. An important
feature of the model is that the price processes in the different regimes can have different degrees of
long memory, which gives rise to a number of interesting possibilities. For instance, consider the state
with non-congestion and assume that the associated bivariate prices are fractionally integrated of a
given order. It follows that prices are fractionally cointegrated in this case, i.e. extending the notion of
Granger (1981, 1986) and Engle & Granger (1987), in the sense that individual prices are fractionally
integrated but price differences are identically zero. Thus, an extreme form of cointegration occurs
in this situation because the prices are identical and hence are governed by exactly the same price
shocks. The price behavior in the congestion state can (and typically will) be very different. That is,
the bivariate prices can be fractionally cointegrated in a more conventional way or the prices can appear
not to cointegrate. Hence, the model can potentially exhibit state dependent fractional cointegration.
By not appropriately conditioning on the congestion state, i.e. when having a model with no regime
switching, the full sample estimates are likely to be a convex combination of the behavior in the
individual states and hence misleading inference is likely to result. In fact, this is one of the major
empirical findings in Haldrup & Nielsen (2006b).
The modeling approach used in Haldrup & Nielsen (2006b) is limited in the sense that the individual
price series and the relative price series are analyzed separately as univariate models. When the focus
of analysis is the potential (fractional) cointegration amongst multiple series a system approach is
more natural, but clearly also more complex in the present context given the particular features the
model should allow. In principle, the full set of price series should be modeled jointly, and, depending
upon the market conditions, should shrink to a limited number of price series reflecting periods with
non-congestion at some grid points.
We distinguish between price areas and geographical regions. Each geographical region corresponds
to a physical exchange (e.g., West Denmark, South Norway, etc.) and is therefore constant over time.
On the other hand, a price area is defined simply as an area with the same price and may therefore
clearly change over time. Thus, West Denmark and South Norway always constitute two geographical
regions, but in the case of non-congestion the same price prevails in both geographical regions and
they hence constitute just one price area.
In this paper we model multiple price series jointly in a vector autoregression (VAR), which allows
for fractionally integrated time series that potentially cointegrate in the congestion state. In the
non-congestion state, prices are identical by definition and hence a univariate model for the price process is
applied in this particular regime. Thus, our VAR model for fractionally cointegrated processes allows
for the possibility of regime switching, and in particular differs from other specifications offered in the
literature in the sense that our VAR model collapses to a pseudo-univariate model when a specific
state arises.
There are different reasons why the identification of separate price dynamics is important. The
operation of electricity markets is similar to the operation of financial markets with electricity power
derivatives being priced and traded in highly competitive markets and hence appropriate modeling
of both means and variances is crucial. Furthermore, the price dynamics is of interest with respect
to competition analysis of electricity markets where market delineation is a central issue, see e.g.
Sherman (1989) and Motta (2004). Even though most power markets are highly liberalized there
is still scope for regulating authorities to closely follow the market behavior, see also Fabra & Toro
(2002). Under non-congestion there is obviously a single price existing in the market and the relevant
geographical market consists of the regions with identical prices. However, when there is congestion
it is of interest to follow the price dynamics closely because suppliers can have a dominating position.
The geographical market delineation thus becomes less straightforward in this case. If the price
dynamics appears to be very different there is scope for further examination of the market conditions
by regulatory authorities.
In our empirical analysis we find that generally the behavior of electricity prices in geographical
price regions is different across states. The analysis shows that it is important to condition on
congestion/non-congestion as non-switching models can generate misleading conclusions with regard
to the fractional integration orders and potential fractional cointegration. Three leading types of
misclassification of the model dynamics may arise. First, non-switching models may indicate that the
price series are fractionally cointegrated, whereas when conditioning on states this is only the case
in the non-congestion state (which is cointegrated by definition). Secondly, the non-switching model
could indicate that there is no fractional cointegration when in fact there is cointegration in the non-
congestion state. Finally there is the possibility of fractional cointegration in both regimes, but not
in the non-switching model. Conditioning on states is also important when looking at the adjustment
coefficients, as the non-switching models can lead to wrong conclusions about the convergence of
geographical price regions towards equilibrium.
The remainder of the paper is structured as follows: We next offer a brief description of the
structure of the Nordic electricity market. Section 3 introduces the data and argues for the importance
of allowing for long memory, regime switching and seasonality when building a model to describe the
geographical region price processes. In section 4 the VAR modeling framework with long memory and
regime switching is presented. In section 5 the empirical results are discussed and section 6 concludes.
2 The operation of the Nordic power market
Within the Nordic countries (Denmark, Finland, Norway, and Sweden), major electricity reforms
were implemented during the 1990s. The deregulation process started in Norway in 1991, continued
in Sweden 1996, in Finland 1998, and was finally completed in Denmark in 2000. As part of the
liberalization the national electricity markets were opened up for cross-border trade by establishment
of a common power exchange, Nord Pool. Today all member countries of the Nordic power market
have adapted to the new competitive environment and the Nordic exchange serves as a model for the
restructuring of other power markets throughout the world.1
The per capita consumption of electricity is very high in Norway and Sweden, slightly lower in
Finland and at EU average in Denmark. The relatively high consumption level in the Nordic countries
is caused by a relatively electricity intensive industrial production, a cold climate, and extensive use of
electric heating in homes and offices, especially in Norway and Sweden. The sources of electricity power
production are rather mixed in the Nordic area as a whole. The major energy source is hydropower
supplying approximately 65% of total electricity in years with normal precipitation. On the national
level the power generation systems differ significantly and are generally dominated by one or two
technologies. In Norway the share of hydropower is close to 100%, in Sweden it is close to 50%,
in Finland around 15% and in Denmark 0%. With respect to nuclear power the share is 50% in
Sweden, 30% in Finland, and 0% in Denmark and Norway. Power generation from fossil fuels is of
major significance in Denmark and Finland, minor in Sweden, and close to non-existent in Norway.
In Denmark 15-20% of the power supply originates from wind power turbines.2
Because hydropower production is mainly found in the northern parts of the Nordic power web
and thermal power plants are located in the south, the relatively cheap hydropower generation is
transmitted to the heavily populated southern region which of course requires a well established
power grid transmission capacity to facilitate the flow. When the reservoir levels are adequate, the
less costly hydropower production causes low spot prices. In these cases national and cross-border
transmission systems will be used to their capacity in order to level out price discrepancies across
regions. On the other hand, when reservoir levels are low there will be a net flow from south to north,
and the market will see relatively high prices for thermally generated electricity.
1 For a detailed description of the Nordic power market, see NordPool (2003a) or Amundsen & Bergman (2007).
2 Increasing the relative production of electricity by renewable energy sources has considerable political focus in Denmark. According to official energy plans 50% of the Danish electricity production will come from wind power in 2030.
From an institutional point of view there is a common Nordic market for electricity; however, even
though key market institutions are common this does not mean that the Nordic electricity market is an
integrated market in the sense that "the law of one price" applies. The reason is that the transmission
of power is subject to possible capacity constraints. The Nordic electricity market constitutes a
number of distinct geographical regions different from the countries themselves and several price areas
may coexist. Whenever the relevant interconnector capacity is insufficient, the Nord Pool area is
divided into two or more price areas. The separate power regions consist of Sweden (SWE), Finland
(FIN), West Denmark (WDK), East Denmark (EDK), North Norway (NNO), Mid Norway (MNO),
and South Norway (SNO). Thus Denmark and Norway are each divided into multiple geographical
regions in Nord Pool.3 This division reflects the grid of physical exchanges of power and the bidding
areas with respect to the pricing of electricity as we shall explain shortly. Figure 1 displays the actual
electricity exchange points.
Figure 1 about here
The power spot market4 operated by Nord Pool Spot A/S is an exchange where market participants
trade power contracts for physical delivery the next day. This is referred to as a day-ahead market.
The spot market is based on an auction with bids for purchase and sale of power contracts of one
hour duration covering the 24 hours of the following day. At the deadline for the collection of all buy
and sell orders the information is gathered into aggregate supply and demand curves for each power
delivery hour. From these supply and demand curves the equilibrium spot prices - referred to as the
system prices - are calculated.5 Therefore, the system price is determined under the assumption that
no transmission constraint is binding, and thus in a situation where no grid congestions exist across
neighboring interconnectors there will be a single identical price across the areas with no congestions.
The actual trade is not necessarily carried out at the system price. When there is insufficient
transmission capacity in a sector of the grid, a grid congestion will arise and the market system will
establish different price areas across the geographical division of the Nord Pool area. The Nordic
market is then partitioned into separate bidding areas which therefore become separate price areas
when the contractual flow between bidding areas exceeds the capacity allocated by the transmission
system operators for spot contracts. Within each price area the buyers pay, and the generators are
paid, the corresponding area price. The difference between the area prices in two adjacent price
areas determines the congestion charge. Because separate prices may coexist depending upon regional
supply and demand conditions, the relevant market definition will vary with time. In practice, several
price area combinations will occur. Some hours there will only be a single price area (given by the
system price), other hours there will be two or more price areas.
3 For the purpose of analysis of the Norwegian regions, only the SNO link is considered in the present paper.
4 Since only the spot market will be relevant for the present study, only this market will be described here, see also NordPool (2003b). NordPool (2003c) describes the futures and forward markets of the Nordic power exchange which are used for price hedging and risk management.
5 The system price is the reference price in the financial power contracts like futures, forwards, and options traded at Nord Pool.
3 Data
The data used in this paper are (log transformed) hourly electricity spot prices for the Nord Pool area;
West Denmark (WDK), East Denmark (EDK), South Norway (SNO), Sweden (SWE) and Finland
(FIN).6 The data set is the same as that analyzed in Haldrup & Nielsen (2006a, 2006b) and covers the
period 3 January 2000 to 25 October 2003, including weekends and holidays. This yields a total of
33,404 observations. For EDK the sample period starts 1 October 2000 and thus covers 26,880 sample
points. The data series are displayed in Figure 2. Some stylized facts about the data are reported in
Haldrup & Nielsen (2006b).
Figure 2 about here
A pronounced characteristic of electricity markets is the occurrence of abrupt and generally unanticipated
extreme changes in spot electricity prices. These jumps or spikes generally occur within a very short
period of time, implying that the general level of the different series tends to be highly persistent, possibly
with mean reversion, see Escribano, Peña & Villaplana (2002), Haldrup & Nielsen (2006a, 2006b)
and Koopman, Ooms & Carnero (2007). In Haldrup & Nielsen (2006b) a range of tests document that
prices are neither I(0) nor I(1). Estimating the memory parameter for fractionally integrated, FI(d),
processes shows that the series generally exhibit long memory with d in the range 0.31-0.52 with the
SNO area being most persistent and in fact being nonstationary. The remaining areas have estimates
of d in the stationary region. It should be noted, however, that these estimates do not allow for regime
dependence.
Another important aspect of electricity prices is the very strong seasonal behaviour characterizing
the series. Seasonality is mainly driven from the demand side and appears as seasonal variation within
the day, within the week, and over the year. However, the supply side also contributes to seasonal
variation as electricity production is highly dependent upon weather conditions. In particular, the
seasonal variation in precipitation affects water reservoir levels in the generation of hydropower, and
seasonal variation in wind conditions also plays an increasing role due to the growing number of wind
turbines, especially in West Denmark.
Figure 3 about here
In Figure 3 scatter plots of log prices for adjacent Nord Pool areas are shown. When there
are no capacity constraints across neighboring regions the prices will be identical, whereas congestion
makes prices differ. Observations on the 45° line therefore represent non-congestion hours, whereas
observations off the 45° line represent congestion hours. It is especially this marked difference in
observations that motivates the present analysis.
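Because the regime can be read directly off the price data, the classification can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in the paper: the function name classify_regimes and the tolerance argument tol are hypothetical, and exact equality of the two log prices (tol = 0) is assumed to identify a non-congestion hour.

```python
def classify_regimes(log_p1, log_p2, tol=0.0):
    """Label each hour 'nc' (non-congestion) if the two log prices coincide, else 'c'.

    tol is a hypothetical tolerance for rounding in the data; 0.0 means exact equality.
    """
    if len(log_p1) != len(log_p2):
        raise ValueError("price series must have the same length")
    return ["nc" if abs(a - b) <= tol else "c" for a, b in zip(log_p1, log_p2)]


# Toy data: only the third hour lies off the 45-degree line.
states = classify_regimes([3.10, 3.25, 3.40, 3.40], [3.10, 3.25, 3.55, 3.40])
print(states)  # ['nc', 'nc', 'c', 'nc']
```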
6 Mid and North Norway are also member areas of Nord Pool, but are left out from the present analysis because these areas coincide with South Norway for most of the year.
4 Modeling of regime dependent long memory
4.1 A univariate model
We here briefly discuss the univariate model setup used in Haldrup & Nielsen (2006b). The main
features that the estimation model should allow include seasonality, long memory, and regime switching
of the type described above. Assume that individual electricity prices across adjacent regions are
fractionally integrated in the non-congestion state. This means that an extreme form of fractional
cointegration will exist in this state because the prices are identical across the two areas and thus
price differences will be identically zero. On the other hand, the behavior of the two individual price
series in the congestion state can be very different. If prices are compared without considering the
different regime possibilities it is unclear what to expect from the data. However, the mixing of the
two processes is likely to produce price series with a behavior that is a convex combination of the two
state processes.
Consider the following model specification, which we denote a regime switching multiplicative
RS-SARFIMA7 model:
\[
A_{s_t}(L) \left(1 - a_{s_t} L^{24}\right) \Delta^{d_{s_t}} \left(y_t - \mu_{s_t}\right) = \varepsilon_{s_t,t}, \qquad \varepsilon_{s_t,t} \sim \mathrm{nid}\left(0, \sigma^2_{s_t}\right). \tag{1}
\]
Here, $\Delta^{d_{s_t}}$ is the fractional difference operator, $A_{s_t}(L)$ is a lag polynomial, and $s_t \in \{c, nc\}$ denotes the
regime ($c$: congestion, $nc$: non-congestion), determined by a Markov chain with transition probabilities
\[
P = \begin{bmatrix} p_{11} & 1 - p_{11} \\ 1 - p_{22} & p_{22} \end{bmatrix}. \tag{2}
\]
Thus, for example, $p_{11}$ denotes the probability that a congestion state will follow a congestion state,
i.e. $\Pr(s_t = c \mid s_{t-1} = c)$. Note that because identical prices mean that we are in a non-congestion
state, all regimes are observable, which contrasts with the standard regime switching model of Hamilton
(1989) where the regimes follow a latent Markov process.
The (univariate) series yt may denote one of two individual log price series or the associated log
relative price. The series yt has been corrected for deterministic seasonality prior to the estimation
whilst allowing interaction with the two observable regimes, that is, the coefficients on the dummy
variables are allowed to differ across states. When $y_t$ denotes a log relative price, all parameters are
set to zero when $s_t = nc$, including $\sigma^2_{nc}$. Estimation of the above model is by conditional maximum
likelihood and is discussed in detail in Haldrup & Nielsen (2006b).
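To illustrate how the fractional difference operator in (1) acts on a series, the sketch below expands $(1 - L)^d$ into its binomial weights, $\pi_0 = 1$ and $\pi_j = \pi_{j-1}(j - 1 - d)/j$, and applies them to a finite sample. It assumes that pre-sample values are zero, a truncation that the text does not specify, and the function names are illustrative only.

```python
def frac_diff_weights(d, n):
    """Return the first n coefficients of the expansion of (1 - L)**d."""
    weights = [1.0]
    for j in range(1, n):
        # Recursion pi_j = pi_{j-1} * (j - 1 - d) / j
        weights.append(weights[-1] * (j - 1 - d) / j)
    return weights


def frac_diff(x, d):
    """Apply (1 - L)**d to the list x, treating pre-sample values as zero (an assumption)."""
    w = frac_diff_weights(d, len(x))
    return [sum(w[j] * x[t - j] for j in range(t + 1)) for t in range(len(x))]


print(frac_diff_weights(0.4, 4))        # slowly decaying weights for a long memory value of d
print(frac_diff([1.0, 3.0, 6.0], 1.0))  # [1.0, 2.0, 3.0]: d = 1 gives the ordinary first difference
```

The double loop costs on the order of n squared operations, so for hourly samples of the size used here a faster convolution would probably be preferred in practice.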
4.2 A bivariate model
A disadvantage of the model described above is that parameters are estimated separately when in
fact the price series to a large extent are governed by the same price shocks. We therefore consider
the following fractional error correction model specification for a bivariate regime switching vector
7 RS-SARFIMA: Regime Switching Seasonal Autoregressive Fractionally Integrated Moving Average.
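stochastic process subject to being in the congestion state:
\[
\begin{pmatrix} \Delta^{d_1} & 0 \\ 0 & \Delta^{d_2} \end{pmatrix}
\begin{pmatrix} p_{1t} \\ p_{2t} \end{pmatrix}
=
\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} \Delta^{\gamma} \left(p_{1,t-1} - p_{2,t-1}\right)
+ \sum_{i=1}^{k} \Phi_{c,i} \Delta^{\delta_{s_{t-i}}}
\begin{pmatrix} p_{1,t-i} \\ p_{2,t-i} \end{pmatrix}
+ \varepsilon_{c,t}, \tag{3}
\]
where $\varepsilon_{c,t} \sim N(0, \Omega)$ and
\[
\Phi_{c,i} = \begin{bmatrix} \phi^{c}_{11,i} & \phi^{c}_{12,i} \\ \phi^{c}_{21,i} & \phi^{c}_{22,i} \end{bmatrix},
\qquad
\Delta^{\delta_{s_{t-i}}} =
\begin{cases}
\mathrm{diag}\left(\Delta^{d_c}, \Delta^{d_c}\right) & \text{if } s_{t-i} = c, \\
\Delta^{d_{nc}} & \text{if } s_{t-i} = nc,
\end{cases}
\]
such that the lagged fractional differences reflect whether a particular observation is associated with a
congestion or non-congestion state. Thus, $d_{nc}$ is the common fractional integration order in the
non-congestion state, whereas $d_1$ and $d_2$ are the integration orders of the two price areas in the congestion
state (cointegration requires that the two price areas' integration orders be identical, i.e. $d_1 = d_2 = d_c$).
In the non-congestion state bilateral prices are identical, $p_{1t} = p_{2t} = p_t$, and hence the bivariate
setup collapses to a pseudo-univariate model, i.e.
\[
\Delta^{d_{nc}} p_t = \sum_{i=1}^{k} \Phi_{nc,i} \Delta^{\delta_{s_{t-i}}}
\begin{pmatrix} p_{1,t-i} \\ p_{2,t-i} \end{pmatrix}
+ \varepsilon_{nc,t}, \tag{4}
\]
where $\varepsilon_{nc,t} \sim N(0, \sigma^2)$ and
\[
\Phi_{nc,i} = \left(\phi^{nc}_{11,i}, \phi^{nc}_{12,i}\right).
\]
Essentially, the price process switches between being generated from (3) or (4) where switching takes
place in accordance with the transition probabilities (2).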
We limit our study to the bivariate setup and disregard potential spill-overs from the other areas.
From a theoretical point of view, it is conceptually easy to extend the present bivariate model to the
multivariate case, and thereby model spill-overs using more advanced dynamics. However, from a
computational point of view this appears infeasible as the number of regimes, and thereby the number
of parameters, grows very fast. Indeed, in a multivariate setup with $M$ geographical regions, there are
$2^{M-1}$ different regimes.
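The growth in the number of regimes can be checked by brute force under one simplifying assumption, which is ours and is not stated in the text: the M regions are connected in a chain, so that each of the M - 1 links is either congested or not. The helper name link_regimes is hypothetical.

```python
from itertools import product


def link_regimes(n_regions):
    """Enumerate congestion patterns over the n_regions - 1 links of a chain of regions.

    The chain layout is an assumption made for this illustration.
    """
    return list(product(("nc", "c"), repeat=n_regions - 1))


for m in (2, 3, 5):
    print(m, len(link_regimes(m)))  # 2 regions -> 2 regimes, 3 -> 4, 5 -> 16
```

With the five regions analyzed here this already gives 16 regimes, each with its own set of parameters, which illustrates why the analysis is kept bivariate.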
A number of remarks are in order. Consider first the non-congestion state. In this regime the two
price series are forced to be governed by the same process (4) and hence any conditional forecast for
this regime will remain identical for both price series. This feature is not captured in the univariate
model of Haldrup & Nielsen (2006b) and indeed requires our multivariate setup. Thus, in particular,
forecasts of each price series in the non-congestion state may appear different when based on (1),
whereas forecasts based on (4) will be identical for the two price series in the non-congestion state.
Note that in the non-congestion state the prices are fractionally integrated of order dnc and fractionally
cointegrated in the sense that the series perfectly co-move. This notion of (fractional) cointegration
is somewhat different from that originally suggested by Granger (1986) and Engle & Granger (1987).
Next, consider the congestion regime. We will discriminate between two situations, i.e. when $p_{1t}$
and $p_{2t}$ cointegrate or do not cointegrate. (i) Assume first the situation with fractional cointegration.
In this case it must hold that $d_1 = d_2 = d_c$, i.e. the price series have to be of the same order of
fractional integration. Notice that whilst the single price series are FI($d$), the log relative prices are
FI($\gamma$) where $\gamma < d$. At the same time we require that $(\alpha_1, \alpha_2)' \neq (0, 0)'$ with either $\alpha_1 < 0$ and/or
$\alpha_2 > 0$ such that the model is truly error correcting. (ii) When prices do not cointegrate in the
congestion regime nothing guarantees that $d_1 = d_2 = d$. Most importantly, there is no error correction
towards equilibrium in this case and the usual interpretation of the parameters $(\alpha_1, \alpha_2)'$ and $\gamma$ is
invalid.
The adjustment coefficients, $(\alpha_1, \alpha_2)'$, may give an indication of whether the specific price areas
adjust towards equilibrium, which we expect them to do under cointegration. Specifically, if $\alpha_1 > 0$
then $p_{1t}$ is moving away from equilibrium (non-congestion), whereas if $\alpha_2 > 0$ then $p_{2t}$ is moving
towards equilibrium. Note that the full stability of the model requires that the entire system dynamics
is included in the calculation, but in any case the values of $\alpha_1$ and $\alpha_2$ give a rough idea of the
system dynamics under a ceteris paribus assumption. An alternative interpretation of the adjustment
coefficients follows from the market setup and varying costs of electricity production in different
geographical regions. For example, if there is no congestion between SNO and WDK prices are
identical and electricity flows from the cheaper area (usually SNO because of the hydropower) to the
more expensive area (WDK). However, if there is congestion, prices in WDK will be higher reflecting
the higher costs of electricity production. This increase in price in WDK corresponds to $\alpha_1 > 0$ in
the WDK-SNO bivariate model, i.e. a move away from equilibrium. Importantly, this is not due to
system instability but rather to electricity being more expensive to produce in WDK compared to
SNO.
The model analyzed in this paper is unique in the literature on regime switching and/or (frac-
tionally) cointegrated models since it collapses to a pseudo-univariate model in one of the regimes.
The error correction model specification (3)-(4) reflects the particular structure and features of the
market design. For discussions of representation theory in the context of (non-switching) fractional
cointegration, see Granger (1986), Davidson (2002), Robinson & Yajima (2002), Davidson, Peel &
Byers (2005), and Johansen (2008).
4.3 Estimation
In our case congestion/non-congestion is an observed state such that regimes are observable, and the
maximum likelihood estimates of the transition probabilities are
\[
p_{11} = \frac{n_{c,c}}{n_{c,c} + n_{c,nc}}, \tag{5}
\]
\[
p_{22} = \frac{n_{nc,nc}}{n_{nc,c} + n_{nc,nc}}, \tag{6}
\]
where $n_{ij}$ is the number of times we observe regime $i$ followed by regime $j$ for $i, j \in \{c, nc\}$.
Estimation of the remaining parameters of the two states is done by (quasi) conditional maximum
likelihood. The regime-specific log-likelihood functions, omitting the constant, are
\[
l_c(d_c, \theta_c) = -\frac{\sum_t 1\{s_t = c\}}{2} \log|\Omega| - \frac{1}{2} \sum_t \mathrm{trace}\left(\Omega^{-1} \varepsilon_{s_t,t} 1\{s_t = c\} \varepsilon_{s_t,t}'\right),
\]
\[
l_{nc}(d_{nc}, \theta_{nc}) = -\frac{\sum_t 1\{s_t = nc\}}{2} \log \sigma^2 - \frac{1}{2} \sum_t \left(\sigma^{-2} \varepsilon_{s_t,t} 1\{s_t = nc\} \varepsilon_{s_t,t}'\right),
\]
where $1\{A\}$ is the indicator function of the event $A$. The full-sample log-likelihood function is given
by (assuming independence between $\varepsilon_{c,t}$ and $\varepsilon_{nc,t}$)
\[
l(d_c, d_{nc}, \theta) = -\frac{T}{2} \log(2\pi) + l_c(d_c, \theta_c) + l_{nc}(d_{nc}, \theta_{nc}). \tag{7}
\]
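As a rough sketch of how (7) could be evaluated once residuals have been computed and split by regime, consider the following Python functions. They are an illustration under stated assumptions rather than the estimation code used for the paper: the residuals are taken as given, the congestion covariance matrix is passed as a nested 2x2 list, and all function and argument names are hypothetical.

```python
import math


def loglik_congestion(resid, omega):
    """Congestion part l_c; resid is a list of (e1, e2) pairs, omega a 2x2 nested list."""
    (a, b), (c, d) = omega
    det = a * d - b * c
    if det <= 0:
        raise ValueError("omega must have a positive determinant")
    # trace(Omega^{-1} e e') equals the quadratic form e' Omega^{-1} e,
    # written out with the explicit inverse of a 2x2 matrix.
    quad = sum((d * e1 * e1 - (b + c) * e1 * e2 + a * e2 * e2) / det for e1, e2 in resid)
    return -0.5 * len(resid) * math.log(det) - 0.5 * quad


def loglik_non_congestion(resid, sigma2):
    """Non-congestion part l_nc; resid is a list of scalar residuals."""
    return -0.5 * len(resid) * math.log(sigma2) - 0.5 * sum(e * e for e in resid) / sigma2


def full_loglik(resid_c, omega, resid_nc, sigma2):
    """Full-sample log-likelihood as in (7), with T the total number of observations."""
    n_obs = len(resid_c) + len(resid_nc)
    return (-0.5 * n_obs * math.log(2.0 * math.pi)
            + loglik_congestion(resid_c, omega)
            + loglik_non_congestion(resid_nc, sigma2))


print(full_loglik([(1.0, 0.0)], [[1.0, 0.0], [0.0, 1.0]], [2.0], 4.0))
```

Splitting the residuals by regime beforehand plays the role of the indicator functions in the two regime-specific sums.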
When using a numerical optimization algorithm to maximize the log-likelihood function, care
must be taken in the selection of starting values. The reason for concern is that the log-likelihood
function is not globally concave and hence the results of the selected numerical optimization algorithm
may depend on the choice of starting values. In our case we have used the fractional integration
estimates from Haldrup & Nielsen (2006b) as our starting values. For the remaining parameters, i.e.
autoregressive and variance-covariance terms etc., we find starting values by letting the fractional
integration parameters be fixed at their initial values and maximizing the log-likelihood with respect
to the remaining parameters.
Finally, we remark that our model framework assumes that states are observable and that the
cointegrating vector in the congestion state, $\beta = (1, -1)$, is given. Therefore, asymptotic distribution
theory for the remaining parameters will be standard under suitable regularity conditions on the errors
$\varepsilon_{s_t,t}$, such as serial independence and moment conditions. In particular, Gaussianity of the errors is not
a necessary condition for the asymptotic distribution theory, but is used only to derive the likelihood
function.
5 Empirical results
Prior to estimation, each log price series had deterministic seasonality removed by regression on a
constant, a time trend, dummy variables for hour-of-day, day-of-week, month-of-year, and a holiday
dummy. The parameter estimates for the constant, trend, and dummy variables are allowed to differ
across states. For computational reasons we have chosen $k = 4$ to capture the within-the-day
effects and also include a 24th lag to capture the daily stochastic seasonality. The gain from
introducing more lags and/or e.g. a weekly lag instead of a daily lag was not significant enough in terms
of whiteness of the residuals to compensate for the considerable estimation time.
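The deterministic regressors described above can be assembled as in the following sketch. It is only one possible coding, chosen for illustration: dropping one baseline category per dummy group (hour 0, Monday, January) to avoid collinearity with the constant is our assumption, as are the function name seasonal_row and its arguments. Regime-specific coefficients are obtained by placing the regressors in separate blocks for congestion and non-congestion hours.

```python
from datetime import datetime


def seasonal_row(ts, trend, is_holiday, congested):
    """Build one row of deterministic regressors, interacted with the observed regime."""
    base = [1.0, float(trend)]                                        # constant and time trend
    base += [1.0 if ts.hour == h else 0.0 for h in range(1, 24)]      # hour-of-day dummies
    base += [1.0 if ts.weekday() == w else 0.0 for w in range(1, 7)]  # day-of-week dummies
    base += [1.0 if ts.month == m else 0.0 for m in range(2, 13)]     # month-of-year dummies
    base.append(1.0 if is_holiday else 0.0)                           # holiday dummy
    zeros = [0.0] * len(base)
    # First block holds the congestion coefficients, second block the non-congestion ones.
    return base + zeros if congested else zeros + base


row = seasonal_row(datetime(2000, 1, 3, 0), trend=0, is_holiday=False, congested=True)
print(len(row))  # 86 regressors: 43 per regime
```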
5.1 Estimation of transition dynamics
Since the states are observable, as discussed earlier, estimates of the transition probabilities for each
state are easily calculated and are reported in Table 1. It is clear that some grid points are more
subject to congestion than others. This fact may be explained by demand and supply fluctuations,
but there is also the possibility that congestion may be caused by exploitation of market power.
Table 1: Estimated transition probabilities (mean duration of states)

Link      From \ to        congestion        non-congestion
EDK-SWE   congestion       0.7848 (4.65)     0.2152
          non-congestion   0.0131            0.9869 (76.57)
WDK-SWE   congestion       0.8216 (5.60)     0.1784
          non-congestion   0.1259            0.8740 (7.94)
WDK-SNO   congestion       0.9247 (13.28)    0.0753
          non-congestion   0.1221            0.8779 (8.19)
SNO-SWE   congestion       0.9478 (19.16)    0.0523
          non-congestion   0.0462            0.9538 (21.64)
SWE-FIN   congestion       0.8505 (6.51)     0.1495
          non-congestion   0.0210            0.9790 (48.78)
Table 2: Estimates for the EDK-SWE link

                        No switching                  Switching: non-congestion     Switching: congestion
Model                   d_1       d_2       γ         d_1^nc    d_2^nc    γ^nc      d_1^c     d_2^c     γ^c
Univariate              0.43      0.43      0.05      0.46      0.46      0         0.03      0.03      -0.26
                        (0.012)   (0.012)   (0.018)   (0.012)   (0.011)             (0.013)   (0.012)   (0.077)
VAR estimates           0.45      0.49      0.21      0.32      =         0         0.09      0.10      0.04
                        (0.011)   (0.018)   (0.019)   (0.011)                       (0.021)   (0.038)   (0.04)
VAR estimates,          0.49      =         0.21      0.32      =         0         0.09      =         0.00
restricted d_1 = d_2    (0.009)             (0.047)   (0.013)                       (0.040)             (0.049)

Notes: Subscripts denote the geographical region and superscripts denote the state. Standard errors
are given in parentheses. An "=" entry marks a parameter restricted to equal the one in the column to its left.
The estimated transition probabilities indicate a high degree of persistence in the states. The
probability of staying in the congestion regime, $p_{11}$, is highest for the grid point SNO-SWE, 0.9478,
whereas it is lowest for the EDK-SWE link, 0.7848. This corresponds to a mean duration of 19.16 and
4.65 hours, respectively. In general, the probability of staying in the non-congestion regime, $p_{22}$, is
higher, estimated at 0.8740-0.9870, corresponding to mean duration of 7.94-76.57 hours.
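The counting estimators of the transition probabilities and the implied mean durations can be sketched as follows. The function names are hypothetical, and the duration formula 1/(1 - p) is the standard expected length of a spell with constant staying probability p; we assume this is how the durations in Table 1 were obtained, since it matches the reported figures up to rounding.

```python
def transition_estimates(states):
    """Return (p11, p22) from a list of observed states, each 'c' or 'nc'."""
    counts = {(i, j): 0 for i in ("c", "nc") for j in ("c", "nc")}
    for prev, cur in zip(states, states[1:]):
        counts[(prev, cur)] += 1
    # Share of congestion hours followed by congestion, and likewise for non-congestion.
    p11 = counts[("c", "c")] / (counts[("c", "c")] + counts[("c", "nc")])
    p22 = counts[("nc", "nc")] / (counts[("nc", "c")] + counts[("nc", "nc")])
    return p11, p22


def mean_duration(p_stay):
    """Expected number of consecutive hours in a state with staying probability p_stay."""
    return 1.0 / (1.0 - p_stay)


print(transition_estimates(["c", "c", "nc", "nc", "nc", "c"]))
print(round(mean_duration(0.7848), 2))  # 4.65, the EDK-SWE congestion entry in Table 1
```

The sketch assumes that both regimes are left at least once in the sample; otherwise a denominator is zero and the corresponding probability is not identified.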
5.2 Estimation of fractional integration and cointegration parameters
In Tables 2-6 we present the estimates of the fractional integration order $d$ for a number of different cases.
The models estimated under the heading "No switching" use pooled data, i.e. there is no separation
of data connected with congestion and non-congestion periods. The estimates of $d_1$, $d_2$ refer to the
fractional orders estimated for the first and second region, respectively, whereas the estimate $\gamma$ is
the fractional integration order of the log relative price. The results presented under the heading
"Switching" refer to similar estimates when data is partitioned into congestion and non-congestion
periods, where we use superscripts $c$ or $nc$ to denote estimates under the congestion and non-congestion
regimes, respectively. Note that by definition $\gamma^{nc} = 0$ in the non-congestion state because the single
Table 3: Estimates for the WDK-SWE link

                        No switching                  Switching: non-congestion     Switching: congestion
Model                   d_1       d_2       γ         d_1^nc    d_2^nc    γ^nc      d_1^c     d_2^c     γ^c
Univariate              0.31      0.42      0.27      0.38      0.33      0         0.28      0.46      0.37
                        (0.015)   (0.011)   (0.017)   (0.024)   (0.013)             (0.021)   (0.014)   (0.015)
VAR estimates           0.31      0.54      0.51      0.19      =         0         0.12      0.39      0.23
                        (0.010)   (0.020)   (0.025)   (0.021)                       (0.020)   (0.012)   (0.012)
VAR estimates,          0.56      =         0.53      0.25      =         0         0.33      =         0.11
restricted d_1 = d_2    (0.011)             (0.072)   (0.018)                       (0.033)             (0.029)

Notes: Subscripts denote the geographical region and superscripts denote the state. Standard errors
are given in parentheses. An "=" entry marks a parameter restricted to equal the one in the column to its left.
price series are identical and hence the series are fractionally cointegrated in an extreme form. Results
are reported using three di¤erent models. For comparison, �Univariate� reproduces the estimates
reported in Haldrup & Nielsen (2006b), i.e. this corresponds to estimates using the model (1) for
both the regime switching and non-regime switching cases. The row named �VAR estimates�displays
estimates based on the model (3)-(4). Note that, as opposed to the univariate estimates, dnc1 = dnc2by construction since the price series follow the same process in these cases. Finally, VAR estimates
are reported where we restrict d1 = d2 in the non-switching case and dc1 = dc2 in the congestion state
under the regime switching case.
Consider first the East Denmark-Sweden connection exhibited in Table 2, and consider initially
the pooled data set without regime switching. The estimates of $d$ for the two regions are rather similar
regardless of the underlying model being estimated, i.e. estimates are in the range 0.43 to 0.49 and
hence on the borderline of the stationary region. The estimates for the relative price are somewhat
lower: 0.05 when the univariate model is used for estimation and 0.21 when the VAR model is used. These
results indicate that when data is not classified according to regimes, there is evidence of fractional
cointegration among the series. Now, the question is whether this result is caused by the non-congestion
state dominating the sample or whether both regimes contribute to the cointegration finding. In the regime
switching case, the non-congestion estimates clearly indicate cointegration (as expected) with estimates
of $d$ in the range 0.32 to 0.46. In the congestion case, the memory parameters of the two price series
are similar but somewhat lower, i.e. 0.09 to 0.10. Also, there is indication of a weak form of fractional
cointegration in the congestion state since the relative price is FI(0.04). When we restrict $d_1 = d_2$
in the different scenarios, the conclusions are the same as without the restriction.
Next, we turn to the West Denmark-Sweden link in Table 3. For the model without regime
switching, both restricted and unrestricted parameter estimates using the VAR model indicate no
presence of fractional cointegration, which is similar to what is found in the univariate case. Under
regime switching there is clearly cointegration in the non-congestion state; however, for the model with
unrestricted integration orders there is no cointegration in the congestion state. The results from the
no switching models are thus some combination of their regime switching counterparts, and it is clear
that by not taking regime switching into account we falsely conclude that there is no sign of fractional
cointegration, whereas it is evident that fractional cointegration is present in the non-congestion state.
When we restrict $d_1 = d_2$ the VAR model in fact shows cointegration also in the congestion state.
The West Denmark-South Norway link with estimates in Table 4 is an interesting case where
there seems to be no fractional cointegration in the non-switching models. However, looking at the
VAR models where we condition on congestion/non-congestion we see that there is in fact fractional
cointegration in both states. That is, an extreme form in the non-congestion state by definition and
in the congestion state because $d_1^c \approx d_2^c$ (or $d_1^c = d_2^c$) and we have a reduction of
fractional order for the relative price series. In the univariate model there is no sign of fractional
cointegration.

Table 4: Estimates for the WDK-SNO link
                                                   Switching
                        No switching               Non-congestion             Congestion
Model                   d_1      d_2      rel.     d_1^nc   d_2^nc   rel.     d_1^c    d_2^c    rel.
Univariate              0.30     0.44     0.28     0.30     0.16     0        0.31     0.63     0.37
                        (0.015)  (0.011)  (0.016)  (0.026)  (0.008)           (0.017)  (0.017)  (0.015)
VAR estimates           0.30     0.57     0.91     0.43     =        0        0.20     0.22     -0.07
                        (0.009)  (0.018)  (0.019)  (0.016)                    (0.017)  (0.013)  (0.015)
VAR estimates,          0.92     =        1.13     0.37     =        0        0.34     =        0.10
restricted d_1 = d_2    (0.012)           (0.018)  (0.021)                    (0.047)           (0.032)
Notes: Subscripts denote the geographical region and superscripts denote the state. Standard errors
are given in parentheses. The columns headed "rel." give the fractional order of the log relative
price, and "=" marks an order equal to the one in the column to its left.

Table 5: Estimates for the SNO-SWE link
                                                   Switching
                        No switching               Non-congestion             Congestion
Model                   d_1      d_2      rel.     d_1^nc   d_2^nc   rel.     d_1^c    d_2^c    rel.
Univariate              0.45     0.41     0.31     0.38     0.41     0        0.32     0.21     0.39
                        (0.011)  (0.012)  (0.016)  (0.008)  (0.012)           (0.013)  (0.013)  (0.018)
VAR estimates           0.60     0.59     0.06     0.49     =        0        0.32     0.18     0.31
                        (0.016)  (0.017)  (0.010)  (0.018)                    (0.007)  (0.007)  (0.013)
VAR estimates,          0.46     =        0.21     0.39     =        0        0.28     =        0.25
restricted d_1 = d_2    (0.007)           (0.010)  (0.012)                    (0.007)           (0.015)
Notes: Subscripts denote the geographical region and superscripts denote the state. Standard errors
are given in parentheses. The columns headed "rel." give the fractional order of the log relative
price, and "=" marks an order equal to the one in the column to its left.
As seen from Table 5 the link between South Norway and Sweden indicates fractional cointegration
in the model without regime switching. However, when conditioning on states, it is seen that it is
only in the non-congestion state that cointegration takes place. In the congestion state the fractional
orders of the single price series and the relative price series are almost identical for all three models.
Finally, for the Sweden-Finland link in Table 6 there is some evidence of fractional cointegration
in the non-switching models. For the univariate model, the regime switching results do not make
much sense because the estimated order of the relative price in the congestion state exceeds
$\max\{d_1^c, d_2^c\}$. The two regime switching VAR models (with and without the restriction
$d_1 = d_2$) give identical results in the regime switching case. There is cointegration in the
non-congestion state whereas all series seem to be I(0) in the congestion state. Hence, the non-congestion
state seems to dominate the data when there is no conditioning on state.

Table 6: Estimates for the SWE-FIN link
                                                   Switching
                        No switching               Non-congestion             Congestion
Model                   d_1      d_2      rel.     d_1^nc   d_2^nc   rel.     d_1^c    d_2^c    rel.
Univariate              0.39     0.38     0.24     0.42     0.43     0        -0.02    -0.02    0.48
                        (0.012)  (0.012)  (0.017)  (0.011)  (0.012)           (0.012)  (0.005)  (0.022)
VAR estimates           0.52     0.60     0.34     0.31     =        0        0.02     0.02     0.01
                        (0.009)  (0.015)  (0.017)  (0.014)                    (0.010)  (0.009)  (0.012)
VAR estimates,          0.49     =        -0.07    0.31     =        0        0.02     =        0.01
restricted d_1 = d_2    (0.009)           (0.037)  (0.013)                    (0.011)           (0.013)
Notes: Subscripts denote the geographical region and superscripts denote the state. Standard errors
are given in parentheses. The columns headed "rel." give the fractional order of the log relative
price, and "=" marks an order equal to the one in the column to its left.

Table 7: Estimated adjustment coefficients
            No switching                                Switching
            d_1 ≠ d_2             d_1 = d_2             d_1 ≠ d_2             d_1 = d_2
Series      α_1        α_2        α_1        α_2        α_1        α_2        α_1        α_2
EDK-SWE     0.1775**   -0.3526**  -0.3495**  -0.0263    0.1592**   0.3130**   0.1987**   0.2587**
            (0.0169)   (0.0172)   (0.1054)   (0.0218)   (0.0331)   (0.0443)   (0.0397)   (0.0538)
WDK-SWE     0.0328     0.0576     -0.1239    -0.0647**  0.4798**   0.0284     0.0033     0.0059
            (0.0943)   (0.0678)   (0.1244)   (0.0321)   (0.0374)   (0.025)    (0.0891)   (0.0503)
WDK-SNO     0.9253**   -0.0220    0.0173     -0.0200    0.0429**   -0.0048    0.1247**   -0.0784**
            (0.0219)   (0.0243)   (0.0271)   (0.0212)   (0.0161)   (0.0143)   (0.0246)   (0.0171)
SNO-SWE     -0.0433    -0.0017    0.8682**   0.0376     -1.039**   0.3130**   -0.9871**  0.5623**
            (0.0263)   (0.0448)   (0.0522)   (0.0224)   (0.1231)   (0.1043)   (0.1126)   (0.1526)
SWE-FIN     0.3106**   0.9949**   -0.0154*   0.0862**   0.7152**   -0.2694**  0.6991**   -0.3479**
            (0.0527)   (0.0842)   (0.0090)   (0.0177)   (0.0210)   (0.0313)   (0.0304)   (0.0154)
Notes: Subscripts denote the geographical region. Numbers in bold face refer to situations with
indication of fractional cointegration based on the estimates of d_1, d_2, and the order of the relative
price reported in Tables 2-6. Standard errors are given in parentheses. One and two asterisks denote
significance at the 10% and 5% levels, respectively.
5.3 Estimation of adjustment coefficients
By modeling the data using the multivariate switching VAR model (3)-(4) we obtain estimates of
the adjustment coefficients in the congestion state, which is not possible when estimating univariate
models. The adjustment coefficients indicate (ceteris paribus) whether a specific geographical price
region is moving towards or away from equilibrium in response to a particular price gap. An alternative
interpretation of the adjustment coefficients follows from the market setup and varying costs
of electricity production in different geographical regions, i.e. if an inexpensive electricity supply
from another geographical region is suddenly stopped due to congestion, prices are expected to be
higher until non-congestion is restored, which may result in adjustment parameters indicating a move
away from equilibrium. Parameter interpretation is of course an issue here, because we force the
cointegrating vector to be $(1, -1)$, and the estimates of $\alpha_1$, $\alpha_2$, and the order of the
relative price do not have the usual interpretation in the congestion state if in fact there is no
cointegration present in that state (due to lack of identification). Therefore, if cointegration is not
present the interpretation of $(\alpha_1, \alpha_2)'$ should be made with caution.
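The sign reading used in the discussion of Table 7 can be summarized in a short Python sketch. It assumes that the equilibrium error is the first log price minus the second, in line with the cointegrating vector (1, -1), so that a positive price gap is closed when the first price falls (negative first coefficient) or the second price rises (positive second coefficient). The function name `adjustment_direction` is hypothetical, and the sketch ignores statistical significance, which the text also takes into account.

```python
def adjustment_direction(alpha_1: float, alpha_2: float) -> tuple:
    """Classify each region as moving 'towards' or 'away' from equilibrium
    when the cointegrating vector is fixed at (1, -1)."""

    def classify(coefficient: float, closing_sign: int) -> str:
        # A coefficient with the closing sign shrinks the price gap.
        if coefficient == 0:
            return "none"
        return "towards" if coefficient * closing_sign > 0 else "away"

    return classify(alpha_1, -1), classify(alpha_2, +1)


# EDK-SWE, switching model with unrestricted orders (values from Table 7).
print(adjustment_direction(0.1592, 0.3130))  # ('away', 'towards')
```

This matches the verbal reading of the EDK-SWE switching estimates, where East Denmark moves away from and Sweden towards equilibrium.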
In Table 7 the adjustment coefficients $(\alpha_1, \alpha_2)$ associated with the VAR models are reported,
both with restricted and unrestricted $d$ parameters and for the switching and non-switching cases.
Numbers in boldface font indicate situations where, based upon the estimates of $d_1$, $d_2$, and the
order of the relative price, some degree of fractional cointegration is likely to take place. In the
regime switching models, boldface indicates situations where there appears to be cointegration in the
congestion state.
Consider first the East Denmark-Sweden connection. When we do not condition on regime switching
and $d_1 \neq d_2$, neither East Denmark nor Sweden appears to correct towards equilibrium. On
the other hand, when $d_1 = d_2$ is enforced, East Denmark moves towards equilibrium whereas Sweden's
adjustment coefficient is insignificant. When we condition on regime switching, East Denmark
moves away from equilibrium, whereas Sweden now moves towards equilibrium. This would appear
to contradict error correction adjustment. However, there may be other reasons for these seemingly
contradictory results. For example, if there is no congestion between EDK and SWE, prices are identical
and electricity flows from the cheaper area (usually SWE because of the hydropower and nuclear
electricity production) to the more expensive area (usually EDK because the majority of electricity
production stems from thermal plants), see Table 1 in Haldrup & Nielsen (2006a). Therefore,
when congestion occurs prices in East Denmark will usually be higher, reflecting the higher marginal
cost of electricity production in East Denmark compared to Sweden. This increase in price in East
Denmark corresponds to $\alpha_1 > 0$ in the EDK-SWE bivariate model, i.e. a move away from equilibrium.
Importantly, this is not due to system instability but rather due to electricity being more expensive
to produce in East Denmark compared to Sweden.
Next, we look at the West Denmark-Sweden link. Only the case with $d_1 = d_2$ for the switching
model makes sense in this case, i.e. this is the only situation where some degree of cointegration was
found. However, since both adjustment parameters are small and insignificant, the power of the error
correction mechanism should be questioned in this case.
Looking at the West Denmark-South Norway connection we found no immediate sign of cointegration
in the non-switching model, see Table 4. When we condition on regimes there is cointegration, and
we see that both areas appear to move away from equilibrium. An explanation for this feature follows
again from the setup of the market and the marginal costs of electricity production. When congestion
occurs prices in West Denmark will be higher, reflecting the higher costs of electricity production. If
demand continues to increase in West Denmark during the congestion, more expensive generators will
be taken into use, thus increasing the marginal cost of production even further. This increase in price
in West Denmark corresponds to $\alpha_1 > 0$ in the WDK-SNO bivariate model, i.e. a move away from
equilibrium. Again, this is not due to system instability but rather due to electricity being more
expensive to produce in West Denmark compared to South Norway.
The South Norway-Sweden and Sweden-Finland cases are similar in the sense that no cointegration
was found in the congestion state. However, in the non-switching model fractional cointegration was
suggested by the data. Enforcing $d_1 = d_2$ seems to affect the adjustment mechanisms rather radically,
which we attribute to the lack of conditioning on states.
To sum up, appropriate modeling of the regime switching feature is seen to have a major impact on
the dynamic price adjustment mechanism. In addition to giving estimates of the adjustment process
specific to the particular state, conditioning on congestion/non-congestion allows interpretation of the
adjustment coefficients in terms of the prices in each geographic region under the congestion regime
and not necessarily in terms of the stability of the system.
6 Conclusion
In this paper we have proposed a multivariate extension of the univariate framework of Haldrup
& Nielsen (2006b). This extension enables us to describe the dynamic structure of congestion and
non-congestion of electricity prices within the Nord Pool area. The notions of congestion and non-
congestion are motivated by the organization of the Nord Pool market, which is characterized by
physical exchanges of power across geographical regions. When the actual transmission of electricity is
constrained by the flow capacity, congestion occurs. Therefore, the presence or absence of transmission
bottlenecks may have implications for the way prices are formed. Our multivariate modeling framework
allows us to explicitly take into account the fact that, in non-congestion periods, prices are the same
across geographical regions and are therefore also subject to the same price shocks. This, in particular,
is not possible in the univariate frameworks in previous studies.
From our empirical analysis it is clear that conditioning on states, i.e. congestion vs.
non-congestion, has a major impact on the implications for the dynamics of the electricity prices. That is,
when not conditioning on the specific states, misleading conclusions with regard to potential fractional
cointegration and the adjustment to equilibrium may be drawn.
There are three possible types of misclassification of the model dynamics in the empirical analysis.
That is, (1) non-switching models may indicate that the price series are fractionally cointegrated,
whereas when conditioning on states this is only the case in the non-congestion state (which is
cointegrated by definition); (2) the non-switching model could indicate that there is no fractional
cointegration when in fact there is cointegration in the non-congestion state; and (3) there is the
possibility of fractional cointegration in both regimes, but not in the non-switching model. A feature of
our model that is particular to its multivariate nature is that we are able to estimate adjustment
coefficients in the error correction representation. Again, it is important to condition on
congestion/non-congestion, since we may otherwise draw false conclusions about the adjustment to
equilibrium (non-congestion).
Some geographical regions are indirectly connected, e.g. West Denmark and East Denmark are
indirectly connected through Sweden, so there are regimes where West Denmark and East Denmark
constitute the same price area. The effects of these indirect links between geographical regions and
how they potentially affect fractional cointegration and the adjustment in the system are therefore of
major interest. A detailed analysis which includes indirect links is conceptually straightforward using
a higher-dimensional model, but computationally infeasible and therefore left for future research.
References
Amundsen, E. S. & Bergman, L. (2007), "Integration of multiple national markets for electricity: The
case of Norway and Sweden", Energy Policy 35, 3383-3394.
Davidson, J. (2002), "A model of fractional cointegration, and tests for cointegration using the
bootstrap", Journal of Econometrics 110, 187-212.
Davidson, J., Peel, D. & Byers, D. (2005), The long memory model of political support: some further
results, Working Papers 003057, Lancaster University Management School, Economics Department.
Diebold, F. X. & Inoue, A. (2001), "Long memory and regime switching", Journal of Econometrics
105, 131-159.
Ding, Z. & Granger, C. W. J. (1996), "Modeling volatility persistence of speculative returns: a new
approach", Journal of Econometrics 73, 185-215.
Engle, R. F. & Granger, C. W. J. (1987), "Modeling the persistence of conditional variances",
Econometric Reviews 5, 1-50.
Escribano, Á., Peña, J. I. & Villaplana, P. (2002), Modeling electricity prices: International evidence,
Economics Working Papers we022708, Universidad Carlos III, Departamento de Economía.
Fabra, N. & Toro, J. (2002), Price wars and collusion in the Spanish electricity market, Industrial
Organization 0212001, EconWPA.
Granger, C. W. J. (1980), "Long memory relationships and the aggregation of dynamic models", Journal
of Econometrics 14, 227-238.
Granger, C. W. J. (1981), "Some properties of time series data and their use in econometric model
specification", Journal of Econometrics 16, 121-130.
Granger, C. W. J. (1986), "Developments in the study of cointegrated economic variables", Oxford
Bulletin of Economics and Statistics 48, 213-228.
Granger, C. W. J. & Hyung, N. (2004), "Occasional structural breaks and long memory with an
application to the S&P 500 absolute stock returns", Journal of Empirical Finance 11, 399-421.
Granger, C. W. J. & Joyeux, R. (1980), "An introduction to long-memory time series models and
fractional differencing", Journal of Time Series Analysis 1, 15-29.
Haldrup, N. & Nielsen, M. Ø. (2006a), "Directional congestion and regime switching in a long memory
model for electricity prices", Studies in Nonlinear Dynamics & Econometrics 10, 1-24.
Haldrup, N. & Nielsen, M. Ø. (2006b), "A regime switching long memory model for electricity prices",
Journal of Econometrics 135, 349-376.
Hamilton, J. D. (1989), "A new approach to the economic analysis of nonstationary time series and
the business cycle", Econometrica 57, 357-84.
Hosking, J. R. M. (1981), "Fractional differencing", Biometrika 68, 165-176.
Johansen, S. (2008), "A representation theory for a class of vector autoregressive models for fractional
processes", Econometric Theory 24, 651-676.
Koopman, S. J., Ooms, M. & Carnero, M. A. (2007), "Periodic seasonal Reg-ARFIMA-GARCH models for
daily electricity spot prices", Journal of the American Statistical Association 102, 16-27.
Motta, M. (2004), Competition policy, theory and practice, Cambridge University Press, Cambridge.
NordPool (2003a), "Derivatives trade at Nord Pool's financial market", Working Paper,
www.nordpool.no .
NordPool (2003b), "The Nordic power market, electricity power exchange across national borders",
Working Paper, www.nordpool.no .
NordPool (2003c), "The Nordic spot market, the world's first international spot power exchange",
Working Paper, www.nordpool.no .
Robinson, P. M. & Yajima, Y. (2002), "Determination of cointegrating rank in fractional systems",
Journal of Econometrics 106, 217-241.
Sherman, P. (1989), The regulation of monopoly, Cambridge University Press, Cambridge.
7 Appendix: Figures
Figure 1: Map of the Nord Pool area.
[Figure 2: five time-series panels, one per region: East Denmark, West Denmark, South Norway, Sweden, Finland.]
Figure 2: Hourly log spot electricity prices for the Nord Pool area covering the period 3 January 2000
to 25 October 2003.
[Figure 3: five scatter-plot panels labelled East Denmark and Sweden, West Denmark and Sweden,
West Denmark and South Norway, South Norway and Sweden, Sweden and Finland.]
Figure 3: Scatter plots of hourly log prices across Nord Pool regions.
Chapter 2
Local polynomial Whittle estimation covering non-stationary fractional processes*
Frank S. Nielsen†
Aarhus University and CREATES
March 18, 2009
Abstract
This paper extends the local polynomial Whittle estimator of Andrews & Sun (2004) to fractionally
integrated processes covering both stationary and non-stationary regions. We utilize the
notion of the extended discrete Fourier transform and periodogram to extend the local polynomial
Whittle estimator to the non-stationary region. We further approximate the short-run component
of the spectrum by a polynomial instead of a constant in a shrinking neighborhood of zero, and
thereby alleviate some of the bias that the local Whittle estimator is prone to. This bias reduction
comes at a cost as the variance is inflated by a multiplicative constant. We show consistency and
asymptotic normality for $d \in (-1/2, \infty)$, and if the spectral density of the short-run component is
infinitely smooth near frequency zero, we obtain a rate of convergence arbitrarily close to the
parametric rate. A simulation study illustrates the performance of the proposed estimator compared
to the classical local Whittle estimator and the local polynomial Whittle estimator. The empirical
justification of the proposed estimator is shown through an analysis of credit spreads.
Keywords: Bias reduction, fractional integration, local polynomial, local Whittle estimation,
long memory.
JEL Classification: C22
*I am grateful to Niels Haldrup, Javier Hualde, and Morten Ørregaard Nielsen for valuable suggestions and comments.
This work was partly done while I was visiting Cornell University; their hospitality is gratefully acknowledged. I would
also like to thank Thomas Stephansen for proofreading the manuscript. I gratefully acknowledge financial support from the
Danish Social Sciences Research Council (grant no. FSE275-05-0199) and the Center for Research in Econometric Analysis
of Time Series (CREATES), funded by the Danish National Research Foundation.
†CREATES, School of Economics and Management, Aarhus University, Building 1322, DK-8000 Aarhus C, Denmark.
email: [email protected].
1 Introduction
We are interested in semiparametric frequency-domain estimation based on the local approximation

$$ f(\lambda) \sim \varphi(\lambda)\lambda^{-2d} \quad \text{as } \lambda \to 0^+, \tag{1} $$

where $\varphi(0) \in (0, \infty)$ and the symbol "$\sim$" means that the ratio of the left and right hand
sides tends to one in the limit. $\varphi(\lambda)$ is an even, positive, continuous function on
$[-\pi, \pi)$ which can be thought of as the spectral density of the short-memory component of the series of
interest. Semiparametric estimators have been popular for a long time as it is believed that the loss of
efficiency with respect to the parametric estimators entailed by the local specifications may be offset by
a possibly greater robustness. This robustness stems from avoiding the inconsistency in estimating the
long-run dynamics that may be caused by a misspecification of short-run dynamics.
Under stationarity and modeling $\varphi(\lambda)$ in (1) by a constant $G \in (0, \infty)$, a common
semiparametric estimator is the local Whittle (LW) estimator proposed by Künsch (1987). Robinson (1995a)
shows its consistency and asymptotic normality for $d \in (-1/2, 1/2)$. Velasco (1999a) extended Robinson's
(1995a) results to show that the estimator is consistent for $d \in (-1/2, 1)$ and asymptotically normally
distributed for $d \in (-1/2, 3/4)$, given that the fractional process is of Type I, see Marinucci &
Robinson (1999) and Robinson (2005). Phillips & Shimotsu (2004) show that the LW estimator is
consistent for $d \in (1/2, 1]$ and has a nonnormal limit distribution for $d \in (3/4, 1)$, and a mixed
normal limit distribution for $d = 1$. When $d > 1$ the LW estimator converges to unity in probability and
therefore is inconsistent, given that the fractional process is of Type II, Phillips & Shimotsu (2004).
This convergence in probability to unity when $d > 1$ also holds for log periodogram estimators as
shown in simulation studies by Hurvich & Ray (1995) and Velasco (1999b), and theoretically by Kim
& Phillips (2006). That is, in general the LW (or log periodogram) estimator is not a good general
purpose estimator when $d$ takes on values in the non-stationary region beyond 3/4. The asymptotic
theory is discontinuous at $d \in \{3/4, 1\}$ and the estimator is not consistent for $d > 1$. Several
methods are available to avoid the problems when entering the non-stationary region. A simple one is to
first difference the series before using the semiparametric estimator and then add one to the estimate.
This method runs into problems if the series of interest is trend stationary, Shimotsu & Phillips (2005)
and Shimotsu (2006). Tapering the data is another method often implemented and suggested, see Velasco
(1999a) and Hurvich & Chen (2000).
Shimotsu & Phillips (2005) introduce what they call an exact local Whittle estimator¹ which is
consistent and has the same $N(0, 1/4)$ limit distribution for all values of $d$ if the $I(d)$ series is
generated by a linear sequence and the range of the estimator is not wider than 9/2.² Instead of using
fractional differencing of the data, Abadir, Distaso & Giraitis (2007) use a different approach first
noted by Phillips (1999). They extend the discrete Fourier transform to the non-stationary case and use
this in whitening of the periodogram. Abadir et al. (2007) show that when the $I(d)$ series is generated
by a linear sequence the extended discrete Fourier transform and periodogram have the same asymptotic
behavior for $d \in (-3/2, \infty)$.

¹ Shimotsu (2006) extends this to a feasible exact local Whittle estimator when introducing an unknown
mean and trend.
² The assumption concerning the width of the admissible parameter space is needed to ensure that the
difference in the criterion function is uniformly bounded away from zero, see Shimotsu & Phillips (2005).

Our main interest in this paper is to analyze a general purpose estimator where the limiting
distribution holds in the non-stationary case and when there is short-run contamination. To achieve
bias reduction when there is contamination by short-run dynamics, we follow Andrews & Sun (2004)
and model the spectral density of the short-memory component $\varphi(\lambda)$ as a finite and even
polynomial instead of a constant near frequency zero. In extending the local polynomial Whittle (LPW)
estimator of Andrews & Sun (2004) to the non-stationary region, we use the notion of the extended discrete
Fourier transform and periodogram as in Abadir et al. (2007). We call the new estimator the extended
local polynomial Whittle (ExtLPW) estimator. In establishing consistency and asymptotic normality
for the estimator $\hat{d}$ we follow the method set out by Andrews & Sun (2004). Given that the generating
process is linear, the same central limit theorem argument as in the stationary case $|d| < 1/2$ derived by
Robinson (1995a) holds, although not for $d_0 = 1/2, 3/2, \ldots$. We establish consistency and asymptotic
normality for $d_0 \in (-1/2, \infty)$. Furthermore, if $\varphi(\lambda)$ is infinitely smooth near
frequency zero, the rate of convergence can become arbitrarily close to the parametric rate. The
simulations reveal that our proposed estimator is superior when considering possible short-run
contamination and non-stationary values of $d$. We also include an analysis of credit spreads that
demonstrates the usefulness of the estimator.
The remainder of the paper is structured as follows: Section 2 gives a short introduction to the
LPW estimator of Andrews & Sun (2004). Section 3 expands the usual stationary framework to the
non-stationary framework, thereby defining the ExtLPW estimator. Section 4 states the assumptions
needed for showing consistency and asymptotic normality. Section 5 introduces the theorem for
consistency and asymptotic normality. Section 6 presents the results from a small simulation study.
Section 7 provides an empirical investigation of potential long memory properties of treasury yields
and yields on corporate bonds, spreads over treasury and spreads between corporate yields. Section 8
concludes. Lemmas and the proofs of Theorem 1 and Lemmas 1-2 are placed in the Appendix.
2 The local polynomial Whittle approach
Define at the $j$th frequency $\lambda_j = 2\pi j/n$ for $1 \le j \le m$ the discrete Fourier transform
(DFT) and periodogram of $X_t$ as

$$ w(\lambda_j) = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} X_t \exp(it\lambda_j), \tag{2} $$

$$ I(\lambda_j) = |w(\lambda_j)|^2. \tag{3} $$
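The DFT and periodogram in (2)-(3) can be computed directly from their definitions. The following is a minimal Python sketch using only the standard library; the function names `dft` and `periodogram` are illustrative, and the direct summation is chosen for transparency rather than speed.

```python
import cmath
import math


def dft(x, j):
    """DFT of x_1, ..., x_n at the Fourier frequency 2*pi*j/n, as in (2)."""
    n = len(x)
    lam = 2.0 * math.pi * j / n
    total = sum(x_t * cmath.exp(1j * t * lam) for t, x_t in enumerate(x, start=1))
    return total / math.sqrt(2.0 * math.pi * n)


def periodogram(x, m):
    """Periodogram ordinates I(lambda_j) for j = 1, ..., m, as in (3)."""
    return [abs(dft(x, j)) ** 2 for j in range(1, m + 1)]


# A constant series has a numerically zero periodogram at j = 1, ..., m,
# so adding a constant to the data leaves these ordinates unchanged.
constant = [5.0] * 64
print(max(periodogram(constant, 8)))
```

Because the sum of $\exp(it\lambda_j)$ over $t = 1, \ldots, n$ is zero at every non-zero Fourier frequency, the ordinates used in the estimation do not depend on the level of the series.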
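Following Andrews & Sun (2004) the (negative) local polynomial Whittle log-likelihood is

$$ U_n(d, G, \theta) = \frac{1}{m} \sum_{j=1}^{m} \left[ \log\left( G \lambda_j^{-2d} \exp(-P_r(\lambda_j, \theta)) \right) + \frac{I(\lambda_j)}{G \lambda_j^{-2d} \exp(-P_r(\lambda_j, \theta))} \right], \tag{4} $$

where $P_r(\lambda_j, \theta) = \sum_{k=1}^{r} \theta_k \lambda_j^{2k}$, the closed interval of admissible
estimates is $D = [\Delta_1, \Delta_2] \subset [-1/2, 1/2]$, and $m = o(n)$ is the bandwidth choice, i.e.
the number of periodogram ordinates to be used in the estimation. Then concentrating $U_n(d, G, \theta)$
with respect to $G$ we can write the likelihood function as

$$ L_n(d, \theta) = \log \hat{G}(d, \theta) - \frac{1}{m} \sum_{j=1}^{m} P_r(\lambda_j, \theta) - \frac{2d}{m} \sum_{j=1}^{m} \log \lambda_j + 1, \tag{5} $$

$$ \hat{G}(d, \theta) = \frac{1}{m} \sum_{j=1}^{m} I(\lambda_j) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d}. \tag{6} $$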
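The concentration step can be spelled out as follows; this is a sketch that uses only the quantities defined in (4). Writing $g_j = \lambda_j^{-2d} \exp(-P_r(\lambda_j, \theta))$, the first-order condition for $G$ is

$$ \frac{\partial U_n}{\partial G} = \frac{1}{G} - \frac{1}{G^2} \, \frac{1}{m} \sum_{j=1}^{m} \frac{I(\lambda_j)}{g_j} = 0 \quad \Longrightarrow \quad \hat{G}(d, \theta) = \frac{1}{m} \sum_{j=1}^{m} \frac{I(\lambda_j)}{g_j}, $$

which is (6). Substituting $\hat{G}$ back into (4), the second term in the bracket averages to one, so that

$$ U_n(d, \hat{G}, \theta) = \log \hat{G}(d, \theta) + \frac{1}{m} \sum_{j=1}^{m} \log g_j + 1 = \log \hat{G}(d, \theta) - \frac{1}{m} \sum_{j=1}^{m} P_r(\lambda_j, \theta) - \frac{2d}{m} \sum_{j=1}^{m} \log \lambda_j + 1, $$

which is (5).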
Thus, Andrews & Sun (2004) propose to minimize (5) over the admissible set $(d, \theta) \in D \times \Theta$,

$$ (\hat{d}_{AS}, \hat{\theta}_{AS}) = \arg\min_{(d, \theta) \in D \times \Theta} L_n(d, \theta), \tag{7} $$

where $\Theta$ is a compact and convex set in $\mathbb{R}^r$. As shown by Andrews & Sun (2004) the
asymptotic variance of $\hat{d}_{AS}$ is inflated by a multiplicative constant.
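To make the objective concrete, the following Python sketch evaluates the concentrated criterion (5) with $\hat{G}$ from (6) and minimizes it over a grid of $d$ values for a fixed $\theta$. The grid search, the function names, the simulated white-noise input and the bandwidth of 64 ordinates are all assumptions made for illustration; they are not the optimization procedure of Andrews & Sun (2004), who minimize jointly over $(d, \theta)$.

```python
import cmath
import math
import random


def periodogram(x, m):
    """Periodogram ordinates at lambda_j = 2*pi*j/n for j = 1, ..., m."""
    n = len(x)
    out = []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        w = sum(v * cmath.exp(1j * t * lam) for t, v in enumerate(x, start=1))
        out.append(abs(w) ** 2 / (2.0 * math.pi * n))
    return out


def lpw_objective(d, theta, pgram, n):
    """Concentrated criterion L_n(d, theta) in (5), with G-hat from (6).

    theta holds (theta_1, ..., theta_r); an empty tuple gives r = 0,
    i.e. the local Whittle criterion.
    """
    m = len(pgram)
    lams = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    poly = [sum(th * lam ** (2 * (k + 1)) for k, th in enumerate(theta))
            for lam in lams]
    g_hat = sum(i_j * math.exp(p) * lam ** (2.0 * d)
                for i_j, p, lam in zip(pgram, poly, lams)) / m
    return (math.log(g_hat) - sum(poly) / m
            - 2.0 * d * sum(math.log(lam) for lam in lams) / m + 1.0)


def grid_estimate(pgram, n, theta=(), lo=-0.49, hi=0.49, steps=99):
    """Minimize the criterion over an equally spaced grid of d values."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda d: lpw_objective(d, theta, pgram, n))


# Illustration on simulated white noise, for which the true d is zero.
random.seed(0)
n = 1024
x = [random.gauss(0.0, 1.0) for _ in range(n)]
pgram = periodogram(x, 64)
d_hat = grid_estimate(pgram, n)
print(d_hat)
```

With $r = 0$ the polynomial term drops out and the criterion is the local Whittle objective, so the printed estimate should be close to zero for this input, up to sampling variation.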
It should be noted that it is not necessary to correct for an unknown mean of $\{X_t\}$ as we only
compute the DFT at the frequencies $\lambda_j = 2\pi j/n$ for $j = 1, \ldots, m$ where $m = o(n)$,
rendering the log-likelihood local to frequency zero. This general result only holds for stationary values
of $d$. Assuming an unknown mean of the generating process when we are in the non-stationary region is the
same as saying that the data generating process is free of linear trends, in the usual setup of e.g.
Robinson (1995a) and Andrews & Sun (2004).
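The difference between the objective functions defined in Robinson (1995a) and Andrews & Sun
(2004) is how $\varphi(\lambda)$ is approximated as $\lambda \to 0$: the latter approximate
$\log \varphi(\lambda)$ by $\log G - P_r(\lambda, \theta)$, where the polynomial term $P_r(\lambda, \theta)$
vanishes for $\lambda = 0$.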
Given Assumptions 1-4 in Andrews & Sun (2004) and utilizing their Lemma 1 and Lemma 2, the
estimates $(\hat{d}_{AS}, \hat{\theta}_{AS})$ are equal to the solution to the first-order conditions with
probability that goes to one as $n \to \infty$. This solution is consistent and asymptotically normal,
Andrews & Sun (2004, p. 572). The asymptotic bias is of order $O(\lambda^{\min\{s, 2+2r\}})$, where $s$ is
a measure of the smoothness of the spectral density near frequency zero, for the LPW estimator and
$O(\lambda^2)$ for the LW estimator. That the asymptotic bias of the LPW estimator is of order
$O(\lambda^{\min\{s, 2+2r\}})$ follows from Assumption 4, and it is clearly seen that if $r = 0$, and by
Assumption 2 that $s > 2r$, the asymptotic bias reduces to that of the classical LW estimator, i.e.
$O(\lambda^2)$. In this paper, we only consider long memory processes with potential short-run
contamination, but if $\{X_t\}$ is a perturbed fractional process, the orders will be smaller and dependent
on $d$. Nonetheless, the LPW estimator will still be consistent at the expense of a lower convergence rate
and higher asymptotic bias, see Arteche (2004) for the LW case. In the perturbed case, the asymptotic bias
of not modeling the spectral density appropriately will be of order $O(\lambda^{2d})$, Hurvich & Ray
(2003), Arteche (2004) and Hurvich, Moulines & Soulier (2005).
3 The extended local polynomial Whittle estimator
We define a fractionally integrated process as one that is stationary or exhibits some weak dependence after the application of the fractional filter $(1 - L)^d$. We often distinguish between two ways of expressing a fractional process as a function of weakly dependent innovations, i.e. Type I and Type II processes, see Marinucci & Robinson (1999). As we want to stay in the framework of Abadir et al. (2007) and Andrews & Sun (2004), we work in the setting of defining the fractional process as a Type I process. Because we are not only interested in the stationary region, it is not enough just to expand the filter $(1 - L)^d$ and express it as an infinite order moving average of the innovations, which results in a stationary process when $d < 1/2$. When we move into the non-stationary region, i.e. $d \geq 1/2$, this procedure breaks down because the infinite order moving average of the innovations does not converge. This is circumvented by modeling the process as the partial sum of the component $I(d - p)$ process for some $p \in \mathbb{Z}$ and expanding $(1 - L)^{p-d}$ in terms of the innovations. This results in a stationary integer-differenced series. The disadvantage is that it introduces discontinuities at $d = 1/2, 3/2, \ldots, p - 1/2$, where $p \in \mathbb{Z}$. The Type II process of fractional integration is designed to cover a wider range of $d$ and thereby circumvent some of the problems concerning the Type I process, see Robinson (1994), Phillips (1999), Tanaka (1999), Marinucci & Robinson (1999), and Robinson (2005). For the derivation of the ExtLPW estimator, we define the fractional process as a Type I process. More specifically, we define the $I(d)$ process as in Abadir & Taylor (1999).
Definition 1 For $d = p + d_u$, where $p \in \mathbb{Z}$ and $d_u \in (-1/2, 1/2)$, we say that $\{X_t\}$ is an $I(d)$ process, i.e. $X_t \sim I(d)$, if
$$(1 - L)^p X_t = u_t, \quad t = 1 - p, 2 - p, \ldots, \tag{8}$$
where $\{u_t\}$ is a second order stationary sequence with spectral density
$$f_u(\lambda) = G_0 |\lambda|^{-2d_u} + o\left(|\lambda|^{-2d_u}\right), \tag{9}$$
as $\lambda \to 0$, where $G_0 \in (0, \infty)$.
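Define the extended DFT and the extended periodogram of a time series $\{X_t\}$ evaluated at the Fourier frequencies $\lambda_j = 2\pi j / n$, where $j = 1, \ldots, n$, by
$$w(\lambda_j; d) = w_x(\lambda_j) + c(\lambda_j; d), \tag{10}$$
$$I(\lambda_j; d) = |w(\lambda_j; d)|^2, \tag{11}$$
where $w_x(\lambda_j)$ is the usual DFT defined as
$$w_x(\lambda_j) = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} X_t \exp(it\lambda_j), \tag{12}$$
and the correction term $c(\lambda_j; d)$ takes on constant values on the intervals $d \in D_p := [p - 1/2, p + 1/2)$, $p \in \mathbb{N}$, and is defined by
$$c(\lambda_j; d) = \begin{cases} 0 & \text{if } d \in D_0 = [-1/2, 1/2), \\ \exp(i\lambda_j) \sum_{\ell=1}^{p} (1 - \exp(i\lambda_j))^{-\ell} Z_\ell & \text{if } d \in D_p \text{ for } p = 1, 2, \ldots, \end{cases} \tag{13}$$
where
$$Z_0 = w_x(0) = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} X_t, \tag{14}$$
$$Z_\ell = \frac{1}{\sqrt{2\pi n}} \left\{ (1 - L)^{\ell-1} X_n - (1 - L)^{\ell-1} X_0 \right\}, \quad \ell = 1, 2, \ldots, p. \tag{15}$$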
In the computation of the step function $c(\lambda_j; d)$, we have to enumerate the data depending on what subspace of $D = [d_1, d_2]$ we are interested in. This is apparent from looking at (15), for example when $p = 2$. That is, we use $X_{-i+1}, X_{-i+2}, \ldots, X_n$, where $i = (0 \vee \lfloor d_2 - 1/2 \rfloor)$. The usual DFT, (12), is always computed using the enumeration $\{X_t\}_{t=1}^{n}$.
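The definitions (10)-(15) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the paper: the function names `diff`, `dft` and `extended_dft` and the convention that the input list holds $X_{1-p}, \ldots, X_n$ are assumptions made for the example. The integer $p$ is passed in directly instead of being derived from $d$.

```python
import cmath
import math


def diff(series):
    """First difference: [s_1 - s_0, s_2 - s_1, ...], one element shorter."""
    return [b - a for a, b in zip(series, series[1:])]


def dft(x, lam):
    """Usual DFT (12) of x = [X_1, ..., X_n] at frequency lam."""
    n = len(x)
    total = sum(xt * cmath.exp(1j * t * lam) for t, xt in enumerate(x, start=1))
    return total / math.sqrt(2 * math.pi * n)


def extended_dft(x_full, p, j):
    """Extended DFT (10) at the Fourier frequency lambda_j for d in D_p.

    Assumed layout: x_full = [X_{1-p}, ..., X_0, X_1, ..., X_n], len = n + p.
    Usable for j = 1, ..., n - 1; at j = n the factor 1 - exp(i*lambda_j) is zero.
    """
    n = len(x_full) - p
    lam = 2 * math.pi * j / n
    e = cmath.exp(1j * lam)
    scale = math.sqrt(2 * math.pi * n)
    level = list(x_full)  # holds Delta^(l-1) X_t, starting at t = l - p
    correction = 0j
    for ell in range(1, p + 1):
        # Z_l in (15): last entry is t = n, entry p - ell is t = 0
        z_ell = (level[-1] - level[p - ell]) / scale
        correction += (1 - e) ** (-ell) * z_ell
        level = diff(level)
    return dft(x_full[p:], lam) + e * correction
```

With `p = 0` the correction is zero and the function returns the usual DFT. For `p >= 1` the result can be compared numerically with $(1 - \exp(i\lambda_j))^{-p}$ times the DFT of the $p$-times differenced series, which is the algebraic relation the correction term is built to deliver.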
This notion of the extended DFT allows us to compute the usual LPW estimator for non-stationary values of $d$ by minimizing the criterion function defined in (5) over the admissible parameter space. The extension of the DFT to the non-stationary case is based on the work of Phillips (1999), Lahiri (2003), Dalla, Giraitis & Hidalgo (2006) and Abadir et al. (2007). Define the pseudo spectral density of the sequence $\{X_t\} \sim I(d_0)$, where $d_0 = p_0 + d_u$ and $d_u \in (-1/2, 1/2)$, as
$$f(\lambda) = |1 - \exp(i\lambda)|^{-2p_0} f_u(\lambda), \quad |\lambda| \leq \pi. \tag{16}$$
From this definition it is clear that
$$f(\lambda) \sim G_0 |\lambda|^{-2d_0} \quad \text{as } \lambda \to 0^+. \tag{17}$$
Then following Abadir et al. (2007, Lemma 4.4), Definition 1, and (10), the extended DFT has the property that
$$w(\lambda_j; d_0) = (1 - \exp(i\lambda_j))^{-p_0} w_u(\lambda_j), \quad j = 1, \ldots, n, \tag{18}$$
where $w_u(\lambda_j)$ is the DFT of the stationary sequence $\{u_t\}$. From Abadir et al. (2007, Lemma 4.4(i)), it follows that
$$w_x(\lambda_j) = (1 - \exp(i\lambda_j))^{-p_0} w_{\Delta^{p_0} x}(\lambda_j) - \exp(i\lambda_j) \sum_{r=1}^{p_0} (1 - \exp(i\lambda_j))^{-r} w_{\Delta^{r} x} \tag{19}$$
$$= (1 - \exp(i\lambda_j))^{-p_0} w_u(\lambda_j) - \exp(i\lambda_j) \sum_{r=1}^{p_0} (1 - \exp(i\lambda_j))^{-r} w_{\Delta^{r} x}, \tag{20}$$
where the second equality follows from Definition 1. Then the definition in (10) follows trivially.
Denote the rescaled extended DFT by
$$v_j = v(\lambda_j; d_0) = \frac{w(\lambda_j; d_0)}{\varphi(\lambda_j)^{1/2} \lambda_j^{-d_0}}, \quad 1 \leq j \leq m. \tag{21}$$
Given that the generating process is linear, equation (18) and Lemma 2 show that the asymptotic behavior of the rescaled extended DFT and periodogram is the same for all $d_0 \in (-1/2, \infty)$. Furthermore, given consistency, $\hat d \xrightarrow{p} d_0$, and the definition of the extended DFT, we get
$$w(\lambda_j; \hat d) \xrightarrow{p} w(\lambda_j; d_0). \tag{22}$$
This follows because $c(\lambda_j; d)$ is a step function and therefore constant on the intervals $d \in (p - 1/2, p + 1/2)$ for $p \in \mathbb{N}$. This considerably eases the estimation, as we are left with the same estimation procedure as in the stationary case.
If the process is stationary, the ExtLPW estimator is identical to the LPW estimator of Andrews & Sun (2004). Similarly to the estimators in Robinson (1995a), Andrews & Sun (2004), and Abadir et al. (2007), this estimator is based on the whitening principle of the periodogram. That is, as in the stationary case, the ExtLPW estimator is based on the behavior of the random variables
$$\eta_j = \eta(\lambda_j) = \frac{I_u(\lambda_j)}{f_u(\lambda_j)}, \quad 1 \leq j \leq m. \tag{23}$$
Then, given the spectral density (9) of the second order stationary sequence $\{u_t\}$, the first moment is given by
$$E[\eta_j] = 1 + o(1) + O(j^{-1} \log j) \quad \forall\, 1 \leq j \leq m \text{ as } n \to \infty. \tag{24}$$
Additionally, under regularity assumptions, see Lahiri (2003) and Abadir et al. (2007), the random variables also satisfy
$$\operatorname{var}(\eta_j) \leq C \quad \forall\, 1 \leq j \leq m, \tag{25}$$
where $C$ is a positive finite constant, and
$$\operatorname{cov}(\eta_j, \eta_s) \to 0 \quad \text{for } j, s \to \infty \text{ and } j \neq s. \tag{26}$$
These equations are proven in the proof of Lemma 4.6 in Abadir et al. (2007).
Then, given equations (24), (25) and (26), the random variables $\eta_j$ satisfy a weak law of large numbers (WLLN), i.e.
$$\frac{1}{m} \sum_{j=1}^{m} \eta_j \xrightarrow{p} 1, \quad \text{as } n \to \infty. \tag{27}$$
Given additional assumptions, this result is sufficient to ensure consistency of the estimator $\hat d$. See further discussions on this later. The WLLN for the random variables $\eta_j$ is equivalent to a WLLN for the random variables $|v_j|^2$, i.e.
$$\frac{1}{m} \sum_{j=1}^{m} |v_j|^2 \xrightarrow{p} 1, \quad \text{as } n \to \infty. \tag{28}$$
Then, given the nature of the spectral density (9) and (18),
$$|v_j|^2 = \eta_j (1 + o(1)) \quad \forall\, 1 \leq j \leq m \text{ as } n \to \infty. \tag{29}$$
Furthermore, given equation (24),
$$E\left[|v_j|^2\right] \leq C \quad \forall\, 1 \leq j \leq m. \tag{30}$$
For a more thorough walkthrough of the extended DFT, see Phillips (1999), Lahiri (2003), Dalla et al. (2006), and Abadir et al. (2007). We further note that the variables $v_j$ and $\eta_j$ are invariant with respect to the mean of $\{u_t\}$.
4 Assumptions
In this section, we introduce the assumptions needed to establish consistency and asymptotic normality of the proposed estimator.
Assumption 1 $D \times \Theta$ is a compact and convex subset of $\mathbb{R}^{r+1}$, and $d_0$ and $\theta_0$ lie in the interior of $D = [d_1, d_2] \subset [-1/2, \infty]$, where $d_0 \neq p_0 - 1/2$, $p_0 \in \mathbb{N}$, and $\Theta$, respectively.
Assumption 1 is a combination of similar assumptions given in Andrews & Sun (2004) and Abadir et al. (2007). Yet, our lower bound is more restrictive than in Abadir et al. (2007), because we would need to restrict $E[X_t] = 0$ to allow $d_1 = -3/2$. Therefore, we only consider invertible processes, i.e. $d > -1/2$. Furthermore, the assumption restricts the parameters of interest to be in the interior of a compact and convex set. If $d$ lies on the boundary of the parameter space, we conjecture that the estimator will be consistent (see footnote 3), but it may not be asymptotically normal. As noted by Newey & McFadden (1986, p. 2144), it is sufficient that the estimator is in the relative interior of the parameter space, allowing for equality restrictions to be imposed on the parameters of interest.
Assumption 2 The spectral density of the stationary sequence $\{u_t\}$ is
$$f_u(\lambda) = \varphi(\lambda) |\lambda|^{-2d_u} + o\left(|\lambda|^{-2d_u}\right) \quad \text{as } \lambda \to 0^+, \tag{31}$$
where $\varphi(\lambda)$ is continuous at $\lambda = 0$, $\varphi(0) \in (0, \infty)$, and $d_u \in (-1/2, 1/2)$.
Assumption 2 is a result of using the basic semiparametric setup from Definition 1.
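Assumption 3 Let $\varphi(\lambda)$ be smooth of order $s$ at $\lambda = 0$, where $s > 2r$ and $r \in \mathbb{N} \setminus \{0\}$, $s \geq 1$. That is, in a neighborhood of $\lambda = 0$, $\varphi(\lambda)$ is $\lfloor s \rfloor$ times continuously differentiable with $\lfloor s \rfloor$th derivative, $\varphi^{(\lfloor s \rfloor)}$, satisfying a Hölder condition of order $s - \lfloor s \rfloor$ at zero, i.e. $\left|\varphi^{(\lfloor s \rfloor)}(\lambda) - \varphi^{(\lfloor s \rfloor)}(0)\right| \leq C |\lambda|^{s - \lfloor s \rfloor}$ for a constant $C < \infty$.
The assumption imposes a regularity condition on the function $\varphi(\lambda)$ that characterizes the semiparametric setup, Andrews & Sun (2004), i.e. $\varphi(\lambda)$ has a Taylor expansion around $\lambda = 0$
$$\varphi(\lambda) = \sum_{k=0}^{\lfloor s/2 \rfloor} \theta_k \lambda^{2k} + O(\lambda^s) \tag{32}$$
$$= P_r(\lambda, \theta) + O(\lambda^s), \quad \text{as } \lambda \to 0^+, \tag{33}$$
where $\theta_0 = \varphi(0)$ and
$$\theta_k = -\frac{1}{(2k)!} \left. \frac{d^k}{d\lambda^k} \varphi(\lambda) \right|_{\lambda = 0}. \tag{34}$$
Assumption 3 holds for general ARFIMA processes for all finite $s$.
As noted by Andrews & Sun (2004), if $r = 0$ and Assumption 3 holds with $s = 2$, then Assumption A1' of Robinson (1995a) holds with $\beta = 2$.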
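Assumption 4 (a) $\{X_t\}$ is generated by the linear sequence $\{u_t\}$
$$u_t = A(L)\varepsilon_t = \sum_{j=0}^{\infty} a_j \varepsilon_{t-j}, \quad \sum_{j=0}^{\infty} a_j^2 < \infty, \tag{35}$$
where
$$E[\varepsilon_t \mid \mathcal{F}_{t-1}] = 0, \quad E[\varepsilon_t^2 \mid \mathcal{F}_{t-1}] = 1 \text{ a.s.}, \tag{36}$$
$$E[\varepsilon_t^3 \mid \mathcal{F}_{t-1}] = \mu_3 \text{ a.s.}, \tag{37}$$
$$E[\varepsilon_t^4 \mid \mathcal{F}_{t-1}] = \mu_4 \text{ a.s.} \quad \forall\, t = 0, \pm 1, \pm 2, \ldots, \tag{38}$$
and $\mathcal{F}_{t-1}$ is the $\sigma$-field generated by $\{\varepsilon_s : s < t\}$. (b) There exists a random variable $\varepsilon$ with $E\varepsilon^2 < \infty$ such that for all $\eta > 0$ and some generic constant $K > 0$, $\Pr(|\varepsilon_t| > \eta) < K \Pr(|\varepsilon| > \eta)$. (c) In some neighborhood $(0, \delta)$ of the origin $\alpha(\lambda)$ is differentiable and
$$\frac{d}{d\lambda} \alpha(\lambda) = O\left(|\alpha(\lambda)| / \lambda\right) \quad \text{as } \lambda \to 0^+, \tag{39}$$
where $\alpha(\lambda) = \sum_{j=0}^{\infty} a_j \exp(ij\lambda)$.
Footnote 3: See e.g. the proof of Hurvich et al. (2005) Theorem 3.1, and their discussion of bounding $d$ away from zero. It is not a trivial question, as we in some sense need $d$ to be bounded away from the boundary because the convergence of the log-likelihood is not uniform on $D \times \mathbb{R}^r$.
Assumption 4 says that $\{u_t\}$ is a linear sequence with martingale difference innovations. That is, $\{\varepsilon_t\}$ is adapted to the filtration $\{\mathcal{F}_t\}$. Furthermore, Assumption 4 does not rule out non-Gaussian processes. It should be possible to relax the linearity assumption, see the consideration regarding non-linearity of $\{u_t\}$ in Abadir et al. (2007).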
Assumption 5 $m^{2r+1/2} / n^{2r} \to \infty$ and $m^{\phi+1/2} / n^{\phi} \to 0$ as $n \to \infty$, where $\phi = \min\{s, 2 + 2r\}$.
Assumption 5 is the same as Assumption 4 in Andrews & Sun (2004). The assumptions are needed to show simultaneous consistency of $(\hat d, \hat\theta)$ and asymptotic normality. Note that the first condition imposes a lower bound on the growth of $m$, which ensures simultaneous consistency of $\hat d$ and $\hat\theta$ by ensuring that the scaling matrix used to normalize the score and Hessian satisfies a regularity condition that is necessary for consistency of $(\hat d, \hat\theta)$, which will be clarified later on. The second condition ensures that the normalized score converges in distribution to a zero mean Gaussian process, which is required to show asymptotic normality of the estimators $(\hat d, \hat\theta)$. Andrews & Sun (2004) instead work with $\lim_{n \to \infty} m^{\phi+1/2} / n^{\phi} = A \in (0, \infty)$, where $\phi = \min\{s, 2 + 2r\}$. They set the divergence rate of $m$ such that they can derive the asymptotic bias and asymptotic mean squared error of $\hat d$. We choose a bandwidth $m$ that diverges at a slower rate. Note that the two conditions never exclude each other, as $s > 2r$, which follows from Assumption 3.
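To see why the two rate conditions in Assumption 5 are compatible, it may help to work them out for a power-law bandwidth. The parameterization $m = n^{a}$ with $a \in (0, 1)$ is an assumption made for this illustration only (it mirrors the form $m = \lfloor n^{a} \rfloor$ used in the simulations); the symbol $\phi$ denotes $\min\{s, 2 + 2r\}$.

```latex
% Assumed for illustration: m = n^a with 0 < a < 1, and \phi = \min\{s, 2 + 2r\}.
\begin{align*}
\frac{m^{2r+1/2}}{n^{2r}} = n^{a(2r+1/2) - 2r} \to \infty
  &\iff a > \frac{2r}{2r + 1/2} = \frac{4r}{4r + 1}, \\
\frac{m^{\phi+1/2}}{n^{\phi}} = n^{a(\phi+1/2) - \phi} \to 0
  &\iff a < \frac{\phi}{\phi + 1/2} = \frac{2\phi}{2\phi + 1}.
\end{align*}
% The map x -> 2x / (2x + 1) is strictly increasing for x >= 0, and
% \phi = \min\{s, 2 + 2r\} > 2r because s > 2r. Hence
\[
\frac{4r}{4r + 1} < \frac{2\phi}{2\phi + 1},
\]
% so the admissible interval for a is non-empty.
% Example: r = 1 and s \geq 4 give \phi = 4 and a \in (4/5, 8/9).
```

Under this parameterization the admissible exponents form an open interval whose lower end rises with $r$, so a higher polynomial order requires a faster-growing bandwidth.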
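Assumption 6 For $m = o(n)$ the renormalized periodogram, $\eta_{0j}$ $\forall\, 1 \leq j \leq m$, satisfies a WLLN
$$\frac{1}{m} \sum_{j=1}^{m} \eta_{0j} \xrightarrow{p} 1, \quad \text{as } m, n \to \infty, \tag{40}$$
where $\eta_{0j} = \dfrac{I_u(\lambda_j)}{\varphi(\lambda_j) \lambda_j^{-2d_u}}$ $\forall\, 1 \leq j \leq m$.
Assumption 6 is equivalent to Assumption B in Abadir et al. (2007) and states that if Assumptions 2 and 3 and equation (18) hold, then
$$\eta_{0j} = \eta_j (1 + o(1)) \quad \forall\, 1 \leq j \leq m \text{ as } n \to \infty. \tag{41}$$
Furthermore, (24) implies that
$$E[\eta_{0j}] \leq C \quad \forall\, 1 \leq j \leq m \text{ as } n \to \infty. \tag{42}$$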
5 Consistency and asymptotic normality
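Theorem 1 states consistency and asymptotic normality of the proposed ExtLPW estimator.
Theorem 1 Let $\{X_t\}$ be generated by (8) and assume that Assumptions 1 through 6 hold. Then, as $n \to \infty$, $\hat d$ and $\hat\theta$ are both consistent and
$$B_n \begin{pmatrix} \hat d - d_0 \\ \hat\theta - \theta_0 \end{pmatrix} \xrightarrow{d} N\left(0, \Omega_r^{-1}\right), \tag{43}$$
where $B_n$ is the $(r + 1) \times (r + 1)$ diagonal matrix with $j$th diagonal element defined as
$$[B_n]_{11} = m^{1/2}, \tag{44}$$
$$[B_n]_{jj} = \left(\frac{2\pi m}{n}\right)^{2j-2} m^{1/2} \quad \text{for } j = 2, 3, \ldots, r + 1, \tag{45}$$
and $\Omega_r$ is the $(r + 1) \times (r + 1)$ covariance matrix defined as
$$\Omega_r = \begin{pmatrix} 4 & 2\mu_r' \\ 2\mu_r & \Gamma_r \end{pmatrix}. \tag{46}$$
Here $\mu_r$ is an $r \times 1$ vector with $k$th element $\mu_{r,k}$ and $\Gamma_r$ is an $r \times r$ matrix with $(i, k)$th element $[\Gamma_r]_{i,k}$,
$$\mu_{r,k} = \frac{2k}{(2k + 1)^2} \quad \text{for } k = 1, \ldots, r, \tag{47}$$
$$[\Gamma_r]_{i,k} = \frac{4ik}{(2i + 2k + 1)(2i + 1)(2k + 1)} \quad \text{for } i, k = 1, \ldots, r. \tag{48}$$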
We could have shown $\log^5 m$-consistency (irrespective of $\theta \in \Theta$) using Robinson's (1995a, pp. 1642-1643) proof of the $\log^3 m$-consistency of $\hat d$ ($r = 0$) and adjusting it to account for our weaker Assumption 5 compared to his Assumption 4'. But as this would mainly be a theoretical addition, it is left out. Theorem 1 utilizes the first-order condition (F.O.C.) approach (because of the multidimensionality of the parameter space) as in Andrews & Sun (2004). Therefore, Theorem 1 jointly delivers consistency and asymptotic normality of $(\hat d, \hat\theta)$. More specifically, since the ExtLPW likelihood ((5), where the periodogram is defined by its extended DFT, i.e. (18)) is a continuous function on a compact set, the ExtLPW estimator exists. From Lemma 1 (in the Appendix) and Lemma 1 of Andrews & Sun (2004) we know that there exists a solution to the first order conditions with probability tending to one, and that the solution satisfies the convergence result in Theorem 1, see also Lemmas 1 and 2 of Andrews & Sun (2004). If the (negative) likelihood function is strictly convex and twice differentiable, then the solution to the first order conditions is unique and minimizes the (negative) log-likelihood and hence equals the ExtLPW estimator.
By the formula for a partitioned inverse, Theorem 1 implies that
$$\Omega_r^{-1} = \begin{pmatrix} \left(4 - 2\mu_r' \Gamma_r^{-1} 2\mu_r\right)^{-1} & -4^{-1}\, 2\mu_r' \left(\Gamma_r - 2\mu_r 4^{-1} 2\mu_r'\right)^{-1} \\ -\Gamma_r^{-1} 2\mu_r \left(4 - 2\mu_r' \Gamma_r^{-1} 2\mu_r\right)^{-1} & \left(\Gamma_r - 2\mu_r 4^{-1} 2\mu_r'\right)^{-1} \end{pmatrix} \tag{49}$$
$$= \begin{pmatrix} c_r / 4 & -\frac{c_r}{2} \mu_r' \Gamma_r^{-1} \\ -\frac{c_r}{2} \Gamma_r^{-1} \mu_r & \Gamma_r^{-1} + c_r \Gamma_r^{-1} \mu_r \mu_r' \Gamma_r^{-1} \end{pmatrix}, \tag{50}$$
where $c_r = \left(1 - \mu_r' \Gamma_r^{-1} \mu_r\right)^{-1}$ for $r > 0$ and $c_0 = 1$.
A few remarks are in order. First of all, the asymptotic variance of $\sqrt{m}(\hat d - d_0)$ is free of nuisance parameters and equal to $c_r / 4$. Secondly, in light of Assumption 5, the ExtLPW estimator for $r > 0$ allows one to choose a bandwidth $m$ much larger than in the classical LW approach, resulting in an estimator that is asymptotically normal with a faster rate of convergence, as a function of the sample size $n$. The cost of introducing a polynomial is inflation of the asymptotic variance by a multiplicative constant, i.e. $c_0 = 1$, $c_1 = 9/4$, $c_2 = 3.52, \ldots$, see Andrews & Sun (2004).
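The variance inflation constants can be checked directly from the definitions of $\mu_r$, $\Gamma_r$ and $\Omega_r$ in (46)-(48). The following is a small sketch in exact rational arithmetic; the function names (`mu`, `gamma`, `omega`, `inverse`, `c`) are chosen for this example and are not taken from the paper. It computes $c_r$ as four times the top-left element of $\Omega_r^{-1}$, which by (50) equals $c_r / 4$.

```python
from fractions import Fraction


def mu(r):
    """Vector mu_r with k-th element 2k / (2k + 1)^2, equation (47)."""
    return [Fraction(2 * k, (2 * k + 1) ** 2) for k in range(1, r + 1)]


def gamma(r):
    """Matrix Gamma_r with (i, k)-th element from equation (48)."""
    return [[Fraction(4 * i * k, (2 * i + 2 * k + 1) * (2 * i + 1) * (2 * k + 1))
             for k in range(1, r + 1)] for i in range(1, r + 1)]


def omega(r):
    """Matrix Omega_r = [[4, 2 mu'], [2 mu, Gamma]], equation (46)."""
    m, g = mu(r), gamma(r)
    top = [Fraction(4)] + [2 * x for x in m]
    return [top] + [[2 * m[i]] + g[i] for i in range(r)]


def inverse(a):
    """Gauss-Jordan inverse of a square matrix of Fractions (exact)."""
    n = len(a)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                f = aug[i][col]
                aug[i] = [vi - f * vc for vi, vc in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def c(r):
    """Inflation constant c_r: 4 times the (1, 1) element of Omega_r^{-1}."""
    return 4 * inverse(omega(r))[0][0]


for r in range(3):
    print(r, c(r), float(c(r)))
```

For $r = 1$ this gives exactly $9/4$, and for $r = 2$ it gives $225/64 \approx 3.52$, in line with the values quoted above.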
Consistency of $\hat d$ provides no information about $\theta_0$, as the concentrated log-likelihood becomes flat as a function of $\theta$ as $n \to \infty$. The rate at which it becomes flat furthermore differs for each element of $\theta$.
As discussed earlier, our model setup does not consider volatility processes, e.g. in the sense of long memory signal-plus-noise models as in Hurvich et al. (2005) and Frederiksen, Nielsen & Nielsen (2008), among others. Introducing a perturbation in our framework would indeed bias our estimator in the same manner as the classical fractional integration estimators (LW and log-periodogram), i.e. the leading bias term is of order $O(\lambda^{2d})$, implying a slower convergence rate compared to the leading bias term of the pure long memory process of $O(\lambda^{2})$, Hurvich & Ray (2003), Arteche (2004), Hurvich et al. (2005), and Frederiksen et al. (2008).
6 Simulation study
6.1 Setup
This section concerns the finite sample performance of the proposed estimator. We generate $I(d)$ processes according to Definition 1 by using the circulant embedding method as described in Davies & Harte (1987), i.e. as a stationary Type I fractionally integrated process in the terminology of Marinucci & Robinson (1999), see also Beran (1994, pp. 215-217). Non-stationary processes are then defined as $[d]$-fold partial sums of stationary $I(d - [d])$ processes, where $[d]$ is the integer closest to $d$; when $d$ lies exactly halfway between two integers, $[d]$ is equal to $d + 1/2$. $\{u_t\}$ is contaminated by autoregressive (AR) and moving average (MA) roots. That is, we consider the following data generating process (DGP)
$$(1 - \rho L)(1 - L)^d X_t = (1 + \psi L) u_t,$$
where $u_t \overset{i.i.d.}{\sim} N(0, 1)$, the AR coefficient is $\rho \in \{-0.8, -0.5, 0, 0.5, 0.8\}$ and the MA coefficient is $\psi \in \{-0.8, -0.5, 0, 0.5, 0.8\}$. We set the fractional parameter of interest equal to $d \in \{-0.3, 0, 0.3, 0.7, 1, 1.3, 1.7, 2\}$. The sample size is set to $n \in \{128, 512, 1024\}$ and the bandwidth to $m = \lfloor n^a \rfloor$, where $a \in \{0.5, 0.65, 0.8\}$. The bias and root mean squared error (RMSE) were computed using 1000 replications. Simulations were done in Matlab v7.2. The optimization was implemented using the unconstrained minimization procedure in Matlab with the BFGS algorithm. We tried different procedures to find the optimum, among others evaluating the first-order conditions and finding the corresponding roots. All the different approaches yielded similar results, and we therefore elected to use the BFGS algorithm, as it is easy to implement and fairly fast computationally.
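The bookkeeping conventions of this setup (the integer $[d]$, the bandwidth rule, and the partial summation step) can be written down in a few lines. This is only a sketch of those conventions in Python, not the Matlab code used for the study; the helper names are invented for the example, and the generation of the stationary $I(d - [d])$ input by circulant embedding is deliberately left out.

```python
import math


def nearest_integer(d):
    """[d]: integer closest to d; a value halfway between two integers rounds up."""
    return math.floor(d + 0.5)


def bandwidth(n, a):
    """Bandwidth m = floor(n^a)."""
    return math.floor(n ** a)


def partial_sums(u, k):
    """k-fold cumulative sum of the sequence u (k = 0 returns a copy as a list)."""
    x = list(u)
    for _ in range(k):
        total, out = 0.0, []
        for v in x:
            total += v
            out.append(total)
        x = out
    return x


# Decomposition d = [d] + (d - [d]) for the grid of memory parameters.
for d in (-0.3, 0, 0.3, 0.7, 1, 1.3, 1.7, 2):
    k = nearest_integer(d)
    print(d, k, round(d - k, 10))
```

A non-stationary series with, say, $d = 1.3$ would then be obtained by calling `partial_sums(u, 1)` on a stationary $I(0.3)$ input `u` produced by whatever generator is at hand.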
We compare our derived estimators to the local Whittle (LW) estimator of Robinson (1995a), the local polynomial Whittle (LPW) estimator of Andrews & Sun (2004), and the extended local Whittle (ExtLW) estimator of Abadir et al. (2007). Regarding the parameterization of the polynomial, we set $r \in \{1, 2\}$. As initial values we set the memory parameter equal to the log-periodogram estimate of Geweke & Porter-Hudak (1983), and the polynomial terms were all set equal to 1.
Tables 1-5 present results from the small simulation study. We only display a subset of the results. Attention is restricted to $n = 512$ and to the cases with no short-run dynamics (Table 1), with moving-average short-run dynamics, MA coefficient $\psi \in \{-0.8, -0.5\}$ (Tables 2-3), and finally with autoregressive short-run contamination, AR coefficient $\rho \in \{0.5, 0.8\}$ (Tables 4-5).
6.2 Simulation results
When the DGP is not contaminated by any short-run dynamics, it is preferable to use a larger bandwidth. If there is short-run contamination, the opposite is the case, excluding of course the bias-reducing methods, i.e. the LPW estimators. Furthermore, bias decreases as a function of sample size.
In Table 1, results without short-run contamination are presented. For the stationary region all the estimators are seemingly unbiased, and we clearly see that the extended estimators are in a statistical sense equal to their non-extended counterparts. The RMSE shows that the fractional parameter is estimated quite accurately, and we notice that the estimators using a polynomial to reduce potential bias have a larger RMSE than the estimators using a constant in a shrinking neighborhood of zero. Moving on to the non-stationary region, i.e. $d \geq 1/2$, we see that the bias of LW and LPW increases quite considerably, especially when $d$ is larger than 1. This is to be expected, as the LW estimator is not consistent for $d > 1$ and is biased towards unity, thereby confirming the results of Phillips & Shimotsu (2004). This result for the LW estimator also seems to hold for the LPW estimator. Clearly, in the case where $d > 1$, the extended estimators are the best, with the ExtLW estimator being the best in a bias sense, as there is no short-run contamination. For the extended estimators, regardless of which region we are in, the RMSE indicates that the fractional parameter of interest is estimated accurately. Additionally, the RMSE does not vary much in the given range of $d$.
Looking at the case where we introduce short-run contamination, Tables 2-5, we generally find that the estimators are biased and that the bias increases as the contamination of the signal increases. This is expected, as the low frequencies (long-run in the time domain) are contaminated by the higher frequencies (short-run in the time domain) of the spectral density. The bias is highest when introducing positive AR noise and negative MA noise. When the AR coefficient is $\rho = 0.8$ and the MA coefficient is $\psi = -0.8$, we clearly see the advantage of using an estimator that approximates this short-run contamination. Furthermore, when looking at more moderate negative MA noise and positive AR noise, $\psi = -0.5$ and $\rho = 0.5$, respectively, it is not preferable to use a lower bandwidth for the LPW (only in the stationary case) and ExtLPW estimators, as it is for the other estimators. That is, the LPW and ExtLPW estimators are very robust to MA and AR contamination because of the way they approximate the spectral density of the short-run noise by a polynomial. Hence, it is possible to choose a higher bandwidth without increasing the bias, which is an important result especially when looking at shorter time series.
To sum up, it is important to approximate the short-run component of the local approximation by a polynomial function instead of merely a constant, especially when there is a high degree of persistence, since the polynomial estimators are clearly less biased than the LW and ExtLW estimators. This is especially important in shorter time series, as the bias can be extreme when there is short-run contamination. As shown by Andrews & Sun (2004) and in this paper, the improved bias comes at the cost of increasing the variance by a multiplicative constant. When looking at the non-stationary region, it is important to use the extended versions, especially when $d \geq 1$, as there are considerable bias gains from using these extensions.
For the ExtLW estimator with AR coefficient $\rho \in \{-0.5, 0, 0.5\}$, Abadir et al. (2007) arrive at similar results. Furthermore, the simulation results for the ExtLW estimator from Abadir et al. (2007) when $n = 500$ and $m = [n^{0.65}]$ are similar to the results obtained in Shimotsu & Phillips (2005) for their exact local Whittle estimator (ELW). Shimotsu & Phillips (2005) compare their ELW estimator to two different types of tapered estimators (the tapering proposed in Velasco (1999a) and Hurvich & Chen (2000)), and conclude that their estimator is the best general purpose estimator when compared to the tapered version of the LW estimator. Therefore, we conclude that our proposed estimator also outperforms the tapered LW estimators, especially in the presence of short-run contamination.
7 Application to credit spreads
In this section, we investigate potential long memory properties of treasury yields and yields on corporate bonds, spreads over treasury and spreads between corporate yields, as previously examined by Ratta & Urga (2005). Both in structural models (Merton (1974), Black & Cox (1976), Das (1995), Longstaff & Schwartz (1995), Hull & White (1995), Leland & Toft (1996), among others) and reduced-form models (Ramaswamy & Sundaresan (1986), Jarrow & Turnbull (1995), Das & Tufano (1995), Duffie & Huang (1996), Jarrow, Lando & Turnbull (1997), Madan & Unal (1996), Duffie & Singleton (1999), among others), credit spreads play an integral role in the pricing of risky debt and credit derivatives. Neither approach considers that the data generating process might be poorly approximated by the classical $I(0)/I(1)$ setup, as discussed in Ratta & Urga (2005), see references therein. The objective of Ratta & Urga (2005) was to fill a gap in the credit spread literature, i.e. to investigate whether credit spreads exhibit potential fractional integration and whether there are long-run relations that can be explained through fractional cointegration. They use the log-periodogram estimator of Geweke & Porter-Hudak (1983) and the LW estimator analyzed by Robinson (1995a). As both of these estimators are severely biased in the presence of short-run contamination, see Nielsen & Frederiksen (2005) for a simulation study, and there is no asymptotic theory for $d \geq 3/4$, we suggest using more up-to-date semiparametric estimators that potentially mitigate the bias introduced by short-run contamination and for which the distributional theory holds for $d \geq 3/4$. The usual way to reduce bias for the log-periodogram and the LW estimator is to select a smaller bandwidth, thereby sacrificing efficiency in the form of a larger variance.
The data considered here consist of daily observations for the 30-year historical US Treasury Constant Maturity Yields and Moody's Aaa and Baa (see footnote 4). For a more thorough description of the data and our reason for using rating-specific indices, see Ratta & Urga (2005). Our data covers the period 2nd of January 1986 through 15th of February 2002 for a total of 4,034 observations. The 30-year Treasury constant maturity series was discontinued on February 18th, 2002, and reintroduced on February 9th, 2006. We could have used the 20-year Treasury constant maturity series and used a correction factor delivered by the U.S. Treasury, but we choose to focus on the shorter sample period.
Footnote 4: Ratta & Urga (2005) look at two other ratings besides the two that we consider, i.e., Aa and A. The reason we only look at Aaa and Baa is that these data series are downloadable from the Federal Reserve at http://www.federalreserve.gov/releases/h15/data.htm
As opposed to Ratta & Urga (2005), we opt to log transform the series (see footnote 5) before further analysis. Therefore, spreads, i.e. spreads over treasury (sAaaTreas, sBaaTreas) and spreads between corporate yields (sBaaAaa), are defined as the difference between the logs of the respective series.
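The spread construction described above amounts to an elementwise difference of log yields. A minimal sketch follows; the function name `log_spread` and the three yield values per series are made up purely for illustration and are not observations from the data set.

```python
import math


def log_spread(yield_a, yield_b):
    """Elementwise log(yield_a) - log(yield_b) for two aligned yield series."""
    return [math.log(a) - math.log(b) for a, b in zip(yield_a, yield_b)]


# Hypothetical yields in percent, three days each (illustrative numbers only).
aaa = [7.10, 7.12, 7.08]
baa = [7.95, 7.99, 7.90]
treasury = [6.40, 6.45, 6.38]

s_aaa_treas = log_spread(aaa, treasury)   # spread over treasury
s_baa_treas = log_spread(baa, treasury)   # spread over treasury
s_baa_aaa = log_spread(baa, aaa)          # spread between corporate yields
```

Because the spreads are differences of logs, the three series satisfy `s_baa_treas = s_baa_aaa + s_aaa_treas` up to rounding, which can serve as a quick consistency check on the data preparation.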
Time series plots of the individual series and the spreads are shown in Figure 1. There are signs of heteroskedasticity, volatility clustering and potential structural breaks. Granger & Terasvirta (1999) show that the number of regime switches affects the long memory parameter. Diebold & Inoue (2001), Granger & Hyung (2004) and Haldrup & Nielsen (2007) discuss that if series display breaks, particularly in their deterministic components, these processes will give the impression of persistence. That is, we can mistakenly conclude that a process displays long memory, when in fact this is due to a structural break in the series. Therefore, we split the full sample into four equal subperiods. The results from looking at subperiods were comparable to those for the whole sample period and are therefore omitted. Additionally, we implemented a test for spurious long memory where we temporally aggregated the data and compared the long memory estimates through a Wald type test for identical memory across aggregation, see Ohanissian, Russell & Tsay (2008) and Frederiksen & Nielsen (2008). We could not reject that the memory parameters are identical. Hence, we conjecture that the estimated long memory is not spurious in the sense that it is generated by structural breaks, e.g. a non-stationary level shift in the mean of the DGP. Looking at first differences of the respective series, they seem stationary (judging from the autocorrelation diagrams, which are omitted). In particular, the spread series look as if they have been overdifferenced, i.e. differencing introduces moving average behavior in the autocorrelation diagram.
Insert Figure 1 about here
Figures 2-7 display the semiparametric results for the LW, LPW, ExtLW and ExtLPW estimators, for different bandwidth ranges.
Insert Figures 2-7 about here
Generally, results for the fractional integration estimates show that the estimators that do not model the short-run components by a polynomial have a tendency to decrease as a function of the bandwidth, at least for sufficiently large bandwidth. This is of course reasonable considering the theoretical properties of these estimators.
The logs of Aaa, Baa and Treasury yields are in the non-stationary area with the long memory parameter estimated in the proximity of a unit root. As the asymptotic theory does not hold for the LPW estimator when $d \geq 1/2$ and for the LW estimator when $d \geq 3/4$, we primarily rely on the extended estimators. In general, we cannot reject that the log yields contain a unit root.
Looking at the spreads over treasury (sAaaTreas, sBaaTreas) and spreads between corporate yields (sBaaAaa), the estimated long memory is clearly in the non-stationary region regardless of the chosen bandwidth and estimator. The LW and ExtLW estimates are, for larger bandwidth choices, significantly different from $d = 1$, whereas we cannot reject the presence of a unit root for the polynomial estimators.
Footnote 5: Log transforming the data is also preferred in the sense that it better captures the non-linear relationship between yields and ratings, Manzoni (2002).
Like Ratta & Urga (2005), we have also applied a parametric ARFIMA-GARCH model (see footnote 6). The results confirm the findings obtained from the semiparametric analysis, so they are omitted. If indeed the true generating process has GARCH innovations, this does not affect the asymptotic theory of the semiparametric estimates, as shown in a simulation study by Nielsen & Frederiksen (2005), so, in that respect, it is not unreasonable that the conclusions are the same.
Overall, it cannot be rejected that the log yields of Aaa, Baa and Treasury bonds contain a unit root. However, the results are more mixed when looking at spreads, depending on the estimator and bandwidth choice. Therefore, as in Ratta & Urga (2005), we can reject the reduced-form modeling of Das & Tufano (1995), Jarrow et al. (1997) and Duffie & Singleton (1999). That modeling explicitly implies that the data generating process of the risk-free process, and hence also of credit spreads, follows a short-memory process, i.e. $I(0)$. The relevance of modeling yields in a more flexible fractional cointegration setup should be considered and is at least a relevant alternative to the classical $I(1)/I(0)$ terminology.
8 Concluding remarks
In this paper, we propose a semiparametric estimator that circumvents the relatively slow convergence and finite sample bias of the classical local Whittle estimator when there is short-run contamination (e.g. autoregressive and/or moving average roots) and non-stationarity. The bias reduction is obtained by approximating the spectrum of the short memory component by a polynomial instead of a constant in a shrinking neighborhood of frequency zero. In addition, the notion of the extended DFT and periodogram is used to extend the estimator to cover non-stationary values of the fractional integration parameter $d$. We show consistency and asymptotic normality of the estimator. A simulation study confirms the asymptotic results. The adequacy of the estimator is shown through an empirical analysis of credit spreads.
As a final note, we could also have opted to expand the work of Andrews & Sun (2004) by utilizing the work of Shimotsu & Phillips (2005). We conjecture that such an estimator would in fact be consistent and asymptotically normal in the same manner as the exact local Whittle estimator of Shimotsu & Phillips (2005). Robinson (2005) showed that the expected squared deviation between the DFT of the Type I and the Type II model is of order $O(n^{-1})$. Therefore, we conjecture that the derived results also hold for Type II fractional processes.
9 Appendix of proofs and lemmas
The proof of Theorem 1 relies heavily on Abadir et al. (2007) and Andrews & Sun (2004).
The Appendix is structured as follows: In the first section the proof of Theorem 1 is given. Section 2 presents technical lemmas adapted from Andrews & Sun (2004) and Abadir et al. (2007).
Footnote 6: We also estimated other GARCH specifications, e.g., IGARCH and FIGARCH. These other specifications seem indeed to be justified in the sense that $\alpha + \beta \approx 1$ in the GARCH(1,1) specification. A further analysis is beyond the scope of this paper.
9.1 Proof of Theorem 1
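Proof. Set
$$D_m(\varepsilon) = \left\{ d \in [d_1, d_2] : (\log^5 m)\, |d - d_0| < \varepsilon \right\} \quad \text{for } \varepsilon > 0, \tag{51}$$
$$g_j = \lambda_j^{-2d} G \exp(-P_r(\lambda_j, \theta)). \tag{52}$$
As in Andrews & Sun (2004), denote the score and the Hessian of the scaled objective function by $S_n(d, \theta) = m \nabla L_n(d, \theta)$ and $H_n(d, \theta) = m \nabla^2 L_n(d, \theta)$, respectively:
$$S_n(d, \theta) = \hat G^{-1}(d, \theta) \sum_{j=1}^{m} \left( I_j(d) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d} - m^{-1} \sum_{k=1}^{m} I_k(d) \exp(P_r(\lambda_k, \theta)) \lambda_k^{2d} \right) \left(2 \log j, \lambda_j^2, \ldots, \lambda_j^{2r}\right)' \tag{53}$$
$$= \hat G^{-1}(d, \theta) \sum_{j=1}^{m} \left( \frac{G I_j(d)}{g_j(d, \theta)} - m^{-1} \sum_{k=1}^{m} \frac{G I_k(d)}{g_k(d, \theta)} \right) X_j, \tag{54}$$
$$H_n(d, \theta) = \hat G^{-2}(d, \theta) \left( \hat G(d, \theta) \sum_{j=1}^{m} I_j(d) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d} X_j X_j' - m \left( m^{-1} \sum_{j=1}^{m} I_j(d) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d} X_j \right) \left( m^{-1} \sum_{j=1}^{m} I_j(d) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d} X_j \right)' \right)$$
$$= \hat G^{-2}(d, \theta) \left( \hat G(d, \theta) \sum_{j=1}^{m} \frac{G I_j(d)}{g_j(d, \theta)} X_j X_j' - m \left( m^{-1} \sum_{j=1}^{m} \frac{G I_j(d)}{g_j(d, \theta)} X_j \right) \left( m^{-1} \sum_{j=1}^{m} \frac{G I_j(d)}{g_j(d, \theta)} X_j \right)' \right), \tag{55}$$
where $X_j = \left(2 \log j, \lambda_j^2, \ldots, \lambda_j^{2r}\right)'$. Define the deterministic scaling matrix $B_n$ as the $(r + 1) \times (r + 1)$ diagonal matrix with $j$th diagonal element defined as
$$[B_n]_{11} = m^{1/2}, \tag{56}$$
$$[B_n]_{jj} = \left(\frac{2\pi m}{n}\right)^{2j-2} m^{1/2} \quad \text{for } j = 2, 3, \ldots, r + 1. \tag{57}$$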
Since $L_n(d,\theta)$ is a continuous function defined on a compact set, the estimator exists. Strict convexity
of the (negative) log-likelihood, $L_n(\cdot)$, implies that the estimator is unique. Moreover, strict convexity
and twice continuous differentiability of $L_n(\cdot)$ imply that if a solution to the first-order conditions exists with
probability tending to one, which essentially follows by Andrews & Sun (2004, Lemma 1), then it is
unique and minimizes the objective function. We now use Lemma 1 to verify that the conditions
in Lemma 1 of Andrews & Sun (2004) hold. Andrews & Sun (2004, Lemma 1(i)) holds by Assumption
5. Andrews & Sun (2004, Lemma 1(ii)) holds by Lemma 1(e) and the second condition in Assumption
5. Andrews & Sun (2004, Lemma 1(iii)) holds by Lemma 1(a) and Lemma 1(b) and the positive
definiteness of $\Omega_r$. Andrews & Sun (2004, Lemma 1(iv)) holds by Lemma 1(c) and Lemma 1(d), as
these ensure that $C_n \to \infty$ holds for some sequence $\varepsilon_n$ that goes sufficiently slowly to zero.7 Thus, what
remains is to show strict convexity. We know that if all leading principal minors satisfy $D_l(d,\theta) > 0$,
$l = 1, 2, \ldots, 1 + r + 1$, for all $(d,\theta) \in D \times \Theta \subset [d_1,d_2] \times \mathbb{R}^{r+1}$, then (negative) $L_n(d,\theta)$ is
strictly convex on $D \times \Theta \subset [d_1,d_2] \times \mathbb{R}^{r+1}$. Andrews & Sun (2004) prove this by noticing that for any
$c \in \mathbb{R}^{r+1} \setminus \{0\}$
\begin{align}
c' H_n(d,\theta) c \, \hat G^{2}(d,\theta) m^{-1} &= \hat G(d,\theta) m^{-1} \sum_{j=1}^{m} \frac{G I_j(d)}{g_j(d,\theta)} (c' X_j)^2 - \left( m^{-1} \sum_{j=1}^{m} \frac{G I_j(d)}{g_j(d,\theta)} c' X_j \right)^2 \notag \\
&= a'a \cdot b'b - (a'b)^2 > 0, \tag{58}
\end{align}
where $a$ and $b$ are vectors of order $m$ with $a_j = \left( m^{-1} \frac{G I_j(d)}{g_j(d,\theta)} \right)^{1/2}$ and $b_j = \left( m^{-1} \frac{G I_j(d)}{g_j(d,\theta)} \right)^{1/2} c' X_j$
for $j = 1, \ldots, m$, and the inequality holds by the Cauchy-Schwarz inequality.
7 Andrews & Sun (2004) give an example of such a sequence, namely $\varepsilon_n = \log^{-1} m$.
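The algebra behind (58) can be checked numerically. The following is a minimal sketch, not part of the original proof: the weights `y` are a hypothetical positive stand-in for $G I_j(d)/g_j(d,\theta)$, and `m`, `n`, `r` and `c` are arbitrary illustrative choices. It confirms that the left-hand side of (58) equals $a'a \cdot b'b - (a'b)^2$ and is positive when $c'X_j$ is not constant in $j$.

```python
import math
import random

random.seed(1)
m, n, r = 40, 512, 1                      # illustrative sizes (assumptions)
lam = [2.0 * math.pi * j / n for j in range(1, m + 1)]   # Fourier frequencies

# X_j = (2 log j, lambda_j^2, ..., lambda_j^{2r})'
X = [[2.0 * math.log(j)] + [lam[j - 1] ** (2 * k) for k in range(1, r + 1)]
     for j in range(1, m + 1)]

# Hypothetical positive stand-in for G * I_j(d) / g_j(d, theta)
y = [random.expovariate(1.0) for _ in range(m)]
# Arbitrary nonzero direction c in R^{r+1}
c = [random.gauss(0.0, 1.0) for _ in range(r + 1)]

cX = [sum(ci * xi for ci, xi in zip(c, row)) for row in X]   # c'X_j
G_hat = sum(y) / m

# Left-hand side of (58): G_hat * m^{-1} sum y_j (c'X_j)^2 - (m^{-1} sum y_j c'X_j)^2
lhs = (G_hat * sum(yj * v * v for yj, v in zip(y, cX)) / m
       - (sum(yj * v for yj, v in zip(y, cX)) / m) ** 2)

# Vectors a and b as defined below (58)
a = [math.sqrt(yj / m) for yj in y]
b = [math.sqrt(yj / m) * v for yj, v in zip(y, cX)]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

rhs = dot(a, a) * dot(b, b) - dot(a, b) ** 2
```

Here `lhs` and `rhs` agree up to rounding, and both are positive because `a` and `b` are not proportional.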
9.2 Lemmas
Lemma 1 is the same as Lemma 2 in Andrews & Sun (2004). Part (e) lacks the bias term because
we impose a weaker form of divergence of the bandwidth $m$ in Assumption 5 than Andrews & Sun
(2004) do. Otherwise, the proof follows from Andrews & Sun (2004) with modifications to allow for
$d \geq 1/2$. These modifications follow Abadir et al. (2007) and their notion of the extended DFT
and periodogram. Lemma 2 is adapted from Lemma 4.6 in Abadir et al. (2007) and deals with the
asymptotic properties of the renormalized DFTs; it hence generalizes Robinson (1995b,
Theorem 2), as done in Abadir et al. (2007), to suit the non-stationary case. Furthermore, we will use
Lemma 4.2 and Lemma 4.4 of Abadir et al. (2007) extensively. In short, Lemma 4.2 gives relevant
bounds for proving consistency of the estimator $\hat d$. Lemma 4.4 gives the algebraic relation between
the DFT of the series $\{X_t\}$ and the differenced series $\{\Delta^p X_t\}$.
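Lemma 1 Under Assumptions 1-6, as $n \to \infty$, we have
(a) $B_n^{-1} J_n B_n^{-1} \xrightarrow{p} \Omega_r$;
(b) $B_n^{-1} \left( H_n(d_0,\theta_0) - J_n \right) B_n^{-1} = o_p(1)$;
(c) $\sup_{\theta \in \Theta} \left\| B_n^{-1} \left( H_n(d_0,\theta) - H_n(d_0,\theta_0) \right) B_n^{-1} \right\| = o_p(1)$;
(d) $\sup_{(d,\theta) \in D_m(\varepsilon_n) \times \Theta} \left\| B_n^{-1} \left( H_n(d,\theta) - H_n(d_0,\theta) \right) B_n^{-1} \right\| = o_p(1)$ for all
sequences of constants $\{\varepsilon_n\}_{n \geq 1}$ for which $\varepsilon_n = o(1)$;
(e) $B_n^{-1} S_n(d_0,\theta_0) \xrightarrow{d} N(0, \Omega_r)$.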
Proof of (a). This follows by approximating sums by integrals; see Andrews & Sun (2004, p. 597),
where they refer to Andrews & Guggenberger (2003) and Lemma 2(a), (h), and (i).
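Proof of (b). As in Andrews & Sun (2004), with the only difference that our extended
periodogram of a time series $\{X_t\}$ depends not only on the Fourier frequencies but also on the value
of $d$, i.e. $I_j(d) = I(\lambda_j; d) = |w(\lambda_j; d)|^2$, write
\[
\hat G_{a,b}(d,\theta) = m^{-1} \sum_{j=1}^{m} I_j(d) \exp\left(P_r(\lambda_j,\theta)\right) \lambda_j^{2d} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b}, \tag{59}
\]
\[
J_{a,b} = G_0 m^{-1} \sum_{j=1}^{m} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b}, \tag{60}
\]
for $a = 0, 1, 2$, $b = 0, \ldots, r$, where in defining $J_{a,b}$ we have substituted $I_j(d) \exp\left(P_r(\lambda_j,\theta)\right) \lambda_j^{2d}$
with $G_0$. As in (A.7) in Andrews & Sun (2004), the $(1,1)$, $(1,k)$ and $(k,i)$ elements of $B_n^{-1} H_n B_n^{-1}$ and
$B_n^{-1} J_n B_n^{-1}$ for $k, i = 2, \ldots, r+1$ are then given by
\begin{align}
&\hat G_{0,0}^{-2} \left( \hat G_{0,0} \hat G_{2,0} - \hat G_{1,0}^{2} \right), \tag{61} \\
&\hat G_{0,0}^{-2} \left( \hat G_{0,0} \hat G_{1,k-1} - \hat G_{1,0} \hat G_{0,k-1} \right), \notag \\
&\hat G_{0,0}^{-2} \left( \hat G_{0,0} \hat G_{0,k+i-2} - \hat G_{0,k-1} \hat G_{0,i-1} \right), \notag
\end{align}
where for $B_n^{-1} J_n B_n^{-1}$ one substitutes $\hat G_{a,b}(d,\theta)$ with $J_{a,b}$. To prove Lemma 1(b), it then suffices to
show that
\[
\Delta_{a,b} = \left| \frac{\hat G_{a,b}(d_0,\theta_0)}{G_0} - \frac{J_{a,b}}{G_0} \right| = o_p\left( \log^{-2} m \right), \tag{62}
\]
for all $a = 0, 1, 2$ and $b = 0, 1, \ldots, r$. Write
\begin{align*}
\Delta_{a,b} &= \left| \frac{m^{-1} \sum_{j=1}^{m} I_j(d_0) \exp\left(P_r(\lambda_j,\theta_0)\right) \lambda_j^{2d_0} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b}}{G_0} - \frac{G_0 m^{-1} \sum_{j=1}^{m} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b}}{G_0} \right| \\
&= \left| m^{-1} \sum_{j=1}^{m} \frac{I_j(d_0) \exp\left(P_r(\lambda_j,\theta_0)\right) \lambda_j^{2d_0} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b}}{g_j \exp\left(P_r(\lambda_j,\theta_0)\right) \lambda_j^{2d_0}} - m^{-1} \sum_{j=1}^{m} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b} \right| \\
&= \left| m^{-1} \sum_{j=1}^{m} \frac{I_j(d_0)}{g_j} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b} - m^{-1} \sum_{j=1}^{m} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b} \right| \\
&= \left| m^{-1} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j} - 1 \right) (2\log j)^{a} \left( \frac{j}{m} \right)^{2b} \right| \\
&\leq \left| m^{-1} \sum_{k=1}^{m-1} \left[ (2\log k)^{a} \left( \frac{k}{m} \right)^{2b} - (2\log(k+1))^{a} \left( \frac{k+1}{m} \right)^{2b} \right] \sum_{j=1}^{k} \left( \frac{I_j(d_0)}{g_j} - 1 \right) \right| \\
&\quad + \left| (2\log m)^{a} m^{-1} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j} - 1 \right) \right| \\
&:= \Delta_{1,a,m} + \Delta_{2,a,m},
\end{align*}
where the inequality follows from summation by parts. Furthermore, under Assumption 1, i.e.
$d_0 \neq p_0 - 1/2$ for $p_0 \in \mathbb{Z}$, the definition of the extended DFT and the assumption of linearity of
the generating process (Assumption 4(a)), together with Abadir et al. (2007, Lemma 4.4) and
Lemma 2, imply that the behavior of the extended DFT and periodogram is the same for all
$d \in (-1/2, \infty)$. Therefore, the results from Andrews & Sun (2004) also hold in our case. That is, the
proof of (62) follows by collecting the terms (A.11)-(A.13) in Andrews & Sun (2004, pp. 598-599) and
using Assumption 5:
\[
\Delta_{1,a,m} + \Delta_{2,a,m} = O_p\left( (\log^{a} m) m^{-1/2} + (\log^{a} m) m^{\phi} n^{-\phi} \right) = o_p\left( \log^{-2} m \right).
\]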
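The summation-by-parts step used in the proof of (b) rests on the identity $\sum_{j=1}^{m} a_j c_j = \sum_{k=1}^{m-1} (c_k - c_{k+1}) S_k + c_m S_m$ with $S_k = \sum_{j=1}^{k} a_j$. A minimal numerical sketch of that identity follows; it is an illustration only, the sequence `a` is a hypothetical stand-in for $I_j(d_0)/g_j - 1$, and `m`, `a_pow`, `b_pow` are arbitrary choices of $m$, $a$ and $b$.

```python
import math
import random

def abel_sum(a, c):
    """Summation by parts: sum_{k=1}^{m-1} (c_k - c_{k+1}) S_k + c_m S_m,
    where S_k = a_1 + ... + a_k."""
    m = len(a)
    partial = 0.0   # running partial sum S_k
    total = 0.0
    for k in range(m - 1):
        partial += a[k]
        total += (c[k] - c[k + 1]) * partial
    partial += a[m - 1]           # S_m
    return total + c[m - 1] * partial

random.seed(0)
m, a_pow, b_pow = 50, 2, 1        # illustrative values (assumptions)

# Hypothetical stand-in for I_j(d_0)/g_j - 1; the identity holds for any real sequence.
a = [random.gauss(0.0, 1.0) for _ in range(m)]
# Weights c_j = (2 log j)^a (j/m)^(2b), j = 1, ..., m
c = [(2.0 * math.log(j)) ** a_pow * (j / m) ** (2 * b_pow) for j in range(1, m + 1)]

direct = sum(x * w for x, w in zip(a, c))   # sum_j a_j c_j
by_parts = abel_sum(a, c)                   # right-hand side of the identity
```

The two quantities coincide up to rounding; the triangle inequality applied to the two pieces of `by_parts` (after scaling by $m^{-1}$) gives the bound by the two terms in the display.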
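Proof of (c). By (62) and $J_{a,b} = O(\log^{a} m)$, we get that
\[
\hat G_{a,b}(d_0,\theta_0) = O_p(\log^{a} m), \tag{63}
\]
for $a = 0, 1, 2$ and $b = 0, \ldots, r$, and
\[
\hat G_{0,0}(d_0,\theta_0) = G_0 + o_p\left( \log^{-2} m \right), \tag{64}
\]
where $G_0 > 0$. Given that the elements of $B_n^{-1} H_n B_n^{-1}$ can be written as in (61) and that the above
results hold, it suffices to show that
\[
\sup_{\theta \in \Theta} \left| \hat G_{a,b}(d_0,\theta) - \hat G_{a,b}(d_0,\theta_0) \right| = o_p\left( \log^{-2} m \right), \tag{65}
\]
for all $a = 0, 1, 2$ and $b = 0, \ldots, r$. Write the left-hand side of (65) as
\begin{align}
&\sup_{\theta \in \Theta} \left| m^{-1} \sum_{j=1}^{m} I_j(d_0) \exp\left(P_r(\lambda_j,\theta)\right) \lambda_j^{2d_0} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b} - m^{-1} \sum_{j=1}^{m} I_j(d_0) \exp\left(P_r(\lambda_j,\theta_0)\right) \lambda_j^{2d_0} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b} \right| \notag \\
&= \sup_{\theta \in \Theta} \left| m^{-1} \sum_{j=1}^{m} I_j(d_0) \left[ \exp\left(P_r(\lambda_j,\theta)\right) - \exp\left(P_r(\lambda_j,\theta_0)\right) \right] \lambda_j^{2d_0} (2\log j)^{a} \left( \frac{j}{m} \right)^{2b} \right| \notag \\
&\leq \sup_{\theta \in \Theta} \sup_{k=1,\ldots,m} \left| \exp\left( P_r(\lambda_k,\theta) - P_r(\lambda_k,\theta_0) \right) - 1 \right| \times m^{-1} \sum_{j=1}^{m} I_j(d_0) \exp\left(P_r(\lambda_j,\theta_0)\right) \lambda_j^{2d_0} (2\log j)^{a} \notag \\
&= O\left( \lambda_m^{2} \right) \hat G_{a,0}(d_0,\theta_0) \notag \\
&= O_p\left( (m/n)^{2} (\log^{a} m) \right) \notag \\
&= o_p\left( \log^{-2} m \right). \tag{66}
\end{align}
The second equality holds by a mean-value expansion using the compactness of $\Theta$, and the third equality
holds by (62) and $J_{a,b} = O(\log^{a} m)$. The last equality holds by Assumption 5.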
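Proof of (d). Following the same arguments as in Andrews & Sun (2004), we note that (i) by
equations (62) and (65) we have $\hat G_{a,b}(d_0,\theta) = J_{a,b} + o_p\left( \log^{-2} m \right)$, (ii) $J_{a,b} = O(\log^{a} m)$, (iii)
$J_{0,0} J_{2,0} - J_{1,0}^{2} = O(1)$ by replacing sums by integrals and noting that the part of $J_{0,0} J_{2,0}$ that is
$O\left( \log^{2} m \right)$ cancels with an identical term in $J_{1,0}^{2}$, (iv) $J_{0,0} J_{1,k-1} - J_{1,0} J_{0,k-1} = O(1)$ by the same
argument as in (iii), and (v) $J_{0,0} = G_0 > 0$. Then from (i)-(v) and equation (61) it suffices to show
\[
\sup_{(d,\theta) \in D_m(\varepsilon_n) \times \Theta} \left| \hat G_{a,b}(d,\theta) - \hat G_{a,b}(d_0,\theta) \right| = o_p\left( \log^{-2} m \right). \tag{67}
\]
Replacing $\lambda_j^{2d}$ with $j^{2d}$ in $\hat G_{a,b}(d,\theta)$, and thereby defining $\hat E_{a,b}(d,\theta)$, equation (61) also holds for
$\hat G_{a,b}(d,\theta)$ replaced by $\hat E_{a,b}(d,\theta)$. Hence, to prove Lemma 1(d) it suffices that
\[
Z_{a,b} = \sup_{(d,\theta) \in D_m(\varepsilon_n) \times \Theta} \left| \hat E_{a,b}(d,\theta) - \hat E_{a,b}(d_0,\theta) \right| = o_p\left( n^{2d_0} \log^{-2} m \right), \tag{68}
\]
for all $a = 0, 1, 2$ and $b = 0, \ldots, r$. Then from Andrews & Sun (2004, p. 600), there exists $C < \infty$ such that for
$(d,\theta) \in D_m(\varepsilon_n) \times \Theta$
\begin{align*}
Z_{a,b} &= \sup_{(d,\theta) \in D_m(\varepsilon_n) \times \Theta} \left| m^{-1} \sum_{j=1}^{m} I_j(d_0) \exp\left(P_r(\lambda_j,\theta)\right) (2\log j)^{a} \left( \frac{j}{m} \right)^{2b} j^{2d_0} \left( j^{2(d-d_0)} - 1 \right) \right| \\
&\leq C \sup_{d \in D_m(\varepsilon_n)} m^{-1} \sum_{j=1}^{m} I_j(d_0) (\log j)^{a} j^{2d_0} \left| j^{2(d-d_0)} - 1 \right| + o_p(1) \\
&\leq 2C \exp\left( 2\varepsilon_n \log^{-4} m \right) \sup_{d \in D_m(\varepsilon_n)} m^{-1} \sum_{j=1}^{m} I_j(d_0) (\log j)^{a+1} j^{2d_0} |d - d_0| \\
&\leq \varepsilon_n \left( \log^{-2} m \right) 2C \exp\left( 2\varepsilon_n \log^{-4} m \right) m^{-1} \sum_{j=1}^{m} I_j(d_0) \lambda_j^{2d_0} \left( \frac{2\pi}{n} \right)^{-2d_0}.
\end{align*}
The first inequality follows from $\sup_{0 \leq \lambda \leq 2\pi,\, \theta \in \Theta} \sup_{j=1,\ldots,m} \exp\left(P_r(\lambda_j,\theta)\right) < \infty$, because $\Theta$ is compact.
The second inequality stems from noting that
\[
\frac{\left| j^{2(d-d_0)} - 1 \right|}{|d - d_0|} \leq 2 m^{2|d-d_0|} \log j \leq 2 m^{2\varepsilon_n \log^{-5} m} \log j = 2 \exp\left( 2\varepsilon_n \log^{-4} m \right) \log j
\]
for $d \in D_m(\varepsilon_n)$ by a mean-value expansion, where we use that $m^{\log^{-1} m} = e$. The third inequality
uses $d \in D_m(\varepsilon_n)$. Then from equations (62) and (65) we have $m^{-1} \sum_{j=1}^{m} I_j(d_0) \lambda_j^{2d_0} = \hat G_{0,0}(d_0, 0) =
G_0 + o_p\left( \log^{-2} m \right)$. Hence, (68) follows.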
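Proof of (e). By using (62) and setting $a = b = 0$ we get $\hat G(d_0,\theta_0) = G_0 \left( 1 + o_p\left( \log^{-2} m \right) \right)$,
and therefore the normalized score can be written as
\begin{align}
B_n^{-1} S_n(d_0,\theta_0) &= \hat G^{-1}(d_0,\theta_0) m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - m^{-1} \sum_{k=1}^{m} \frac{I_k(d_0)}{g_k(d_0,\theta_0)} \right) \tilde X_j \notag \\
&= (1 + o_p(1)) m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 1 \right) \left( \tilde X_j - m^{-1} \sum_{k=1}^{m} \tilde X_k \right), \tag{69}
\end{align}
where
\[
\tilde X_j = \left( 2\log j, (j/m)^{2}, \ldots, (j/m)^{2r} \right)'. \tag{70}
\]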
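Therefore, omitting the small order terms, write the right-hand side of (69), as in Andrews & Sun (2004, p. 601),
as $T_{1,n} + T_{2,n} + T_{3,n} + T_{4,n}$, where
\begin{align}
T_{1,n} &= m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) - E\left[ \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) \right] \right) \left( \tilde X_j - m^{-1} \sum_{k=1}^{m} \tilde X_k \right), \tag{71} \\
T_{2,n} &= m^{-1/2} \sum_{j=1}^{m} \left( \frac{E[I_j(d_0)]}{f_j(d_0)} - 1 \right) \frac{f_j(d_0)}{g_j(d_0,\theta_0)} \left( \tilde X_j - m^{-1} \sum_{k=1}^{m} \tilde X_k \right), \tag{72} \\
T_{3,n} &= m^{-1/2} \sum_{j=1}^{m} \left( 2\pi I_{\varepsilon}(\lambda_j) - 1 \right) \left( \tilde X_j - m^{-1} \sum_{k=1}^{m} \tilde X_k \right), \tag{73} \\
T_{4,n} &= m^{-1/2} \sum_{j=1}^{m} \left( \frac{f_j(d_0)}{g_j(d_0,\theta_0)} - 1 \right) \left( \tilde X_j - m^{-1} \sum_{k=1}^{m} \tilde X_k \right), \tag{74}
\end{align}
using that $E\left( 2\pi I_{\varepsilon}(\lambda_j) \right) = 1$. Next, we need to show that $T_{1,n}$ and $T_{4,n}$ are $o_p(1)$, $T_{2,n} = o(1)$, and
$T_{3,n} \xrightarrow{d} N(0, \Omega_r)$. To show that $T_{1,n} = o_p(1)$, use summation by parts:
\begin{align}
T_{1,n} &= m^{-1/2} \sum_{k=1}^{m-1} \left( \tilde X_k - \tilde X_{k+1} \right) \sum_{j=1}^{k} \left( \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) - E\left[ \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) \right] \right) \notag \\
&\quad + \left( \tilde X_m - m^{-1} \sum_{k=1}^{m} \tilde X_k \right) m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) - E\left[ \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) \right] \right) \notag \\
&= m^{-1/2} \sum_{k=1}^{m-1} O\left( k^{-1} \right) O_p\left( k^{1/3} \log^{2/3} k + k^{\phi+1/2} n^{-\phi} + k^{1/2} n^{-1/4} \right) \notag \\
&\quad + O(1) m^{-1/2} O_p\left( m^{1/3} \log^{2/3} m + m^{\phi+1/2} n^{-\phi} + m^{1/2} n^{-1/4} \right) \notag \\
&= O_p\left( m^{-1/6} \log^{2/3} m + (m/n)^{\phi} + n^{-1/4} \right) = o_p(1), \tag{75}
\end{align}
which follows from noting that $\tilde X_k - \tilde X_{k+1} = O\left( k^{-1} \right)$ uniformly over $k = 1, \ldots, m$ and that
$\tilde X_m - m^{-1} \sum_{k=1}^{m} \tilde X_k = O(1)$, which follows from approximating sums by integrals; see Andrews & Sun (2004, p. 602). A
remark is in order. Recall that the assumption of linearity of the generating process, Assumption
4(a), together with Abadir et al. (2007, Lemma 4.4) and Lemma 2(ii), implies that the behavior
is the same for all $d \in (-1/2, \infty)$. Therefore, the results from Andrews & Sun (2004) also hold in
our case. Since $d_0$ belongs to the interior of the admissible parameter space, $T_{1,n} = o_p(1)$. To prove
that $T_{2,n} = o(1)$, we again utilize Assumption 4(a) together with Abadir et al. (2007, Lemma 4.4) and
Lemma 2(ii), which enables us to use the result that
\[
E\left[ \frac{I_j(d_0)}{f_j(d_0)} \right] = 1 + O\left( j^{-1} \log j \right), \tag{76}
\]
uniformly over $1 \leq j \leq m$ as $n \to \infty$. Then, using (76), $T_{2,n}$ is bounded by
\begin{align}
T_{2,n} &= m^{-1/2} \sum_{j=1}^{m} O\left( j^{-1} \log j \right) O(1) \left( \tilde X_j - m^{-1} \sum_{k=1}^{m} \tilde X_k \right) \tag{77} \\
&= O\left( m^{-1/2} \log m \sum_{j=1}^{m} j^{-1} \log j \right) = O\left( m^{-1/2} \log^{3} m \right), \notag
\end{align}
where we have used that $\tilde X_j - m^{-1} \sum_{k=1}^{m} \tilde X_k = O(\log m)$ uniformly in $1 \leq j \leq m$. Therefore, $T_{2,n} = o(1)$.
Next, we need to show that for all $\eta \neq 0$, $\eta' T_{3,n} \xrightarrow{d} N(0, \eta' \Omega_r \eta)$. That is, we need to verify that as
$n \to \infty$
\[
m^{-1} \sum_{j=1}^{m} \varsigma_j^{2} \to \eta' \Omega_r \eta, \tag{78}
\]
where $\varsigma_j = \eta' \left( \tilde X_j - m^{-1} \sum_{k=1}^{m} \tilde X_k \right)$ and
\[
\Omega_r = \begin{pmatrix} 4 & 2\mu_r' \\ 2\mu_r & \Gamma_r \end{pmatrix},
\]
which follows from Lemma 1(a), Lemma 1(d) and finally noting that $|\varsigma_j - \varsigma_{j+1}| \leq \|\eta\| \left\| \tilde X_j - \tilde X_{j+1} \right\| \leq C j^{-1}$
for some constant $C > 0$ independent of $j$. Finally, we need to show that $T_{4,n} = o_p(1)$. This follows from
summation by parts and $\frac{f_j(d_0)}{g_j(d_0,\theta_0)} - 1 = O\left( (j/n)^{\phi} \right)$ uniformly on $1 \leq j \leq m$; see Frederiksen et al. (2008). This implies
\begin{align}
T_{4,n} &= m^{-1/2} \sum_{k=1}^{m-1} \left( \tilde X_k - \tilde X_{k+1} \right) \sum_{j=1}^{k} \left( \frac{f_j(d_0)}{g_j(d_0,\theta_0)} - 1 \right) + \left( \tilde X_m - m^{-1} \sum_{k=1}^{m} \tilde X_k \right) m^{-1/2} \sum_{j=1}^{m} \left( \frac{f_j(d_0)}{g_j(d_0,\theta_0)} - 1 \right) \notag \\
&= m^{-1/2} \sum_{k=1}^{m-1} O\left( k^{-1} \right) \sum_{j=1}^{k} O\left( (j/n)^{\phi} \right) + O(1) m^{-1/2} \sum_{j=1}^{m} O\left( (j/n)^{\phi} \right) \notag \\
&= O\left( m^{1/2+\phi} n^{-\phi} \right) = o_p(1), \tag{79}
\end{align}
where the last equality holds by Assumption 5.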
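As a check on the decomposition (71)-(74), the four bracketed factors can be summed term by term. The shorthand $a_j = I_j(d_0)/g_j(d_0,\theta_0)$, $e_j = 2\pi I_{\varepsilon}(\lambda_j)$, $f_j = f_j(d_0)$ and $g_j = g_j(d_0,\theta_0)$ is introduced here only for this step; the calculation uses $E[e_j] = 1$ and the fact that $g_j$ is deterministic, so that $E[I_j(d_0)]/g_j = E[a_j]$:

```latex
\begin{align*}
&\underbrace{a_j - e_j - E[a_j - e_j]}_{\text{factor in } T_{1,n}}
 + \underbrace{\left( \frac{E[I_j(d_0)]}{f_j} - 1 \right) \frac{f_j}{g_j}}_{\text{factor in } T_{2,n}}
 + \underbrace{e_j - 1}_{\text{factor in } T_{3,n}}
 + \underbrace{\frac{f_j}{g_j} - 1}_{\text{factor in } T_{4,n}} \\
&\quad = \left( a_j - e_j - E[a_j] + 1 \right)
 + \left( E[a_j] - \frac{f_j}{g_j} \right)
 + \left( e_j - 1 \right)
 + \left( \frac{f_j}{g_j} - 1 \right) \\
&\quad = a_j - 1 .
\end{align*}
```

Multiplying by $\tilde X_j - m^{-1} \sum_{k=1}^{m} \tilde X_k$ and summing over $j$ with the factor $m^{-1/2}$ recovers the leading term of (69).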
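Lemma 2 Assume that the sequence $\{v_j\}$ is given as in (21). The following holds uniformly in
$1 \leq k < j \leq m = o(n)$, as $n \to \infty$. (i) If $f_u$ satisfies Assumption 2, then
\[
E\left[ |w_u(\lambda_j)|^{2} / f_u(\lambda_j) \right] = 1 + o(1) + O\left( j^{-1} \log j \right), \tag{80}
\]
where $o(1) \to 0$ uniformly in $1 \leq j \leq m$, as $n \to \infty$, and
\[
|E[v_j v_j]| + |E[v_j v_k]| = O\left( \frac{\log j}{j - k} \right) + O\left( \frac{\log j}{k^{|d|} j^{1-|d|}} \right), \tag{81}
\]
\[
|E[v_j v_j]| = O\left( \frac{\log j}{j} \right). \tag{82}
\]
(ii) If $f_u$ satisfies Assumption 2 and 4(c), then
\[
E\left[ |w_u(\lambda_j)|^{2} / f_u(\lambda_j) \right] = 1 + O\left( j^{-1} \log j \right), \tag{83}
\]
and
\[
|E[v_j v_j]| + |E[v_j v_k]| = O\left( \frac{\log j}{k^{|d|} j^{1-|d|}} \right), \tag{84}
\]
\[
|E[v_j v_j]| = O\left( \frac{\log j}{j} \right). \tag{85}
\]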
Proof. This follows from Abadir et al. (2007) and their proof of Lemma 4.6, given Assumption 3 and
replacing $b_0$ by $G_0 \exp\left( -P_r(\lambda_j,\theta) \right)$. Let $d_0 = p_0 + d_u$. Then for $d_0 = d_u$, equations (80)-(82)
and (83)-(85) follow from Robinson (1995b) and his proof of Theorem 2, p. 1060. For $p_0 \in \mathbb{N} \setminus \{0\}$,
by the property of the extended DFT and the rescaled extended DFT, (18) and (21), respectively, it
follows that
\[
v_j = \left( \frac{1 - \exp(i\lambda_j)}{\lambda_j} \right)^{-p_0} \frac{w_u(\lambda_j)}{\varphi(\lambda)^{1/2} \lambda_j^{-d_u}}. \tag{86}
\]
As $\left| \frac{1 - \exp(i\lambda_j)}{\lambda_j} \right|^{-p_0} \leq C$ uniformly in $1 \leq j \leq m$, (81)-(82) and (84)-(85) also hold for $p_0 \in \mathbb{N} \setminus \{0\}$.
References
Abadir, K. M., Distaso, W. & Giraitis, L. (2007), 'Nonstationarity-extended local Whittle estimation',
Journal of Econometrics 141, 1353–1384.
Abadir, K. M. & Taylor, R. (1999), 'On the definitions of (co-)integration', Journal of Time Series
Analysis 20, 129–13.
Andrews, D. W. K. & Guggenberger, P. (2003), 'A bias-reduced log-periodogram regression estimator
for the long memory parameter', Econometrica 71, 675–712.
Andrews, D. W. K. & Sun, Y. (2004), 'Adaptive local polynomial Whittle estimation of long-range
dependence', Econometrica 72, 569–614.
Arteche, J. (2004), 'Gaussian semiparametric estimation in long memory in stochastic volatility and
signal plus noise models', Journal of Econometrics 119, 131–154.
Beran, J. (1994), Statistics for Long-Memory Processes, Chapman-Hall, New York.
Black, F. & Cox, J. C. (1976), 'Valuing corporate securities: Some effects of bond indenture provisions',
Journal of Finance 31, 351–67.
Dalla, V., Giraitis, L. & Hidalgo, J. (2006), 'Consistent estimation of the memory parameter for
nonlinear time series', Journal of Time Series Analysis 27, 211–251.
Das, S. (1995), 'Credit risk derivatives', The Journal of Derivatives 2, 7–23.
Das, S. & Tufano, P. (1995), 'Pricing credit-sensitive debt when interest rates, credit ratings and credit
spreads are stochastic', Journal of Financial Engineering 5, 161–198.
Davies, R. B. & Harte, D. S. (1987), 'Tests for Hurst effects', Biometrika 74, 95–102.
Diebold, F. X. & Inoue, A. (2001), 'Long memory and regime switching', Journal of Econometrics
105, 131–159.
Duffie, D. & Huang, M. (1996), 'Swap rates and credit quality', Journal of Finance 51, 921–49.
Duffie, D. & Singleton, K. J. (1999), 'Modeling term structures of defaultable bonds', Review of
Financial Studies 12, 687–720.
Frederiksen, P. H. & Nielsen, F. S. (2008), 'Testing for spurious long memory in potentially nonstationary
perturbed fractional processes', CREATES RP 2008-59, University of Aarhus, and Working
Paper, Nordea Markets.
Frederiksen, P. H., Nielsen, F. S. & Nielsen, M. Ø. (2008), 'Local polynomial Whittle estimation
of perturbed fractional processes', CREATES RP 2008-29, University of Aarhus, and Working
Paper, Nordea Markets and Cornell University.
Geweke, J. & Porter-Hudak, S. (1983), 'The estimation and application of long-memory time series
models', Journal of Time Series Analysis 4, 221–238.
Granger, C. W. J. & Hyung, N. (2004), 'Occasional structural breaks and long memory with an
application to the S&P 500 absolute stock returns', Journal of Empirical Finance 11, 399–421.
Granger, C. W. J. & Terasvirta, T. (1999), 'A simple nonlinear time series model with misleading
linear properties', Economics Letters 62, 161–165.
Haldrup, N. & Nielsen, M. Ø. (2007), 'Estimation of fractional integration in the presence of data
noise', Computational Statistics and Data Analysis 51, 3100–3114.
Hull, J. & White, A. (1995), 'The impact of default risk on the prices of options and other derivative
securities', Journal of Banking & Finance 19, 299–322.
Hurvich, C. M. & Chen, W. W. (2000), 'An efficient taper for potentially overdifferenced long-memory
time series', Journal of Time Series Analysis 21, 155–180.
Hurvich, C. M., Moulines, E. & Soulier, P. (2005), 'Estimating long memory in volatility', Econometrica
73, 1283–1328.
Hurvich, C. M. & Ray, B. K. (1995), 'Estimation of the memory parameter for nonstationary or
noninvertible fractionally integrated processes', Journal of Time Series Analysis 16, 17–41.
Hurvich, C. M. & Ray, B. K. (2003), 'The local Whittle estimator of long-memory stochastic volatility',
Journal of Financial Econometrics 1, 445–470.
Jarrow, R. A., Lando, D. & Turnbull, S. M. (1997), 'A Markov model for the term structure of credit
risk spreads', Review of Financial Studies 10, 481–523.
Jarrow, R. A. & Turnbull, S. M. (1995), 'Pricing derivatives on financial securities subject to credit
risk', Journal of Finance 50, 53–85.
Kim, C. S. & Phillips, P. C. B. (2006), Log periodogram regression: The nonstationary case, Cowles
Foundation Discussion Papers 1587, Cowles Foundation, Yale University.
Künsch, H. R. (1987), Statistical aspects of self-similar processes, in Y. Prokhorov & V. V. Sazanov,
eds, 'Proceedings of the First World Congress of the Bernoulli Society', VNU Science Press,
Utrecht, pp. 67–74.
Lahiri, S. N. (2003), 'A necessary and sufficient condition for asymptotic independence of discrete
Fourier transforms under short- and long-range dependence', Annals of Statistics 31, 613–641.
Leland, H. E. & Toft, K. B. (1996), 'Optimal capital structure, endogenous bankruptcy, and the term
structure of credit spreads', Journal of Finance 51, 987–1019.
Longstaff, F. A. & Schwartz, E. (1995), 'Valuing credit derivatives', Journal of Fixed Income 5, 6–12.
Madan, D. & Unal, H. (1996), Pricing the risks of default, Center for Financial Institutions Working
Papers 94-16, Wharton School Center for Financial Institutions, University of Pennsylvania.
Manzoni, K. (2002), 'Modeling credit spreads: An application to the sterling eurobond market',
International Review of Financial Analysis 11, 183–218.
Marinucci, D. & Robinson, P. M. (1999), 'Alternative forms of fractional Brownian motion', Journal
of Statistical Planning and Inference 80, 111–122.
Merton, R. C. (1974), 'On the pricing of corporate debt: The risk structure of interest rates', Journal
of Finance 29, 449–70.
Newey, W. K. & McFadden, D. (1986), Large sample estimation and hypothesis testing, in R. F. Engle
& D. McFadden, eds, 'Handbook of Econometrics'.
Nielsen, M. Ø. & Frederiksen, P. H. (2005), 'Finite sample comparison of parametric, semiparametric,
and wavelet estimators of fractional integration', Econometric Reviews 24, 405–443.
Ohanissian, A., Russell, J. R. & Tsay, R. S. (2008), 'True or spurious long memory? A new test',
Journal of Business & Economic Statistics 26(2), 161–175.
Phillips, P. C. B. (1999), 'Discrete Fourier transformation of fractional processes', Discussion Paper
1243 (Yale University (Cowles Foundation)).
Phillips, P. C. B. & Shimotsu, K. (2004), 'Local Whittle estimation in nonstationary and unit root
cases', The Annals of Statistics 32, 656–692.
Ramaswamy, K. & Sundaresan, S. M. (1986), 'The valuation of floating-rate instruments: Theory
and evidence', Journal of Financial Economics 17, 251–272.
Ratta, L. C. & Urga, G. (2005), Modeling credit spreads: A fractional integration approach, Working
Paper CEA-07-2005, Cass Business School, City University London.
Robinson, P. M. (1994), 'Semiparametric analysis of long-memory time series', The Annals of Statistics
22, 515–539.
Robinson, P. M. (1995a), 'Gaussian semiparametric estimation of long range dependence', The Annals
of Statistics 23, 1630–1661.
Robinson, P. M. (1995b), 'Log-periodogram regression of time series with long range dependence', The
Annals of Statistics 23, 1048–1072.
Robinson, P. M. (2005), 'The distance between nonstationary fractional processes', Journal of Econometrics
128, 195–236.
Shimotsu, K. (2006), 'Exact local Whittle estimation of fractional integration with unknown mean and
time trend', Working Paper, Department of Economics, Queen's University, Canada (1061).
Shimotsu, K. & Phillips, P. (2005), 'Exact local Whittle estimation of fractional integration', The
Annals of Statistics 33, 1890–1933.
Tanaka, K. (1999), 'The nonstationary fractional unit root', Econometric Theory 15, 549–582.
Velasco, C. (1999a), 'Gaussian semiparametric estimation of non-stationary time series', Journal of
Time Series Analysis 20, 87–127.
Velasco, C. (1999b), 'Non-stationary log-periodogram regression', Journal of Econometrics 91, 325–371.
Table 1: Simulation results for ARFIMA(0,d,0) with n = 512.
     LW             LPW (r=1)      LPW (r=2)      ExtLW          ExtLPW (r=1)   ExtLPW (r=2)
d    Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE
Panel A: $m = [n^{0.5}]$
-0.3 -0.0087 0.1450 -0.0490 0.2708 -0.0546 0.3984 -0.0059 0.1385 -0.0309 0.2459 -0.0219 0.3769
0 -0.0123 0.1385 -0.0548 0.2659 -0.0616 0.3958 -0.0119 0.1387 -0.0500 0.2549 -0.0287 0.3724
0.3 -0.0162 0.1413 -0.0476 0.2631 -0.0670 0.3972 -0.0161 0.1423 -0.0510 0.2595 -0.0634 0.3893
0.7 0.0117 0.1480 -0.0432 0.2729 -0.0359 0.3975 -0.0020 0.1462 -0.0539 0.2667 -0.0464 0.4003
1 -0.0208 0.1273 -0.0542 0.2370 -0.0570 0.3352 -0.0221 0.1427 -0.0642 0.2609 -0.0663 0.3750
1.3 -0.1937 0.2314 -0.1935 0.2769 -0.1902 0.3459 -0.0138 0.1395 -0.0415 0.2568 -0.0423 0.3483
1.7 -0.5869 0.6103 -0.5540 0.5927 -0.5408 0.5979 -0.0053 0.1393 -0.0184 0.2256 -0.0153 0.3272
2 -0.9094 0.9256 -0.8703 0.8969 -0.8478 0.8851 -0.0146 0.1383 -0.0492 0.2547 -0.0352 0.3298
Panel B: $m = [n^{0.65}]$
-0.3 0.0008 0.0771 -0.0151 0.1283 -0.0176 0.1807 0.0007 0.0777 -0.0125 0.1223 -0.0091 0.1706
0 -0.0017 0.0763 -0.0168 0.1279 -0.0203 0.1740 -0.0017 0.0763 -0.0167 0.1275 -0.0198 0.1723
0.3 -0.0075 0.0783 -0.0230 0.1309 -0.0232 0.1783 -0.0071 0.0793 -0.0226 0.1319 -0.0192 0.1862
0.7 0.0100 0.0806 0.0031 0.1359 0.0019 0.1849 -0.0037 0.0781 -0.0097 0.1329 -0.0133 0.1821
1 -0.0140 0.0691 -0.0205 0.1165 -0.0287 0.1598 -0.0160 0.0772 -0.0278 0.1309 -0.0315 0.1759
1.3 -0.2141 0.2344 -0.1961 0.2293 -0.1904 0.2405 -0.0128 0.0797 -0.0169 0.1266 -0.0136 0.1810
1.7 -0.6229 0.6382 -0.5882 0.6103 -0.5709 0.5986 -0.0158 0.0770 -0.0100 0.1224 -0.0043 0.1625
2 -0.9506 0.9581 -0.9177 0.9311 -0.8990 0.9165 -0.0186 0.0785 -0.0210 0.1282 -0.0206 0.1789
Panel C: $m = [n^{0.8}]$
-0.3 0.0115 0.0448 -0.0021 0.0713 0.0004 0.0949 0.0115 0.0448 -0.0018 0.0705 0.0017 0.0907
0 -0.0040 0.0435 -0.0093 0.0709 -0.0107 0.0941 -0.0040 0.0435 -0.0093 0.0709 -0.0107 0.0941
0.3 -0.0093 0.0446 -0.0067 0.0711 -0.0067 0.0952 -0.0093 0.0446 -0.0067 0.0711 -0.0068 0.0950
0.7 -0.0147 0.0510 0.0079 0.0738 0.0083 0.0977 -0.0280 0.0537 -0.0075 0.0708 -0.0077 0.0917
1 -0.0363 0.0531 -0.0065 0.0603 -0.0082 0.0817 -0.0378 0.0574 -0.0090 0.0699 -0.0115 0.0933
1.3 -0.2528 0.2662 -0.2052 0.2261 -0.2001 0.2257 -0.0472 0.0644 -0.0085 0.0703 -0.0111 0.0944
1.7 -0.6847 0.6918 -0.6315 0.6431 -0.6187 0.6327 -0.0595 0.0734 -0.0007 0.0690 -0.0014 0.0912
2 -0.9950 1.0001 -0.9439 0.9523 -0.9308 0.9417 -0.0746 0.0876 -0.0129 0.0703 -0.0146 0.0954
Notes: LW, LPW, ExtLW, and ExtLPW denote the local Whittle estimator of Robinson (1995a), the
local polynomial Whittle estimator of Andrews & Sun (2004), the extended local Whittle estimator of
Abadir et al. (2007), and our proposed estimator, the extended local polynomial Whittle estimator,
respectively. r denotes the degree of parameterization of the polynomial, i.e. $P_r = \sum_{k=1}^{r} \theta_k \lambda_j^{2k}$.
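To make the LW column of Table 1 concrete, the following is a minimal sketch of the (non-extended) local Whittle estimator, assuming the standard objective $R(d) = \log\left( m^{-1} \sum_{j=1}^{m} \lambda_j^{2d} I_j \right) - 2d\, m^{-1} \sum_{j=1}^{m} \log \lambda_j$ of Robinson (1995a). It is not the code used for the simulations: the function names, the grid search, and the search range `d_min`, `d_max` are illustrative assumptions, and the series is plain Gaussian white noise, i.e. ARFIMA(0,d,0) with $d = 0$.

```python
import cmath
import math
import random

def periodogram(x, m):
    """I(lambda_j) = |sum_t x_t exp(i t lambda_j)|^2 / (2 pi n), j = 1, ..., m."""
    n = len(x)
    out = []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        w = sum(xt * cmath.exp(1j * lam * t) for t, xt in enumerate(x, start=1))
        out.append(abs(w) ** 2 / (2.0 * math.pi * n))
    return out

def local_whittle(x, m, grid_step=0.001, d_min=-0.49, d_max=0.99):
    """Grid-search minimiser of the local Whittle objective R(d).
    The search range and step are illustrative choices."""
    n = len(x)
    lam = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    I = periodogram(x, m)
    mean_log_lam = sum(math.log(l) for l in lam) / m

    def R(d):
        g = sum(l ** (2.0 * d) * i for l, i in zip(lam, I)) / m   # G_hat(d)
        return math.log(g) - 2.0 * d * mean_log_lam

    steps = int(round((d_max - d_min) / grid_step))
    grid = [d_min + k * grid_step for k in range(steps + 1)]
    return min(grid, key=R)

random.seed(12345)
n = 512
m = int(n ** 0.65)                                 # bandwidth as in Panel B
x = [random.gauss(0.0, 1.0) for _ in range(n)]     # white noise: true d = 0
d_hat = local_whittle(x, m)
```

Repeating the last three lines over many seeds and averaging `d_hat` and `d_hat ** 2` would give Monte Carlo bias and RMSE figures of the kind reported in the tables, although the numbers need not match since the simulation design here is only a stand-in.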
Table 2: Simulation results for ARFIMA(0,d,1) with $\theta = -0.8$ and n = 512.
     LW             LPW (r=1)      LPW (r=2)      ExtLW          ExtLPW (r=1)   ExtLPW (r=2)
d    Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE
Panel A: $m = [n^{0.5}]$
-0.3 -0.1159 0.1940 -0.0132 0.2611 0.0077 0.3638 -0.1184 0.1950 -0.0054 0.2493 0.0192 0.3509
0 -0.1669 0.2168 -0.0859 0.2771 -0.0816 0.3852 -0.1666 0.2167 -0.0801 0.2630 -0.0694 0.3417
0.3 -0.1606 0.2151 -0.0842 0.2659 -0.0609 0.3838 -0.1604 0.2156 -0.0944 0.2531 -0.0842 0.3558
0.7 -0.1444 0.2101 -0.0565 0.2659 -0.0412 0.3699 -0.0826 0.1932 0.0003 0.2822 0.0053 0.3946
1 -0.1275 0.1866 -0.0739 0.2553 -0.0611 0.3594 -0.1654 0.2164 -0.0879 0.2674 -0.0636 0.3742
1.3 -0.2494 0.2746 -0.2099 0.2973 -0.2036 0.3608 -0.1727 0.2248 -0.1027 0.2585 -0.0998 0.3398
1.7 -0.6037 0.6207 -0.5666 0.6007 -0.5428 0.6014 -0.0851 0.1889 -0.0074 0.2669 0.0142 0.3400
2 -0.9138 0.9276 -0.8721 0.8978 -0.8456 0.8858 -0.1677 0.2188 -0.0775 0.2573 -0.0470 0.3369
Panel B: $m = [n^{0.65}]$
-0.3 -0.3216 0.3365 -0.1284 0.1915 -0.0467 0.1859 -0.3378 0.3505 -0.1342 0.1972 -0.0463 0.1846
0 -0.3603 0.3699 -0.1773 0.2215 -0.1025 0.2091 -0.3580 0.3682 -0.1769 0.2210 -0.0999 0.2039
0.3 -0.3697 0.3794 -0.1875 0.2274 -0.1039 0.2030 -0.3697 0.3794 -0.1875 0.2274 -0.1041 0.2033
0.7 -0.3467 0.3586 -0.1609 0.2115 -0.0774 0.1976 -0.3388 0.3541 -0.1262 0.2098 -0.0325 0.2156
1 -0.3056 0.3225 -0.1326 0.1854 -0.0645 0.1754 -0.3653 0.3741 -0.1816 0.2215 -0.0984 0.1991
1.3 -0.3311 0.3372 -0.2439 0.2644 -0.2084 0.2545 -0.3748 0.3844 -0.1850 0.2271 -0.1040 0.2032
1.7 -0.6457 0.6523 -0.6045 0.6209 -0.5822 0.6075 -0.3493 0.3623 -0.1951 0.2299 -0.1501 0.2253
2 -0.9516 0.9581 -0.9168 0.9301 -0.8965 0.9153 -0.3674 0.3756 -0.1834 0.2245 -0.1026 0.2009
Panel C: $m = [n^{0.8}]$
-0.3 -0.4946 0.4996 -0.3523 0.3650 -0.2386 0.2636 -0.5094 0.5133 -0.3693 0.3797 -0.2290 0.2549
0 -0.5333 0.5362 -0.3930 0.4006 -0.2866 0.3028 -0.4455 0.4739 -0.3889 0.3958 -0.2859 0.3015
0.3 -0.5469 0.5500 -0.3992 0.4073 -0.2960 0.3128 -0.5469 0.5500 -0.3992 0.4073 -0.2960 0.3128
0.7 -0.5359 0.5397 -0.3712 0.3805 -0.2614 0.2808 -0.5359 0.5397 -0.3696 0.3798 -0.2183 0.2556
1 -0.4980 0.5046 -0.3223 0.3371 -0.2216 0.2462 -0.5145 0.5167 -0.3855 0.3920 -0.2836 0.2974
1.3 -0.4657 0.4730 -0.3425 0.3493 -0.2900 0.2988 -0.5739 0.5768 -0.3986 0.4065 -0.2928 0.3081
1.7 -0.7071 0.7086 -0.6516 0.6563 -0.6341 0.6427 -0.5625 0.5665 -0.3740 0.3834 -0.2601 0.2825
2 -1.0010 1.0035 -0.9502 0.9555 -0.9364 0.9444 -0.5301 0.5336 -0.3890 0.3962 -0.3080 0.3309
Notes: LW, LPW, ExtLW, and ExtLPW denote the local Whittle estimator of Robinson (1995a), the
local polynomial Whittle estimator of Andrews & Sun (2004), the extended local Whittle estimator of
Abadir et al. (2007), and our proposed estimator, the extended local polynomial Whittle estimator,
respectively. r denotes the degree of parameterization of the polynomial, i.e. $P_r = \sum_{k=1}^{r} \theta_k \lambda_j^{2k}$.
Table 3: Simulation results for ARFIMA(0,d,1) with $\theta = -0.5$ and n = 512.
     LW             LPW (r=1)      LPW (r=2)      ExtLW          ExtLPW (r=1)   ExtLPW (r=2)
d    Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE
Panel A: $m = [n^{0.5}]$
-0.3 -0.0293 0.1431 -0.0478 0.2630 -0.0522 0.3787 -0.0291 0.1429 -0.0364 0.2503 -0.0240 0.3537
0 -0.0365 0.1509 -0.0553 0.2649 -0.0646 0.3864 -0.0364 0.1505 -0.0496 0.2591 -0.0437 0.3666
0.3 -0.0384 0.1527 -0.0485 0.2676 -0.0506 0.3782 -0.0402 0.1505 -0.0578 0.2564 -0.0580 0.3645
0.7 -0.0154 0.1448 -0.0278 0.2758 -0.0470 0.3862 -0.0191 0.1450 -0.0355 0.2747 -0.0500 0.3926
1 -0.0329 0.1306 -0.0441 0.2481 -0.0453 0.3532 -0.0406 0.1476 -0.0509 0.2647 -0.0428 0.3865
1.3 -0.2058 0.2402 -0.2073 0.2842 -0.2119 0.3359 -0.0422 0.1496 -0.0615 0.2542 -0.0711 0.3341
1.7 -0.5915 0.6136 -0.5554 0.5931 -0.5365 0.5968 -0.0186 0.1394 -0.0027 0.2247 0.0105 0.3361
2 -0.9134 0.9269 -0.8732 0.8968 -0.8480 0.8828 -0.0383 0.1435 -0.0383 0.2535 -0.0236 0.3273
Panel B: $m = [n^{0.65}]$
-0.3 -0.0956 0.1225 -0.0233 0.1318 -0.0114 0.1763 -0.0962 0.1237 -0.0223 0.1291 -0.0074 0.1695
0 -0.1028 0.1288 -0.0365 0.1352 -0.0320 0.1733 -0.1028 0.1288 -0.0365 0.1349 -0.0313 0.1710
0.3 -0.1014 0.1269 -0.0334 0.1326 -0.0249 0.1802 -0.1014 0.1269 -0.0335 0.1329 -0.0248 0.1818
0.7 -0.0901 0.1241 -0.0170 0.1363 -0.0087 0.1782 -0.0956 0.1294 -0.0215 0.1383 -0.0135 0.1802
1 -0.0827 0.1112 -0.0247 0.1236 -0.0207 0.1654 -0.1082 0.1317 -0.0294 0.1280 -0.0216 0.1737
1.3 -0.2358 0.2479 -0.2047 0.2371 -0.1989 0.2462 -0.1135 0.1372 -0.0371 0.1390 -0.0351 0.1815
1.7 -0.6342 0.6452 -0.5990 0.6177 -0.5806 0.6057 -0.0969 0.1270 -0.0147 0.1290 -0.0031 0.1680
2 -0.9528 0.9590 -0.9192 0.9315 -0.8992 0.9165 -0.1168 0.1402 -0.0360 0.1377 -0.0272 0.1751
Panel C: $m = [n^{0.8}]$
-0.3 -0.2392 0.2441 -0.1047 0.1262 -0.0386 0.0975 -0.2278 0.2328 -0.1029 0.1232 -0.0374 0.0947
0 -0.2582 0.2629 -0.1142 0.1348 -0.0565 0.1099 -0.2582 0.2629 -0.1142 0.1348 -0.0565 0.1099
0.3 -0.2655 0.2696 -0.1117 0.1318 -0.0493 0.1037 -0.2655 0.2696 -0.1117 0.1318 -0.0493 0.1037
0.7 -0.2621 0.2679 -0.0966 0.1223 -0.0314 0.1042 -0.2583 0.2644 -0.1023 0.1254 -0.0356 0.1029
1 -0.2287 0.2390 -0.0781 0.1084 -0.0312 0.0916 -0.2852 0.2895 -0.1118 0.1324 -0.0486 0.1037
1.3 -0.3183 0.3200 -0.2354 0.2468 -0.2146 0.2359 -0.2966 0.3006 -0.1146 0.1369 -0.0530 0.1101
1.7 -0.6865 0.6913 -0.6289 0.6394 -0.6128 0.6278 -0.2818 0.2884 -0.1125 0.1388 -0.0490 0.1148
2 -0.9998 1.0036 -0.9493 0.9567 -0.9370 0.9468 -0.3191 0.3229 -0.1169 0.1359 -0.0543 0.1058
Notes: LW, LPW, ExtLW, and ExtLPW denote the local Whittle estimator of Robinson (1995a), the
local polynomial Whittle estimator of Andrews & Sun (2004), the extended local Whittle estimator of
Abadir et al. (2007), and our proposed estimator, the extended local polynomial Whittle estimator,
respectively. r denotes the degree of parameterization of the polynomial, i.e. $P_r = \sum_{k=1}^{r} \theta_k \lambda_j^{2k}$.
Table 4: Simulation results for ARFIMA(1,d,0) with $\phi = 0.8$ and n = 512.
     LW             LPW (r=1)      LPW (r=2)      ExtLW          ExtLPW (r=1)   ExtLPW (r=2)
d    Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE   Bias    RMSE
Panel A: $m = [n^{0.5}]$
-0.3 0.1530 0.2114 -0.0133 0.2535 -0.0323 0.3805 0.1530 0.2114 -0.0012 0.2475 0.0080 0.3891
0 0.1404 0.2015 -0.0290 0.2553 -0.0572 0.3771 0.1413 0.2034 -0.0248 0.2552 -0.0260 0.3647
0.3 0.1436 0.2030 -0.0266 0.2505 -0.0578 0.3806 0.1477 0.2088 -0.0237 0.2511 -0.0479 0.3733
0.7 0.1518 0.2117 0.0021 0.2495 -0.0041 0.3575 0.1421 0.2039 -0.0129 0.2413 -0.0126 0.3671
1 0.0751 0.1567 -0.0265 0.2323 -0.0582 0.3499 0.1385 0.2036 -0.0201 0.2608 -0.0431 0.3671
1.3 -0.1604 0.2210 -0.1813 0.2621 -0.1960 0.3381 0.1511 0.2108 0.0029 0.2683 -0.0195 0.3720
1.7 -0.5879 0.6133 -0.5554 0.5946 -0.5425 0.6058 0.1383 0.1978 0.0115 0.2225 0.0040 0.3320
2 -0.9022 0.9199 -0.8565 0.8878 -0.8322 0.8761 0.1407 0.2023 0.0088 0.2670 0.0192 0.3579
Panel B: $m = [n^{0.65}]$
-0.3 0.4058 0.4144 0.1595 0.2037 0.0584 0.1855 0.4074 0.4171 0.1597 0.2036 0.0624 0.1791
0 0.4053 0.4138 0.1634 0.2103 0.0602 0.1916 0.4163 0.4276 0.1637 0.2112 0.0609 0.1937
0.3 0.4026 0.4106 0.1597 0.2078 0.0658 0.1873 0.4114 0.4195 0.1692 0.2216 0.0722 0.1983
0.7 0.3729 0.3816 0.1671 0.2121 0.0766 0.1898 0.3962 0.4049 0.1565 0.2064 0.0609 0.1853
1 0.1873 0.2329 0.0985 0.1590 0.0338 0.1653 0.4045 0.4154 0.1732 0.2295 0.0707 0.2103
1.3 -0.1653 0.2412 -0.1598 0.2219 -0.1681 0.2314 0.4058 0.4139 0.1935 0.2335 0.1281 0.2201
1.7 -0.6165 0.6385 -0.5792 0.6080 -0.5606 0.5938 0.3895 0.3988 0.1600 0.2082 0.0842 0.2217
2 -0.9361 0.9489 -0.8980 0.9193 -0.8765 0.9034 0.3864 0.3940 0.2500 0.3172 0.2115 0.3369
Panel C: $m = [n^{0.8}]$
-0.3 0.6635 0.6655 0.4595 0.4665 0.3122 0.3279 0.6687 0.6713 0.4595 0.4665 0.3122 0.3279
0 0.6490 0.6510 0.4583 0.4648 0.3058 0.3202 0.6603 0.6623 0.4641 0.4721 0.3061 0.3209
0.3 0.6367 0.6386 0.4640 0.4701 0.3132 0.3278 0.6406 0.6425 0.4731 0.4791 0.3226 0.3384
0.7 0.5339 0.5415 0.4318 0.4393 0.3026 0.3175 0.6209 0.6232 0.4649 0.4713 0.3088 0.3244
1 0.1922 0.2685 0.2135 0.2625 0.1693 0.2101 0.6120 0.6143 0.4876 0.4958 0.3438 0.3641
1.3 -0.2298 0.2829 -0.1696 0.2415 -0.1630 0.2294 0.5840 0.5863 0.4694 0.4757 0.3290 0.3434
1.7 -0.6783 0.6883 -0.6187 0.6384 -0.6017 0.6267 0.5514 0.5547 0.4572 0.4637 0.3158 0.3378
2 -1.0016 1.0052 -0.9513 0.9584 -0.9386 0.9486 0.4845 0.4854 0.4534 0.4568 0.4315 0.4466
Notes: LW, LPW, ExtLW, and ExtLPW denote the local Whittle estimator of Robinson (1995a), the
local polynomial Whittle estimator of Andrews & Sun (2004), the extended local Whittle estimator of
Abadir et al. (2007), and our proposed estimator, the extended local polynomial Whittle estimator,
respectively. r denotes the degree of parameterization of the polynomial, i.e. $P_r = \sum_{k=1}^{r} \theta_k \lambda_j^{2k}$.
Table 5: Simulation results for ARFIMA(1,d,0) with AR coefficient 0.5 and n = 512.

        LW              LPW (r=1)       LPW (r=2)       ExtLW           ExtLPW (r=1)    ExtLPW (r=2)
d         Bias    RMSE    Bias    RMSE    Bias    RMSE    Bias    RMSE    Bias    RMSE    Bias    RMSE

Panel A: $m = \lfloor n^{0.5} \rfloor$
-0.3    0.0108  0.1404 -0.0426  0.2469 -0.0382  0.3678  0.0123  0.1369 -0.0305  0.2400 -0.0107  0.3774
0       0.0088  0.1441 -0.0598  0.2682 -0.0818  0.3961  0.0086  0.1448 -0.0534  0.2642 -0.0529  0.3668
0.3     0.0167  0.1470 -0.0621  0.2660 -0.0754  0.4006  0.0174  0.1501 -0.0616  0.2690 -0.0690  0.3854
0.7     0.0295  0.1451 -0.0210  0.2485 -0.0165  0.3702  0.0131  0.1389 -0.0350  0.2460 -0.0219  0.3864
1       0.0104  0.1251 -0.0399  0.2309 -0.0464  0.3604  0.0159  0.1391 -0.0450  0.2528 -0.0415  0.4000
1.3    -0.1915  0.2349 -0.2109  0.2887 -0.2113  0.3579  0.0043  0.1407 -0.0547  0.2594 -0.0395  0.3600
1.7    -0.5861  0.6113 -0.5580  0.5953 -0.5428  0.6002  0.0121  0.1422 -0.0217  0.2276 -0.0136  0.3212
2      -0.9062  0.9227 -0.8612  0.8891 -0.8321  0.8728  0.0094  0.1363 -0.0286  0.2457 -0.0047  0.3275

Panel B: $m = \lfloor n^{0.65} \rfloor$
-0.3    0.0950  0.1221 -0.0070  0.1296 -0.0261  0.1750  0.0950  0.1221 -0.0052  0.1263 -0.0185  0.1624
0       0.0916  0.1194 -0.0064  0.1260 -0.0234  0.1734  0.0916  0.1194 -0.0064  0.1259 -0.0226  0.1707
0.3     0.0921  0.1191 -0.0060  0.1256 -0.0191  0.1741  0.0946  0.1240 -0.0049  0.1285 -0.0174  0.1776
0.7     0.1027  0.1300  0.0168  0.1320  0.0027  0.1756  0.0913  0.1208 -0.0014  0.1262 -0.0146  0.1712
1       0.0476  0.0872 -0.0110  0.1205 -0.0237  0.1710  0.0827  0.1130 -0.0104  0.1317 -0.0216  0.1833
1.3    -0.1954  0.2285 -0.1931  0.2309 -0.1906  0.2428  0.0859  0.1181  0.0065  0.1407 -0.0026  0.1892
1.7    -0.6226  0.6379 -0.5876  0.6102 -0.5697  0.5974  0.0808  0.1137 -0.0012  0.1286 -0.0093  0.1665
2      -0.9521  0.9602 -0.9200  0.9340 -0.9012  0.9197  0.0823  0.1135 -0.0068  0.1350 -0.0134  0.1877

Panel C: $m = \lfloor n^{0.8} \rfloor$
-0.3    0.3043  0.3081  0.1112  0.1331  0.0402  0.1017  0.3043  0.3081  0.1112  0.1331  0.0404  0.1012
0       0.2897  0.2937  0.1054  0.1274  0.0330  0.1019  0.2897  0.2937  0.1054  0.1274  0.0330  0.1019
0.3     0.2804  0.2844  0.1064  0.1273  0.0364  0.0988  0.2892  0.2935  0.1072  0.1293  0.0364  0.0989
0.7     0.2588  0.2632  0.1173  0.1385  0.0484  0.1105  0.2631  0.2673  0.1081  0.1292  0.0360  0.1002
1       0.1107  0.1465  0.0676  0.0966  0.0255  0.0864  0.2497  0.2542  0.1088  0.1323  0.0366  0.1014
1.3    -0.2351  0.2676 -0.1952  0.2263 -0.1958  0.2240  0.2460  0.2512  0.1322  0.1574  0.0644  0.1258
1.7    -0.6751  0.6867 -0.6195  0.6366 -0.6049  0.6256  0.2178  0.2232  0.1103  0.1317  0.0378  0.1010
2      -0.9966  1.0010 -0.9449  0.9528 -0.9317  0.9419  0.1978  0.2041  0.1068  0.1293  0.0332  0.1030
Notes: LW, LPW, ExtLW, and ExtLPW denote the local Whittle estimator of Robinson (1995a), the
local polynomial Whittle estimator of Andrews & Sun (2004), the extended local Whittle estimator of
Abadir et al. (2007), and our proposed extended local polynomial Whittle estimator,
respectively. r denotes the degree of the polynomial, i.e. $P_r = \sum_{k=1}^{r} \theta_k \lambda_j^{2k}$.
Figure 1: Time series plot of log yields (Panel A) and their respective spreads (Panel B).
[Plot not reproduced. Panel A series: Aaa, Baa, Treas. Panel B series: sBaaAaa, sBaaTreas, sAaaTreas.]
Figure 2: Estimated long memory of log Aaa yield for bandwidth equal to 50 through 2000.
[Plot not reproduced. Panel A: LW and LPW estimates with ±2 s.e. bands. Panel B: ExtLW and ExtLPW estimates with ±2 s.e. bands. Horizontal axis: bandwidth.]
Figure 3: Estimated long memory of log Baa yield for bandwidth equal to 50 through 2000.
[Plot not reproduced. Panel A: LW and LPW estimates with ±2 s.e. bands. Panel B: ExtLW and ExtLPW estimates with ±2 s.e. bands. Horizontal axis: bandwidth.]
Figure 4: Estimated long memory of log Treasury yield for bandwidth equal to 50 through 2000.
[Plot not reproduced. Panel A: LW and LPW estimates with ±2 s.e. bands. Panel B: ExtLW and ExtLPW estimates with ±2 s.e. bands. Horizontal axis: bandwidth.]
Figure 5: Estimated long memory of Aaa spread over Treasury yield for bandwidth equal to 50 through 2000.
[Plot not reproduced. Panel A: LW and LPW estimates with ±2 s.e. bands. Panel B: ExtLW and ExtLPW estimates with ±2 s.e. bands. Horizontal axis: bandwidth.]
Figure 6: Estimated long memory of Baa spread over Treasury yield for bandwidth equal to 50 through 2000.
[Plot not reproduced. Panel A: LW and LPW estimates with ±2 s.e. bands. Panel B: ExtLW and ExtLPW estimates with ±2 s.e. bands. Horizontal axis: bandwidth.]
Figure 7: Estimated long memory of Baa spread over Aaa yield for bandwidth equal to 50 through 2000.
[Plot not reproduced. Panel A: LW and LPW estimates with ±2 s.e. bands. Panel B: ExtLW and ExtLPW estimates with ±2 s.e. bands. Horizontal axis: bandwidth.]
Chapter 3
Local polynomial Whittle estimation of perturbed fractional processes*
Per Frederiksen
Equity Trading & Derivatives, Nordea Markets

Frank S. Nielsen†
Aarhus University and CREATES

Morten Ørregaard Nielsen
Queen's University and CREATES

March 20, 2009
Abstract
We propose a semiparametric local polynomial Whittle with noise (LPWN) estimator of the memory
parameter in long memory time series perturbed by a noise term which may be serially correlated.
The estimator approximates the spectrum of the perturbation as well as that of the short-memory
component of the signal by two separate polynomials. Including these polynomials we obtain a
reduction in the order of magnitude of the bias, but also inflate the asymptotic variance of the
long memory estimate by a multiplicative constant. We show that the estimator is consistent for
$d \in (0,1)$, asymptotically normal for $d \in (0,3/4)$, and if the spectral density is infinitely smooth
near frequency zero, the rate of convergence can become arbitrarily close to the parametric rate,
$\sqrt{n}$. A Monte Carlo study reveals that the LPWN estimator performs well in the presence of
a serially correlated perturbation term. Furthermore, an empirical investigation of the 30 DJIA
stocks shows that this estimator indicates stronger persistence in volatility than the standard local
Whittle estimator.

JEL Classifications: C22.

Keywords: Bias reduction, local Whittle, long memory, perturbed fractional process, semiparametric
estimation, stochastic volatility.

* We are grateful to Torben G. Andersen, Jörg Breitung, Niels Haldrup, Esben Høg, Asger Lunde, and the participants
at SETA 2008 for valuable suggestions and comments. This work was partly done while P. Frederiksen was visiting
Northwestern University, F. S. Nielsen was visiting Cornell University, and M. Ø. Nielsen was visiting Queen's University
and the University of Aarhus; their hospitality is gratefully acknowledged. We are grateful for financial support from
the Danish Social Sciences Research Council (grant no. FSE 275-05-0220) and the Center for Research in Econometric
Analysis of Time Series (CREATES, funded by the Danish National Research Foundation).

† Please address correspondence to: Frank S. Nielsen, School of Economics and Management, Aarhus University,
Universitetsparken building 1322, 8000 Aarhus, Denmark; phone: +45 8942 5419; e-mail: [email protected]
1 Introduction
We are interested in estimation of the memory parameter in a so-called perturbed fractional process,

$$ z_t = y_t + w_t, \qquad (1) $$

i.e. a signal-plus-noise model where the signal process $y_t$ is a long memory process with memory
parameter $d$ which is perturbed by the additive noise term $w_t$. These processes have found extensive
use in modeling the long memory characteristics of many observed time series. In particular, they
are a version of the random walk plus noise or local level unobserved components model, e.g. Harvey
(1989), except the signal is a long memory process rather than a random walk.
Another motivation for the perturbed fractional process is the version of the long memory stochastic
volatility (LMSV) model for financial returns proposed by Bollerslev & Jubinski (1999),

$$ r_t = \sigma \sqrt{e^{y_t + x_t}}\, u_t, \qquad (2) $$

where $r_t$ denotes the return, $y_t$ is the (long memory component of) log-volatility of the returns, $x_t$ is a
short-memory process, and $y_t$, $x_t$, and $u_t$ are independent to satisfy the requirement that $E(r_t) = 0$.
This generalizes the usual LMSV model introduced by Breidt, Crato & de Lima (1998) and Harvey
(1998),

$$ r_t = \sigma \sqrt{e^{y_t}}\, u_t, \qquad (3) $$

by arguing that allowing for different short-lived news impacts, while imposing a common long memory
component, may provide a better characterization of the joint volume-volatility relationship in
the context of the Mixture of Distributions Hypothesis, which asserts that stock returns and trading
volumes are jointly dependent on the same underlying latent information arrival process. The formulation
in (2) allows the volatility to be affected by both long and short-lived news impacts, which is
also consistent with the findings of Liesenfeld (2001). It therefore seems natural that an estimator of
the memory in $\log r_t^2$ should be able to incorporate both (2) and (3).
The LMSV models (2) and (3) imply that a logarithmic transformation of the squared returns
series $\log r_t^2$ becomes a long memory signal-plus-noise process (1) where the signal $y_t$ corresponds to
(the long memory component of) the log-volatility of the original returns series and $w_t$ is an additive
noise term. In the context of the LMSV model (3), $w_t$ is usually assumed to be i.i.d., but to allow for
short-memory persistence in $w_t$ as implied by (2) we will not make that restriction here. In general,
when $w_t$ is not assumed to be i.i.d., $z_t$ is referred to as a perturbed fractional process.¹ For reviews of
fractionally integrated processes and some applications, see Baillie (1996), Henry & Zaffaroni (2003), or
Robinson (2003). In particular, long memory in volatility has received considerable interest recently.²
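The log-squared transformation can be checked numerically. The following is a minimal Python sketch, not part of the original paper; it assumes that model (2) reads $r_t = \sigma\sqrt{e^{y_t+x_t}}\,u_t$, so that $\log r_t^2 = \log\sigma^2 + y_t + x_t + \log u_t^2$. The function names and the numerical values are illustrative assumptions.

```python
import math


def lmsv_return(sigma, y_t, x_t, u_t):
    """Return r_t under the assumed form of model (2): sigma * sqrt(exp(y_t + x_t)) * u_t."""
    return sigma * math.sqrt(math.exp(y_t + x_t)) * u_t


def log_squared_decomposition(sigma, y_t, x_t, u_t):
    """Signal-plus-noise form of log r_t^2: a constant, the signal y_t, and the noise x_t + log u_t^2."""
    return math.log(sigma ** 2) + y_t + x_t + math.log(u_t ** 2)


# Illustrative (hypothetical) values for one time period.
r = lmsv_return(sigma=0.8, y_t=0.3, x_t=-0.1, u_t=1.7)
gap = abs(math.log(r ** 2) - log_squared_decomposition(0.8, 0.3, -0.1, 1.7))
print(gap < 1e-12)
```

Setting `x_t = 0` gives the corresponding decomposition for model (3), where the noise reduces to $\log u_t^2$ (up to its mean).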
If we assume that the log-volatility process $\{y_t\}$ and the noise process $\{w_t\}$ are independent, the
spectral density of $z_t$ can be written as

$$ f_z(\lambda) = \lambda^{-2d}\phi_y(\lambda) + \phi_w(\lambda) = \lambda^{-2d} G \left( \frac{\phi_y(\lambda)}{\phi_y(0)} + \lambda^{2d}\frac{\phi_w(\lambda)}{\phi_y(0)} \right), \qquad (4) $$

where $f_y(\lambda) = \lambda^{-2d}\phi_y(\lambda)$ is the spectrum of the signal $y_t$, $\phi_w(\lambda)$ is the spectrum of the noise term $w_t$,
and $d$ is the degree of long memory in $y_t$ (or equivalently in $z_t$).

¹ In the following we use the terms "long memory process" and "fractionally integrated process" or just "fractional
process" synonymously, although strictly speaking a fractional process is just a particular form of a long memory process.

² See, e.g., Ding, Granger & Engle (1993), Baillie, Bollerslev & Mikkelsen (1996), Comte & Renault (1998), Ray &
Tsay (2000), Andersen, Bollerslev, Diebold & Ebens (2001), Andersen, Bollerslev, Diebold & Labys (2001, 2003), Wright
(2002), Hurvich & Ray (2003), and Arteche (2004) among others.
The assumption of independence between the processes $\{y_t\}$ and $\{w_t\}$ rules out the so-called
leverage effect. This assumption is common in the so-called random walk plus noise unobserved
components models, and has also been imposed by Breidt et al. (1998), Deo & Hurvich (2001), and
Arteche (2004), among others, in the LMSV model. To accommodate the leverage effect, we could
allow contemporaneous correlation, while the return process remains a martingale difference sequence
by replacing $y_t$ with $y_{t-1}$ in (2). An additional assumption of distributional symmetry around $(0,0)$
would imply that the spectral density decomposition in (4) holds, see Hurvich, Moulines & Soulier
(2005). Alternatively, the model could be modified along the lines of model (P2) of Hurvich et al.
(2005).
In semiparametric spectral estimation of long memory models, the spectrum (4) is typically
approximated using the periodogram of the data near the zero frequency, i.e. for frequencies up to
$\lambda_m = 2\pi m/n$ only, where $n$ is the sample size and $m$ is a user-chosen bandwidth number (see sections
2 and 3 below) which tends to infinity slower than $n$ such that $\lambda_m \to 0$. Although the popular
log-periodogram regression (LPR) estimator of Geweke & Porter-Hudak (1983) and Robinson (1995b) and
the local Whittle (LW) estimator of Künsch (1987) and Robinson (1995a) both preserve consistency
and asymptotic normality when applied to perturbed fractional processes, as shown recently by Deo
& Hurvich (2001) and Arteche (2004), these estimators can be severely biased since they do not take
the perturbation into account. Indeed, for non-perturbed processes (where $\phi_w(\lambda) = 0$) the bias of
the standard semiparametric frequency domain estimators is of order $O(\lambda_m^2)$, whereas the leading bias
term when $\phi_w(\lambda) \neq 0$ is of order $O(\lambda_m^{2d})$. As shown in Deo & Hurvich (2001) and Arteche (2004), this
bias is typically negative and can be very large (note that $d < 1$). Therefore, estimating long memory
in perturbed time series can be a challenging task, and calls for an estimator which explicitly accounts
for the perturbation.
Sun & Phillips (2003), Hurvich & Ray (2003), and Hurvich et al. (2005) have proposed such
estimators with $\phi_y(\lambda)$ and $\phi_w(\lambda)$ approximated by constants as $\lambda \to 0$, see section 2 below. On the
other hand, we propose an estimator where we allow both the spectrum of the perturbation and the
spectrum of the short-memory component of the signal, i.e. $\phi_w(\lambda)$ and $\phi_y(\lambda)$, to be approximated by
polynomials $h_w(\theta_w, \lambda)$ and $h_y(\theta_y, \lambda)$ of (finite and even) orders $2R_w$ and $2R_y$ near the zero frequency,
instead of constants, thereby obtaining a bias reduction depending on the smoothness of $\phi_w(\lambda)$ and
$\phi_y(\lambda)$ near the origin. The approach taken here in modeling the short-run dynamics by a polynomial
was introduced by Andrews & Sun (2004) for non-perturbed processes, but is novel in the context
of perturbed fractional processes. To maintain generality, $\phi_w(\lambda)$ and $\phi_y(\lambda)$ are only characterized by
regularity conditions near frequency zero instead of imposing specific functional forms.
The LMSV model (3) often assumes that the noise term is i.i.d., in which case $\phi_w(\lambda) = \sigma_w^2/(2\pi)$ is
a constant. This case is of independent interest and is considered in simulations and in an empirical
study in Frederiksen & Nielsen (2008). In that paper $\phi_y(\lambda)$ is approximated by a polynomial and
$\phi_w(\lambda)$ by a constant as $\lambda \to 0$, thus focusing on exactly the LMSV model (3). However, the theory for
their estimator is developed in the present paper.

Thus, to allow serial dependence in the noise as in (2) above we include both polynomials, $h_y(\theta_y, \lambda)$
and $h_w(\theta_w, \lambda)$. Furthermore, empirical studies have typically found that the noise term has much
higher (long-run) variance than the short-memory component of the signal. Indeed, Breidt et al.
(1998) and Hurvich & Ray (2003) find that the noise term may be as much as 10 or 20 times as
variable as the short-memory component of the signal. Thus, careful modeling of the noise term is
important and this consideration has led us to approximate the spectrum of the noise term by a
polynomial instead of a constant as $\lambda \to 0$.
Our results show that introducing $h_y(\theta_y, \lambda)$ and $h_w(\theta_w, \lambda)$ inflates the asymptotic variance of
the long memory estimator, $\hat d$, by a multiplicative constant which depends on the true long memory
parameter, $d$. However, the inflation decreases when $d$ increases, and we obtain a reduction in the
order of magnitude of the bias if $\phi(\lambda)$ is sufficiently smooth near frequency zero. We show that the
estimator is consistent for $d \in (0,1)$, asymptotically normal for $d \in (0,3/4)$, and if $\phi(\lambda)$ is infinitely
smooth near frequency zero, the rate of convergence can become arbitrarily close to the parametric
rate, $n^{1/2}$. This constitutes a rate of convergence improvement relative to Sun & Phillips (2003),
Hurvich & Ray (2003), and Hurvich et al. (2005) who are only able to obtain a semiparametric rate
of convergence $m^{1/2}$, which is much slower than the parametric rate due to the minimal requirement
that $m/n \to 0$.
We present the results of a Monte Carlo study which shows the usefulness of the proposed LPWN
estimator. Compared to standard estimators, such as Hurvich & Ray's (2003) local Whittle with noise
(LWN) estimator, the LPWN estimator is able to achieve considerable bias reductions in practice,
especially in cases with short-run dynamics in both the signal and noise components. We also include an
empirical application to the 30 DJIA stocks where the LPWN estimator indicates stronger persistence
in volatility than the standard estimators, and for most of the stocks produces estimates of $d$ in the
nonstationary region.
The remainder of the paper is organized as follows. In the next section we discuss semiparametric
spectral estimation of long memory for perturbed processes and formally define the proposed local
Whittle estimator. In section 3 we establish consistency and asymptotic normality of the estimator.
Section 4 investigates the finite sample performance in simulations, and section 5 presents an empirical
study of daily log-squared returns series of the 30 DJIA stocks. Section 6 concludes. The proofs of
our theorems are gathered in the appendix.
2 Local Whittle estimation of perturbed fractional processes
Semiparametric frequency-domain estimators are essentially based on the local approximation

$$ f_z(\lambda) \sim G\lambda^{-2d} \quad \text{as } \lambda \to 0, \qquad (5) $$

where $G$ is a constant and the symbol "$\sim$" means that the ratio of the left and right hand sides tends
to one in the limit. Thus, the estimators enjoy robustness to short-run dynamics, since they use only
information from periodogram ordinates in the vicinity of the origin.
The local Whittle (LW) estimation method by Künsch (1987) and Robinson (1995a) has become
popular because of its likelihood interpretation, nice asymptotic properties, and mild assumptions. It
is defined as the minimizer of the (negative) local Whittle likelihood function

$$ Q(G,d) = \frac{1}{m}\sum_{j=1}^{m}\left[\log\left(G\lambda_j^{-2d}\right) + \frac{I_z(\lambda_j)}{G\lambda_j^{-2d}}\right], \qquad (6) $$

where $m = m(n)$ is a bandwidth number which tends to infinity as $n \to \infty$ but at a slower rate than
$n$, $\lambda_j = 2\pi j/n$ are the Fourier frequencies, and $I_z(\lambda) = (2\pi n)^{-1}\left|\sum_{t=1}^{n} z_t e^{it\lambda}\right|^2$ is the periodogram of
$z_t$. Note that the estimator is invariant to a non-zero mean since $j = 0$ is left out of the minimization.
Concentrating (6) with respect to $G$, the estimator of $d$ is

$$ \hat d_{LW} = \arg\min_{d}\left[\log\hat G(d) - 2d\,\frac{1}{m}\sum_{j=1}^{m}\log\lambda_j\right], \qquad \hat G(d) = \frac{1}{m}\sum_{j=1}^{m}\lambda_j^{2d} I_z(\lambda_j). $$

It was shown by Robinson (1995a) that

$$ \sqrt{m}\,(\hat d_{LW} - d) \xrightarrow{d} N(0, 1/4), \qquad (7) $$

and later by Velasco (1999) that the range of consistency is $d \in (-1/2, 1]$ and the range of asymptotic
normality is $d \in (-1/2, 3/4)$.
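The concentrated local Whittle objective is simple to compute. Below is a minimal Python sketch, not the authors' implementation, that uses only the standard library. The function names, the search interval, and the golden-section search are illustrative assumptions; the search is adequate here because the concentrated objective is a log-sum-exp of linear functions of $d$ minus a linear term, and therefore convex in $d$.

```python
import cmath
import math


def periodogram(z, m):
    """Fourier frequencies lambda_j = 2*pi*j/n and I_z(lambda_j) for j = 1, ..., m."""
    n = len(z)
    freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    perio = []
    for lam in freqs:
        dft = sum(z_t * cmath.exp(1j * (t + 1) * lam) for t, z_t in enumerate(z))
        perio.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return freqs, perio


def lw_objective(d, freqs, perio):
    """Concentrated objective: log G_hat(d) - 2*d*mean(log lambda_j)."""
    m = len(freqs)
    g_hat = sum(lam ** (2.0 * d) * i_j for lam, i_j in zip(freqs, perio)) / m
    return math.log(g_hat) - 2.0 * d * sum(math.log(lam) for lam in freqs) / m


def lw_estimate(freqs, perio, lo=-0.5, hi=1.0, tol=1e-8):
    """Golden-section search for the minimizer of the convex objective on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, e = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fe = lw_objective(c, freqs, perio), lw_objective(e, freqs, perio)
    while b - a > tol:
        if fc < fe:
            b, e, fe = e, c, fc
            c = b - inv_phi * (b - a)
            fc = lw_objective(c, freqs, perio)
        else:
            a, c, fc = c, e, fe
            e = a + inv_phi * (b - a)
            fe = lw_objective(e, freqs, perio)
    return 0.5 * (a + b)


# Sanity check on an idealized "periodogram" that equals lambda^{-2*0.3} exactly.
n, m = 256, 30
freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
d_hat = lw_estimate(freqs, [lam ** (-0.6) for lam in freqs])
print(abs(d_hat - 0.3) < 1e-4)
```

For observed data one would call `periodogram(z, m)` first and pass its output to `lw_estimate`. The periodogram of a constant series vanishes at the nonzero Fourier frequencies, which is the mean-invariance property noted in the text.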
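To reduce the asymptotic bias of the standard LW estimator, Andrews & Sun (2004) have suggested
to replace the constant, $\log G$, in (6) by the polynomial $\log G - \sum_{r=1}^{R}\theta_r\lambda_j^{2r}$. That is, by modeling the
logarithm of the spectral density of the short-run component by a polynomial instead of a constant in
the vicinity of the origin. This leads to the following (negative) likelihood function,

$$ Q(G,d,\theta) = \frac{1}{m}\sum_{j=1}^{m}\left[\log\left(\lambda_j^{-2d} G \exp\left(-\sum_{r=1}^{R}\theta_r\lambda_j^{2r}\right)\right) + \frac{I_z(\lambda_j)}{\lambda_j^{-2d} G \exp\left(-\sum_{r=1}^{R}\theta_r\lambda_j^{2r}\right)}\right], $$

such that

$$ (\hat d_{LPW}, \hat\theta) = \arg\min_{d \in (-1/2,1/2),\,\theta \in \Theta}\left[\log\hat G(d,\theta) - 2d\,\frac{1}{m}\sum_{j=1}^{m}\log\lambda_j - \frac{1}{m}\sum_{j=1}^{m}\sum_{r=1}^{R}\theta_r\lambda_j^{2r}\right], $$

$$ \hat G(d,\theta) = \frac{1}{m}\sum_{j=1}^{m}\lambda_j^{2d}\exp\left(\sum_{r=1}^{R}\theta_r\lambda_j^{2r}\right) I_z(\lambda_j), $$

where $\Theta$ is a compact and convex set in $\mathbb{R}^{R}$. As shown by Andrews & Sun (2004), this method does,
however, increase the asymptotic variance of $\hat d$ in (7) by a multiplicative constant.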
For non-perturbed fractional processes, the asymptotic bias of $\hat d_{LW}$ and $\hat d_{LPW}$ is of order $O(\lambda_m^2)$
and $O(\lambda_m^{\min\{s,2+2R\}})$, respectively, where $s$ is a measure of the smoothness of the spectral density near
frequency zero, see below. However, for perturbed fractional processes the bias is of order $O(\lambda_m^{2d})$ and,
as shown by e.g. Hurvich & Ray (2003) and Arteche (2004), this bias is typically negative and can be
very severe.
For perturbed fractional processes we have the spectral representation (4) rather than (5). There
are two main consequences: first, the extra additive term in (4) needs to be taken into account to avoid
serious asymptotic bias as mentioned above, and second the rate of convergence of the estimators is
reduced if the extra term is not modeled. The latter follows because the choice of bandwidth parameter
is severely constrained for perturbed fractional processes when the perturbation term in (4) is not
modeled. Thus, for non-perturbed processes the bandwidth requirement is typically $m = o(n^{4/5})$,
whereas for perturbed processes it is $m = o(n^{2d/(1+2d)})$ (apart from logarithmic terms). Since $d \le 1$
and the estimator is $\sqrt{m}$-consistent this is a serious constraint.
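To allow for (moderate) nonstationarity in volatility we generalize (1) as

$$ z_t = \begin{cases} y_t + w_t & \text{if } d \in (0, 1/2), \\ \sum_{s=1}^{t} x_s + w_t & \text{if } d \in [1/2, 1), \end{cases} \qquad (8) $$

where, if $d \in [1/2, 1)$, $x_t$ has spectrum of the form $f_x(\lambda) = \lambda^{-2d_x}\phi_x(\lambda)$ with memory parameter
$d_x = d - 1$. Defining $y_t = \sum_{s=1}^{t} x_s$ if $d \in [1/2, 1)$, this approach allows $z_t = y_t + w_t$ to possibly be
nonstationary with memory parameter $d \in (0, 1)$. Velasco (1999), Hurvich & Ray (2003), and Hurvich
et al. (2005) also assume this type of process. Since $\{\sum_{s=1}^{t} x_s\}$ is nonstationary,³ $z_t$ does not have a
spectral density if $d \in [1/2, 1)$ but it has a pseudo spectral density, see e.g. Hurvich & Ray (1995) and
Velasco (1999). Thus, we may define

$$ f_z(\lambda) = \begin{cases} f_y(\lambda) + f_w(\lambda) & \text{if } d \in (0, 1/2), \\ \left|1 - e^{i\lambda}\right|^{-2} f_x(\lambda) + f_w(\lambda) & \text{if } d \in [1/2, 1), \end{cases} = \lambda^{-2d} G \left( \frac{\phi_y(\lambda)}{\phi_y(0)} + \lambda^{2d}\frac{\phi_w(\lambda)}{\phi_y(0)} \right), \qquad (9) $$

where we maintain the assumption of independence between $\{y_t\}$ and $\{w_t\}$.

Taking (9) into account we propose to approximate (4) locally near the zero frequency by⁴

$$ g(\lambda) = \lambda^{-2d} G \left( 1 + h_y(\theta_y, \lambda) + \lambda^{2d} h_w(\theta_w, \lambda) \right), \qquad (10) $$

where $h_y(\theta_y, \lambda) = \sum_{r=1}^{R_y}\theta_{y,r}\lambda^{2r}$ and $h_w(\theta_w, \lambda) = \sum_{r=0}^{R_w}\theta_{w,r}\lambda^{2r}$. If $R_y = 0$ we set $h_y(\theta_y, \lambda) = 0$. Defining
also the polynomial $h(d, \theta, \lambda) = h_y(\theta_y, \lambda) + \lambda^{2d} h_w(\theta_w, \lambda)$ with $\theta = (\theta_y', \theta_w')'$ this yields the (concentrated)
likelihood

$$ Q(d, \theta) = \log\hat G(d, \theta) + \frac{1}{m}\sum_{j=1}^{m}\log\left(\lambda_j^{-2d}\left(1 + h(d, \theta, \lambda_j)\right)\right), \qquad (11) $$

$$ \hat G(d, \theta) = \frac{1}{m}\sum_{j=1}^{m}\frac{\lambda_j^{2d} I_z(\lambda_j)}{1 + h(d, \theta, \lambda_j)}. \qquad (12) $$

Thus, we propose to minimize (11) over the admissible set $D \times \Theta$,

$$ (\hat d, \hat\theta) = \arg\min_{(d,\theta) \in D \times \Theta} Q(d, \theta), $$

where $\Theta$ is a compact and convex set in $\mathbb{R}^{R+1}$, $R = R_y + R_w$, and $D = [d_1, d_2]$ with $0 < d_1 < d_2 < 1$.
We call this estimator the local polynomial Whittle with noise (LPWN) estimator.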
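The concentrated objective in (11) and (12) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, `theta_y` holds the coefficients for $r = 1, \dots, R_y$, `theta_w` holds those for $r = 0, \dots, R_w$, and returning infinity when $1 + h \le 0$ is a convenience choice for use inside a numerical optimizer.

```python
import math


def h_poly(d, theta_y, theta_w, lam):
    """h(d, theta, lambda) = h_y(theta_y, lambda) + lambda^(2d) * h_w(theta_w, lambda)."""
    h_y = sum(th * lam ** (2 * r) for r, th in enumerate(theta_y, start=1))
    h_w = sum(th * lam ** (2 * r) for r, th in enumerate(theta_w, start=0))
    return h_y + lam ** (2.0 * d) * h_w


def lpwn_objective(d, theta_y, theta_w, freqs, perio):
    """Concentrated LPWN objective Q(d, theta) built from (11) and (12)."""
    m = len(freqs)
    weights = [1.0 + h_poly(d, theta_y, theta_w, lam) for lam in freqs]
    if min(weights) <= 0.0:
        return math.inf  # outside the region where the logarithm is defined
    g_hat = sum(lam ** (2.0 * d) * i_j / w
                for lam, i_j, w in zip(freqs, perio, weights)) / m
    log_term = sum(math.log(lam ** (-2.0 * d) * w)
                   for lam, w in zip(freqs, weights)) / m
    return math.log(g_hat) + log_term


# Idealized check: a "periodogram" that equals g(lambda) with G = 1, R_y = R_w = 0.
n, m, d0, theta0 = 512, 40, 0.4, 5.0
freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
perio = [lam ** (-2.0 * d0) * (1.0 + theta0 * lam ** (2.0 * d0)) for lam in freqs]
q_true = lpwn_objective(d0, [], [theta0], freqs, perio)
q_other = lpwn_objective(0.2, [0.5], [1.0], freqs, perio)
print(q_true <= q_other)
```

With empty `theta_y` and `theta_w` the function reduces to the concentrated local Whittle objective. In practice `lpwn_objective` would be passed to a multivariate optimizer over $D \times \Theta$; the choice of optimizer is left open here.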
³ In the nonstationary case, $\{\sum_{s=1}^{t} x_s\}$ is a type I fractional process in the terminology of Marinucci & Robinson
(1999).

⁴ Note that $\phi_y(\lambda)$ and $\phi_w(\lambda)$ are symmetric around $\lambda = 0$ and are therefore approximated by even polynomials.
Note that $h(d, \theta, \lambda) = 0$ is the standard local Whittle specification in (6), which does not explicitly
account for the perturbation. For $R_y = R_w = 0$ we get $h(d, \theta, \lambda) = \lambda^{2d}\theta_{w,0}$, where $\phi_y(\lambda)$ and $\phi_w(\lambda)$ in (4) are
both modeled locally by constants. This is the local Whittle with noise (LWN) estimator of Hurvich
& Ray (2003) and Hurvich et al. (2005) (parameterization (P1)). Thus, our model parameterization
includes the standard LW estimator and the LWN estimator as special cases. Furthermore, the model
with $R_w = 0$, where the noise is modeled by a constant near the zero frequency, is analyzed empirically
and in simulations by Frederiksen & Nielsen (2008), using the asymptotic theory provided in this paper.
3 Asymptotic properties
In this section we first introduce the assumptions needed to establish consistency and asymptotic
normality of the proposed estimator for the perturbed fractional process, and then present
the main results in two theorems. In the following, true values of the parameters are denoted by
subscript zero and $\lfloor x \rfloor$ denotes the integer part of a real number $x$. We also define a function $\phi(\lambda)$ to be
smooth of order $s$ at $\lambda = 0$ if, in a neighborhood of $\lambda = 0$, $\phi(\lambda)$ is $\lfloor s \rfloor$ times continuously differentiable
with $\lfloor s \rfloor$-th derivative, $\phi^{(\lfloor s \rfloor)}$, satisfying $|\phi^{(\lfloor s \rfloor)}(\lambda) - \phi^{(\lfloor s \rfloor)}(0)| \le C|\lambda|^{s - \lfloor s \rfloor}$ for some constant $C < \infty$.
To simplify the presentation, we list only one set of assumptions even though these could be relaxed
somewhat for the consistency proof, see e.g. Hurvich et al. (2005).
A1 The noise process $\{w_t\}$ is independent of the signal process $\{y_t\}$.

A2 The spectral density of $z_t$ is $f_z(\lambda) = \lambda^{-2d_0} G_0 \frac{\phi_y(\lambda)}{\phi_y(0)} + \phi_w(\lambda)$, where $\phi_y(\lambda)$ and $\phi_w(\lambda)$ are real,
even, positive, continuous functions on $[-\pi, \pi)$ and $d_0 \in D = [d_1, d_2]$ with $0 < d_1 < d_2 < 1$.

A3 The functions $\phi_y(\lambda)$ and $\phi_w(\lambda)$ are smooth of orders $s_y$ and $s_w$ at $\lambda = 0$, where $s_y > 2R_y$,
$s_w > 2R_w$, and $s_y, s_w \ge 1$.
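Assumption A1 is the independence assumption used above to write the spectral density of $z_t$ as the
sum of the (pseudo) spectral densities of $y_t$ and $w_t$. Assumption A3 is a smoothness condition on the
functions $\phi_y(\lambda)$ and $\phi_w(\lambda)$ similar to that applied by Andrews & Sun (2004). Note that Assumption
A3 holds for all $s_y < \infty$ when, e.g., $y_t$ is a finite order ARFIMA process, and for all $s_w < \infty$ when,
e.g., $w_t$ is a finite order ARMA process. Under Assumption A3 we establish the following Taylor series
expansions of $\phi_y(\lambda)$ and $\phi_w(\lambda)$ around $\lambda = 0$ (recall that odd order derivatives of even functions are
zero at frequency zero),

$$ \frac{\phi_y(\lambda)}{\phi_y(0)} = 1 + \sum_{r=1}^{\lfloor s_y/2 \rfloor}\theta_{y,r}\lambda^{2r} + O(\lambda^{s_y}) = 1 + h_y(\theta_y, \lambda) + O(\lambda^{\min\{s_y, 2+2R_y\}}) \quad \text{as } \lambda \to 0, $$

and

$$ \frac{\phi_w(\lambda)}{\phi_y(0)} = \frac{\phi_w(0)}{\phi_y(0)} + \sum_{r=1}^{\lfloor s_w/2 \rfloor}\theta_{w,r}\lambda^{2r} + O(\lambda^{s_w}) = h_w(\theta_w, \lambda) + O(\lambda^{\min\{s_w, 2+2R_w\}}) \quad \text{as } \lambda \to 0, $$

where $\theta_{y,r} = \frac{1}{(2r)!\,\phi_y(0)}\left.\frac{\partial^{2r}}{\partial\lambda^{2r}}\phi_y(\lambda)\right|_{\lambda=0}$ and $\theta_{w,r} = \frac{1}{(2r)!\,\phi_y(0)}\left.\frac{\partial^{2r}}{\partial\lambda^{2r}}\phi_w(\lambda)\right|_{\lambda=0}$. Hence, the approximation
(10) to (9) is

$$ \log\left(f_z(\lambda)/g(\lambda)\right) = \log\left(\frac{\phi_y(\lambda)}{\phi_y(0)} + \lambda^{2d}\frac{\phi_w(\lambda)}{\phi_y(0)}\right) - \log\left(1 + h(d, \theta, \lambda)\right) $$
$$ = \log\left(1 + \frac{O(\lambda^{\min\{s_y, 2+2R_y\}}) + \lambda^{2d} O(\lambda^{\min\{s_w, 2+2R_w\}})}{1 + h(d, \theta, \lambda)}\right) \quad \text{as } \lambda \to 0, $$

$$ \frac{f_z(\lambda)}{g(\lambda)} = 1 + O(\lambda^{\min\{s_y, 2+2R_y\}}) + \lambda^{2d} O(\lambda^{\min\{s_w, 2+2R_w\}}) \quad \text{as } \lambda \to 0, \qquad (13) $$

and the true values of $G$ and $\theta$ are $G_0 = \phi_y(0)$ and $\theta_0 = (\theta_{0,1}, \ldots, \theta_{0,R+1})'$, where

$$ \theta_{0,r} = \frac{1}{(2r)!\,\phi_y(0)}\left.\frac{\partial^{2r}}{\partial\lambda^{2r}}\phi_y(\lambda)\right|_{\lambda=0}, \quad r = 1, \ldots, R_y, $$

$$ \theta_{0,R_y+r+1} = \frac{1}{(2r)!\,\phi_y(0)}\left.\frac{\partial^{2r}}{\partial\lambda^{2r}}\phi_w(\lambda)\right|_{\lambda=0}, \quad r = 0, \ldots, R_w. $$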
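A4 (a) The signal $y_t$ has zero mean and admits an infinite order moving average representation
$y_t = \sum_{j=0}^{\infty}\alpha_j\varepsilon_{t-j}$ (stationary case) or $\Delta y_t = x_t = \sum_{j=0}^{\infty}\alpha_j\varepsilon_{t-j}$ (nonstationary case), where
$\sum_{j=0}^{\infty}\alpha_j^2 < \infty$ and $\varepsilon_t$ satisfies, for all $t$, $E(\varepsilon_t | F_{t-1}) = 0$, $E(\varepsilon_t^2 | F_{t-1}) = 1$, $E(\varepsilon_t^3 | F_{t-1}) = \mu_3 < \infty$,
and $E(\varepsilon_t^4 | F_{t-1}) = \mu_4 < \infty$ almost surely, where $F_{t-1}$ is the $\sigma$-field generated by $\{\varepsilon_s, s < t\}$.

(b) There exists a random variable $\varepsilon$ with $E(\varepsilon^2) < \infty$ such that for all $\eta > 0$ and some $K > 0$,
$P(|\varepsilon_t| > \eta) < K P(|\varepsilon| > \eta)$.

(c) In a neighborhood of the origin, $\frac{\partial}{\partial\lambda}\alpha(\lambda) = O(|\alpha(\lambda)|/\lambda)$ as $\lambda \to 0$, where $\alpha(\lambda) = \sum_{k=0}^{\infty}\alpha_k e^{ik\lambda}$.

A5 (a) The noise $w_t$ has zero mean and admits an infinite order moving average representation
$w_t = \sum_{j=0}^{\infty}\beta_j\xi_{t-j}$, where $\sum_{j=0}^{\infty}\beta_j^2 < \infty$ and $\xi_t$ satisfies, for all $t$, $E(\xi_t | F_{t-1}) = 0$, $E(\xi_t^2 | F_{t-1}) = 1$,
$E(\xi_t^3 | F_{t-1}) = \kappa_3 < \infty$, and $E(\xi_t^4 | F_{t-1}) = \kappa_4 < \infty$ almost surely, where $F_{t-1}$ is the $\sigma$-field
generated by $\{\xi_s, s < t\}$.

(b) There exists a random variable $\varepsilon$ with $E(\varepsilon^2) < \infty$ such that for all $\eta > 0$ and some $K > 0$,
$P(|\xi_t| > \eta) < K P(|\varepsilon| > \eta)$.

(c) In a neighborhood of the origin, $\frac{\partial}{\partial\lambda}\beta(\lambda) = O(|\beta(\lambda)|/\lambda)$ as $\lambda \to 0$, where $\beta(\lambda) = \sum_{k=0}^{\infty}\beta_k e^{ik\lambda}$.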
Since our estimator is a function of the periodogram at nonzero frequencies only, we assume without
loss of generality⁵ that the signal process $y_t$ has zero mean. Importantly, Assumptions A4 and A5
allow for non-Gaussian processes. Note that Assumptions A1-A4 plus the assumption that $w_t$ is white
noise with finite fourth moment imply the assumptions needed on $y_t$ and $w_t$ to prove consistency and
asymptotic normality (if, in addition, $d_2 < 3/4$) of the LWN estimator of Hurvich & Ray (2003). It
follows from Theorems 1 and 2 below that their results for the LWN estimator are also valid for our
more general assumptions on $w_t$ in Assumption A5.

⁵ In the nonstationary case the zero mean assumption implies that $z_t$ is free of linear trends, which does entail a loss
of generality in that case.

A6 $\Theta$ is a compact and convex subset of $\mathbb{R}^{R+1}$ and $\theta_0$ lies in the interior of $\Theta$.
We are now ready to prove consistency of our estimator. As mentioned above, some of our
assumptions could be relaxed somewhat to prove this theorem, but we have preferred to list only one
set of assumptions which will be used also for the proof of asymptotic normality below. The proofs of
both theorems are given in the appendix.
Theorem 1 If Assumptions A1-A6 hold and the bandwidth $m = m(n)$ is such that

$$ \frac{1}{m} + \frac{m}{n} \to 0, \qquad (14) $$

then $\hat d - d_0 = o_P((\log n)^{-5})$.
Note that the theorem proves consistency only for the estimator of the memory parameter (at
logarithmic rate). There is no proof of consistency for the estimators of the polynomial parameters in
$\theta$. The strategy of proof in Hurvich et al. (2005) would next require a separate proof of consistency of
the polynomial parameters. We follow instead the method of proof in Andrews & Sun (2004),
which does not require an intermediate result on the consistency of $\hat\theta$. Thus, we give next the joint
asymptotic normality of $\hat d$ and $\hat\theta$.
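Theorem 2 Let Assumptions A1-A6 hold with $d_0$ in the interior of $D = [d_1, d_2]$, $0 < d_1 < d_2 < 3/4$,
and suppose the bandwidth $m = m(n)$ is such that

$$ \frac{m^{1+4R_y}}{n^{4R_y}} + \frac{m^{1+4(d_0+R_w)}}{n^{4(d_0+R_w)}} \to \infty \quad \text{and} \quad \frac{m^{2\varphi_y+1}}{n^{2\varphi_y}} + \frac{m^{2\varphi_w+4d_0+1}}{n^{2\varphi_w+4d_0}} \to 0, \qquad (15) $$

where $\varphi_a = \min\{s_a, 2+2R_a\}$, $a = y, w$. Then $\hat d$ and $\hat\theta$ are both consistent and

$$ B_n \begin{pmatrix} \hat d - d_0 \\ \hat\theta - \theta_0 \end{pmatrix} \xrightarrow{d} N(0, \Omega^{-1}_{R_y,R_w}), \qquad \Omega_{R_y,R_w} = \begin{pmatrix} 4 & \mu'_{R_y} & \nu'_{R_w} \\ \mu_{R_y} & \Gamma_{R_y} & \Psi'_{R_w,R_y} \\ \nu_{R_w} & \Psi_{R_w,R_y} & \Psi_{R_w} \end{pmatrix}, $$

where $B_n = B_n(d_0)$ is the $(R+2) \times (R+2)$ deterministic diagonal matrix with diagonal elements

$$ (B_n)_{11} = \sqrt{m}, \qquad (B_n)_{k+1,k+1} = \sqrt{m}\,\lambda_m^{2k} \ \text{ for } k = 1, \ldots, R_y, $$
$$ \text{and} \quad (B_n)_{k+R_y+2,k+R_y+2} = \sqrt{m}\,\lambda_m^{2d_0+2k} \ \text{ for } k = 0, \ldots, R_w; $$

$\mu_{R_y}$ and $\nu_{R_w} = \nu_{R_w}(d_0)$ are the vectors

$$ (\mu_{R_y})_k = \frac{-4k}{(1+2k)^2} \ \text{ for } k = 1, \ldots, R_y \quad \text{and} \quad (\nu_{R_w})_{k+1} = \frac{-4(d_0+k)}{(1+2d_0+2k)^2} \ \text{ for } k = 0, \ldots, R_w; $$

$\Gamma_{R_y}$ and $\Psi_{R_w} = \Psi_{R_w}(d_0)$ are the $R_y \times R_y$ and $(R_w+1) \times (R_w+1)$ matrices

$$ (\Gamma_{R_y})_{ik} = \frac{4ik}{(1+2i+2k)(1+2i)(1+2k)} \ \text{ for } i, k = 1, \ldots, R_y, $$
$$ (\Psi_{R_w})_{i+1,k+1} = \frac{4(d_0+i)(d_0+k)}{(1+2i+2k+4d_0)(1+2i+2d_0)(1+2k+2d_0)} \ \text{ for } i, k = 0, \ldots, R_w; $$

and $\Psi_{R_w,R_y} = \Psi_{R_w,R_y}(d_0)$ is the $(R_w+1) \times R_y$ matrix

$$ (\Psi_{R_w,R_y})_{i+1,k} = \frac{4k(d_0+i)}{(1+2d_0+2k+2i)(1+2d_0+2i)(1+2k)} \ \text{ for } i = 0, \ldots, R_w; \ k = 1, \ldots, R_y. $$

If $R_y = R_w = 0$ define $\Omega_{0,0} = \begin{pmatrix} 4 & \nu'_0 \\ \nu_0 & \Psi_0 \end{pmatrix}$.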
First of all, we note that by setting $R_y = R_w = 0$ we obtain as a special case the results for the
LWN estimator of Hurvich & Ray (2003). Secondly, the leading $(R_y+1) \times (R_y+1)$ submatrix of
$\Omega_{R_y,R_w}$ is the same as that obtained by Andrews & Sun (2004). Third, we note that the asymptotic
variance of $\sqrt{m}(\hat d - d_0)$ is free of the polynomial parameters $\theta_0$, but it depends on $d_0$. Moreover, the use
of the polynomials $h_y(\theta_y, \lambda)$ and $h_w(\theta_w, \lambda)$ increases the asymptotic variance of $\hat d$ by a multiplicative
constant compared to the LWN estimator of Hurvich & Ray (2003) (easily seen by use of the formula
for the inverse of a partitioned matrix). Andrews & Sun (2004) obtain a similar result for their local
polynomial Whittle (LPW) estimator in a non-volatility model.
The first condition in (15) guarantees that all the elements of the scaling matrix $B_n$ diverge as
$n \to \infty$, which is a minimal condition for consistency. The second condition restricts the expansion
rate of the bandwidth to control bias and ensure that the estimator uses only relevant information from
periodogram ordinates sufficiently near the zero frequency. Alternatively, we can view the bandwidth
conditions in (15) separately for the signal process and the noise process. In this way we would write
the conditions as

$$ \frac{m^{1+4R_y}}{n^{4R_y}} \to \infty, \quad \frac{m^{2\varphi_y+1}}{n^{2\varphi_y}} \to 0 \quad \text{and} \quad \frac{m^{1+4(d_0+R_w)}}{n^{4(d_0+R_w)}} \to \infty, \quad \frac{m^{2\varphi_w+4d_0+1}}{n^{2\varphi_w+4d_0}} \to 0. $$

It is now easy to see that the bandwidth conditions for both the signal process and the noise process
are always compatible because $s_y > 2R_y$ and $s_w > 2R_w$, respectively, by Assumption A3.
Note that the second condition in (15) implies that if $\phi_y(\lambda)$ and $\phi_w(\lambda)$ are infinitely smooth
near frequency zero then any $(R_y, R_w)$ can be chosen and the estimator is $n^{1/2-\delta}$-consistent for all
$\delta > 0$. Hence, in that case, the rate of convergence is arbitrarily close to the parametric rate. Thus,
condition (15) allows the bandwidth $m$ to be much larger than for the LWN estimator and the
standard LW estimator, which require (assuming $s_y \geq 2$, $s_w \geq 2$) that $m^5 n^{-4} \to 0$ and
$m^{4d_0+1} n^{-4d_0} \to 0$, respectively, see Hurvich & Ray (2003) and Arteche (2004). Therefore, Theorem 2 provides an
improvement in the rate of convergence relative to existing estimators of the memory parameter for
perturbed fractional processes. This comes at the cost of an increase in the asymptotic variance by a
multiplicative constant, but this is clearly more than offset by the faster rate of convergence, at least
asymptotically. For example, in the empirically relevant case of $d_0 = 0.4$, which is a typical value of $d_0$
for financial volatility series, the LW estimator is at most $n^{0.31}$-consistent and the LWN estimator is
at most $n^{0.4}$-consistent, whereas our estimator can be arbitrarily close to $n^{0.5}$-consistent if the spectral
density is sufficiently smooth near the zero frequency.
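The arithmetic behind these rates can be checked directly. The following is a minimal sketch, assuming only the bandwidth restrictions quoted above and that each estimator converges at rate $\sqrt{m}$; the function name `max_rate_exponents` is illustrative and not part of the original study.

```python
def max_rate_exponents(d0):
    """Upper bounds kappa such that an estimator is at most n**kappa-consistent.

    Assumption: the rate of convergence is sqrt(m), so kappa is half the
    largest exponent a with m = n**a still satisfying the bandwidth condition.
    """
    # LW: m**(4*d0 + 1) * n**(-4*d0) -> 0 caps m at n**(4*d0 / (4*d0 + 1)).
    lw = 2.0 * d0 / (4.0 * d0 + 1.0)
    # LWN: m**5 * n**(-4) -> 0 caps m at n**(4/5).
    lwn = 0.4
    # LPWN with sufficiently smooth spectra: arbitrarily close to 1/2.
    lpwn = 0.5
    return {"LW": lw, "LWN": lwn, "LPWN": lpwn}


rates = max_rate_exponents(0.4)
for name, kappa in rates.items():
    print(f"{name}: at most n^{kappa:.2f}")
```

For $d_0 = 0.4$ the LW bound is $0.8/2.6 \approx 0.31$, which matches the figure quoted in the text.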
Finally, as in Andrews & Sun (2004) we could calculate the asymptotic bias, which is of order
$O((m/n)^{\varphi_y} + (m/n)^{2d_0+\varphi_w})$, where $\varphi_a = \min\{s_a, 2 + 2R_a\}$, $a = y, w$, see the proof of Lemma 3(e)
in the appendix. This is in contrast to the orders $O((m/n)^2)$ and $O((m/n)^{2d_0})$ for the LWN and
LW estimators in Hurvich & Ray (2003) and Arteche (2004). Thus, as in Andrews & Sun (2004) for
the pure long memory case, the order of magnitude of the asymptotic bias is smaller when modeling
the (smooth) spectral density of the short-memory component locally by a polynomial instead of a
constant.
4 Finite sample comparison
In this section we present simulation results to examine the finite sample bias and root mean squared
error (RMSE) performance of our LPWN estimator. In particular, we want to examine the accuracy
with realistic sample sizes and short-run contamination in both signal and noise.
Our LPWN estimator is implemented with $(R_y, R_w)$ equal to $(1, 0)$, $(0, 1)$, and $(1, 1)$, denoted
LPWN($R_y$,$R_w$), and is compared with the LW, LPW, and LWN estimators. From Hurvich & Ray
(2003) we know that the LWN estimator is superior to the LW estimator in terms of bias and RMSE in
the context of the standard LMSV model. Furthermore, Hurvich et al. (2005) show that the polynomial
log-periodogram regression estimator of Andrews & Guggenberger (2003) suffers from severe bias in
the case of perturbed fractional processes and the LPW estimator is expected to perform similarly.
Therefore, to conserve space we only compare the LWN and LPWN estimators in the Monte Carlo
setup. The results for the LW and LPW estimators are available from the authors upon request.
4.1 Monte Carlo setup
We simulate model (2), i.e.
$$z_t = y_t + x_t + w_t, \tag{16}$$
where $\{y_t\}$ is the signal process and $\{x_t + w_t\}$ is the perturbation process. We model $\{x_t\}$ as an
ARMA process and $\{w_t\}$ as
$$w_t = \log u_t^2, \quad u_t \sim NID(0, 1). \tag{17}$$
Note that the variance of $w_t$ is $\sigma_w^2 = \pi^2/2$ regardless of the variance of $u_t$. The signal process $\{y_t\}$
and the ARMA part $\{x_t\}$ of the perturbation process follow different DGPs. For brevity, we consider
five different DGPs for the signal and ARMA perturbation processes. The general setup for $\{y_t\}$ and
$\{x_t\}$ is
$$(1 - \alpha_y L)(1 - L)^d y_t = (1 + \beta_y L)\eta_t, \quad \eta_t \sim NID(0, \sigma_\eta^2), \tag{18}$$
$$(1 - \alpha_x L) x_t = (1 + \beta_x L)\varepsilon_t, \quad \varepsilon_t \sim NID(0, 1), \tag{19}$$
with parameter configurations
Model I: $\alpha_y = \beta_y = \alpha_x = \beta_x = 0$;
Model II: $\alpha_y = \beta_y = \beta_x = 0$, $\alpha_x \in \{-0.8, 0.5\}$;
Model III: $\alpha_y = \beta_y = \alpha_x = 0$, $\beta_x \in \{-0.8, 0.8\}$;
Model IV: $\beta_y = \beta_x = 0$, $(\alpha_y, \alpha_x) \in \{(-0.8, 0.5), (-0.8, 0.8)\}$;
Model V: $\alpha_y = \alpha_x = 0$, $(\beta_y, \beta_x) \in \{(-0.8, -0.8), (-0.8, 0.8)\}$.
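The perturbation part of this design is simple to simulate. The sketch below is an illustration only (the study itself used Matlab): `alpha_x` and `beta_x` stand for the AR and MA coefficients in (19), while the function name, the burn-in length, and the seed are assumptions introduced here.

```python
import math
import random


def simulate_perturbation(n, alpha_x=0.0, beta_x=0.0, burn_in=500, seed=0):
    """Return (x, w): the ARMA(1,1) part of (19) and the log chi-square noise of (17)."""
    rng = random.Random(seed)
    # (1 - alpha_x L) x_t = (1 + beta_x L) eps_t with eps_t ~ N(0, 1).
    x, x_prev, eps_prev = [], 0.0, 0.0
    for t in range(n + burn_in):
        eps = rng.gauss(0.0, 1.0)
        x_t = alpha_x * x_prev + eps + beta_x * eps_prev
        if t >= burn_in:  # discard the start-up transient
            x.append(x_t)
        x_prev, eps_prev = x_t, eps
    # w_t = log(u_t^2) with u_t ~ N(0, 1); its variance is pi^2 / 2.
    w = [math.log(rng.gauss(0.0, 1.0) ** 2) for _ in range(n)]
    return x, w


x, w = simulate_perturbation(100_000, alpha_x=0.5)
mean_w = sum(w) / len(w)
var_w = sum((v - mean_w) ** 2 for v in w) / (len(w) - 1)
print(f"sample variance of w: {var_w:.3f}, theoretical: {math.pi ** 2 / 2:.3f}")
```

The sample variance of $w_t$ should be close to $\pi^2/2 \approx 4.93$ whatever seed is used.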
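We remark that in all the models the noise-to-signal ratio is given as
$$\mathrm{nsr} = \frac{f_x(0) + f_w(0)}{f_{(1-L)^d y_t}(0)} = \frac{\dfrac{(1+\beta_x)^2}{(1-\alpha_x)^2} + \dfrac{\pi^2}{2}}{\sigma_\eta^2 \dfrac{(1+\beta_y)^2}{(1-\alpha_y)^2}}. \tag{20}$$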
For each Monte Carlo DGP we generated 1000 artificial time series with sample sizes of 1024,
2048, 4096, and 8192.$^6$ For all estimators we set the bandwidth as $m = \lfloor n^a \rfloor$, where $a \in \{0.6, 0.7, 0.8\}$.
The parameter of interest, $d$, is set equal to either 0.4 or 0.6. For the noise-to-signal ratio, we choose
nsr $\in \{5, 10, 20\}$, and the variance $\sigma_\eta^2$ is set as a function of $\alpha_y$, $\alpha_x$, $\beta_y$, and $\beta_x$ such that the nsr
has the desired value. The values of $d$, nsr, $(\alpha_y, \beta_y)$, $(\alpha_x, \beta_x)$, and the sample sizes are chosen to
reflect empirical findings on long memory in volatility (see the references in the introduction for some
examples). The chosen parameter values for the short-run contamination in the signal and the noise
are also inspired by the results from the empirical (parametric) analysis of the DJIA stocks in section
5 below.
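How the innovation variance follows from a target nsr can be sketched by inverting (20). In this illustration `alpha_*` and `beta_*` denote the AR and MA coefficients of (18)-(19); the function names are hypothetical.

```python
import math


def noise_to_signal(sigma_eta2, alpha_y=0.0, beta_y=0.0, alpha_x=0.0, beta_x=0.0):
    """Long-run noise-to-signal ratio as in (20)."""
    noise = (1.0 + beta_x) ** 2 / (1.0 - alpha_x) ** 2 + math.pi ** 2 / 2.0
    signal = sigma_eta2 * (1.0 + beta_y) ** 2 / (1.0 - alpha_y) ** 2
    return noise / signal


def sigma_eta2_for(nsr, alpha_y=0.0, beta_y=0.0, alpha_x=0.0, beta_x=0.0):
    """Invert (20): the signal innovation variance that yields the target nsr."""
    # nsr is inversely proportional to sigma_eta^2, so rescale the unit-variance ratio.
    return noise_to_signal(1.0, alpha_y, beta_y, alpha_x, beta_x) / nsr


# Model I (no short-run dynamics) with nsr = 10.
s2 = sigma_eta2_for(10.0)
print(f"sigma_eta^2 = {s2:.4f}")
```

For Model I the result is simply $(1 + \pi^2/2)/\mathrm{nsr}$, and substituting the value back into `noise_to_signal` returns the target ratio.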
The signal $\{y_t\}$ is generated by the circulant embedding method as described in Davies & Harte
(1987), i.e. the stationary type I fractionally integrated process in the terminology of Marinucci &
Robinson (1999), see also Beran (1994, pp. 215-217). To generate nonstationary series with $d \geq 1/2$,
we simulate the ARFIMA process with integration order $d - 1$ and cumulate the resulting series.
Numerical optimization was carried out in Matlab v7.2 using the BFGS and DFP optimization routines
and selecting the one with the best log-likelihood value. The initial values were set as follows. For
the LWN estimator we used the LW estimate, $\hat{d}_{LW}$, if it was in the interior of the admissible space of
$d$, i.e. $[0.01, 0.99]$, cf. Assumption A2. Otherwise, $d$ was set equal to 0.1. As starting value for the
LPWN estimators we used the LWN estimate if it was in the admissible interval, otherwise $d$ was set
equal to 0.1.$^7$ As initial values for the polynomial parameters we used 1 for all estimators.
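The simulation step can be sketched as follows. This is not the Matlab code used in the study; it is a minimal NumPy illustration of circulant embedding for an ARFIMA$(0, d, 0)$ process with unit innovation variance, using the standard autocovariance recursion for that process. The function names are assumptions made for this example.

```python
import math

import numpy as np


def arfima0d0_autocov(d, n_lags):
    """Autocovariances gamma(0..n_lags) of ARFIMA(0, d, 0) for -1/2 < d < 1/2."""
    gamma = np.empty(n_lags + 1)
    gamma[0] = math.gamma(1.0 - 2.0 * d) / math.gamma(1.0 - d) ** 2
    for k in range(1, n_lags + 1):
        gamma[k] = gamma[k - 1] * (k - 1.0 + d) / (k - d)
    return gamma


def simulate_arfima0d0(n, d, rng):
    """Draw n observations by circulant embedding of the autocovariances."""
    gamma = arfima0d0_autocov(d, n)
    # First row of the 2n x 2n circulant matrix that embeds the covariance matrix.
    row = np.concatenate([gamma, gamma[-2:0:-1]])
    eig = np.fft.fft(row).real
    if eig.min() < -1e-10:
        raise ValueError("circulant embedding is not nonnegative definite")
    eig = np.clip(eig, 0.0, None)
    size = row.size
    z = rng.standard_normal(size) + 1j * rng.standard_normal(size)
    y = np.fft.fft(np.sqrt(eig / size) * z)
    return y.real[:n]  # the real part has exactly the target autocovariances


def simulate_fractional(n, d, rng):
    """Stationary draw for d < 1/2; otherwise cumulate a draw with order d - 1."""
    if d < 0.5:
        return simulate_arfima0d0(n, d, rng)
    return np.cumsum(simulate_arfima0d0(n, d - 1.0, rng))


rng = np.random.default_rng(0)
y_stationary = simulate_fractional(1024, 0.4, rng)
y_nonstationary = simulate_fractional(1024, 0.6, rng)
```

Handling $d \geq 1/2$ by cumulating a series of order $d - 1$ follows the description in the text; everything else in the block is a generic textbook construction.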
To conserve space we present only a subset of the results. The left-out results ($d = 0.6$, $n = 1024$,
and $m = \lfloor n^{0.6} \rfloor$) are qualitatively very similar to the ones presented, and are available upon request.
4.2 Monte Carlo results
Tables 1-9 display the results of the simulation study and show how the two different sources of bias,
i.e. the additive noise term and the contamination from the short-memory dynamics in both the signal
and the noise, affect the estimators.
[Table 1 about here]
In the case where there is no contamination by short-run dynamics in the signal or noise, i.e. Model
I with results displayed in Table 1, the bias is small for all estimators. The theoretical inflation of the
variances from the function $h$ is also noticeable in the RMSEs. Additionally, the RMSE decreases as either
the sample size or the bandwidth increases. The only case with any noticeable bias is for the LPWN(1,1)
estimator with nsr = 20, smallest sample size, and highest bandwidth.

$^6$ The number of observations is chosen as a power of two in order to use the fast Fourier transform in calculating the
periodogram. This speeds up the estimations considerably compared to evaluating the discrete Fourier transform directly.
$^7$ We tried different starting values for $d$ in these cases and the results were indistinguishable.
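As an illustration of the point made in footnote 6, the periodogram at the first $m$ Fourier frequencies can be obtained from a single FFT. The sketch below uses NumPy and assumes the common normalization $I_z(\lambda_j) = (2\pi n)^{-1} |\sum_t z_t e^{i t \lambda_j}|^2$ with $\lambda_j = 2\pi j/n$; the function name is hypothetical.

```python
import numpy as np


def periodogram(z, m):
    """Return (lambda_j, I_z(lambda_j)) for j = 1, ..., m."""
    z = np.asarray(z, dtype=float)
    n = z.size
    dft = np.fft.fft(z)  # O(n log n); fastest when n is a power of two
    freqs = 2.0 * np.pi * np.arange(1, m + 1) / n
    ordinates = np.abs(dft[1 : m + 1]) ** 2 / (2.0 * np.pi * n)
    return freqs, ordinates


# A pure cosine at Fourier frequency j = 5 puts all its mass on that ordinate.
n = 64
t = np.arange(n)
freqs, ordinates = periodogram(np.cos(2.0 * np.pi * 5 * t / n), m=16)
peak = int(np.argmax(ordinates)) + 1
```

Because frequency zero is excluded, the ordinates do not depend on the sample mean of the series.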
[Tables 2 and 3 about here]
In Tables 2 and 3 we consider Model II, i.e. the signal is an ARFIMA$(0, d, 0)$ process and the noise
is an ARMA process with coefficients $(\alpha_x, \beta_x) = (0.5, 0)$ and $(\alpha_x, \beta_x) = (-0.8, 0)$, respectively. Here
we would presume that the LPWN(0,1) estimator is the better choice. We clearly see that we are
able to obtain a considerable reduction in bias relative to the LWN estimator, especially for the positive
AR root case in Table 2. In that case we find that all three LPWN estimators outperform the LWN
estimator in terms of bias, and for the highest bandwidth choice, the LPWN(0,1) estimator is often
also superior in terms of RMSE. In the model with a negative AR root in Table 3 the results are very
similar to those in Table 1.
[Tables 4 and 5 about here]
We consider next Model III, i.e. where there is MA contamination in the noise, with results
presented in Tables 4 and 5. The results for this model are similar to those in Tables 1 and 3. That
is, for this model there is only little bias in the LWN estimator and no bias in the LPWN estimators.
For the highest bandwidth choice, LWN and LPWN have similar RMSE.
[Tables 6 and 7 about here]
Tables 6 and 7 contain results for Model IV, where $(\alpha_y, \beta_y) = (-0.8, 0)$, $(\alpha_x, \beta_x) = (0.5, 0)$ and
$(\alpha_y, \beta_y) = (-0.8, 0)$, $(\alpha_x, \beta_x) = (-0.8, 0)$, respectively. In the case of Table 6 the LWN estimator
suffers from very high bias and the LPWN estimators are able to reduce this bias considerably. In
particular, the LPWN(1,1) estimator is nearly unbiased in most cases. For the high bandwidth the
RMSEs are similar for all estimators. In Table 7, where the contamination is by a negative root, the
performance of the LWN estimator is similar to that of the LPWN estimators.
[Tables 8 and 9 about here]
Results for Model V where $(\alpha_y, \beta_y) = (0, -0.8)$, $(\alpha_x, \beta_x) = (0, 0.8)$ and $(\alpha_y, \beta_y) = (0, -0.8)$, $(\alpha_x, \beta_x) =
(0, -0.8)$ are shown in Tables 8 and 9, respectively.$^8$ The LWN estimator suffers from very severe bias
in Model V, and consequently its RMSE is also higher than for the previous models. On the other
hand, the LPWN estimators have relatively low biases, and in particular the LPWN(1,1) estimator
appears essentially unbiased. When compared in terms of RMSE the LPWN estimators are superior in
both tables as well. Thus, we have a considerable reduction in bias for all LPWN estimators compared
to the LWN estimator, and we also have quite a remarkable reduction in RMSE.
To sum up, the Monte Carlo study shows the usefulness of estimators that explicitly take the short-run
dynamics in the perturbation into account, i.e. the LPWN estimators where $(R_y, R_w) = (0, 1)$
and $(R_y, R_w) = (1, 1)$, although the LPWN estimator with $(R_y, R_w) = (1, 0)$ also performs well. All
three estimators generally have much smaller biases than the LWN estimator and are fairly insensitive
to the persistence in the perturbation and to the contamination from short-memory dynamics in the
signal.

$^8$ In a few cases for the LWN estimator (marked with asterisks) we had convergence problems due to boundary issues
resulting in a markedly bimodal finite-sample distribution. In these cases we set the initial value for the polynomial
parameter to 10, which resolved the issue.
5 Long memory in DJIA stock volatility
This section analyzes the long memory in daily log-squared return series of the 30 DJIA stocks,
corrected for the effects of stock splits and dividends, from January 1, 1990 to March 31, 2008, for
a sample of $n = 4753$. To avoid the problem of taking the logarithm of zero we based the analysis on
adjusted log-squared returns using the method of Fuller (1996, pp. 495-496), i.e. we analyze
$$\log \tilde{r}_t^2 = \log\left(r_t^2 + \tau\right) - \frac{\tau}{r_t^2 + \tau},$$
where $\tau = \frac{0.02}{n} \sum_{t=1}^{n} r_t^2$. We estimate the long memory in $\log \tilde{r}_t^2$ using the proposed LPWN estimator.
We implement the estimator with $(R_y, R_w)$ equal to $(1, 0)$, $(0, 1)$, and $(1, 1)$, and with starting values
etc. as in the Monte Carlo study above. For comparison we also report the standard LW, LPW, and
LWN estimates. For all estimators we set the bandwidth as $m = \lfloor n^a \rfloor$, where $a \in \{0.6, 0.7, 0.8\}$.
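The adjustment of the log-squared returns can be sketched in a few lines. This is an illustration of the transformation stated above, not the code used for the empirical analysis; `fuller_log_squared` and its `scale` argument are names introduced here, with `scale = 0.02` as in the text.

```python
import math


def fuller_log_squared(returns, scale=0.02):
    """Adjusted log-squared returns: log(r^2 + tau) - tau / (r^2 + tau)."""
    n = len(returns)
    # tau is a small fraction of the sample mean of the squared returns.
    tau = scale * sum(r * r for r in returns) / n
    return [math.log(r * r + tau) - tau / (r * r + tau) for r in returns]


sample_returns = [0.0, 0.01, -0.02]
adjusted = fuller_log_squared(sample_returns)
```

A zero return maps to $\log \tau - 1$ instead of $-\infty$. The sketch assumes that at least one return is nonzero, since otherwise $\tau = 0$ and the logarithm is undefined.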
[Table 10 about here]
Table 10 presents the results for the LW, LPW, and LWN estimators. As expected from theory,
the LW and LPW estimators appear downward biased and are decreasing in the bandwidth. For the
LWN estimator the memory estimates of some of the stocks are in the stationary region, but for the
most part they are in the nonstationary region.
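For reference, the standard LW estimate is the minimizer of the local Whittle objective $R(d) = \log\bigl(m^{-1} \sum_{j=1}^{m} \lambda_j^{2d} I_z(\lambda_j)\bigr) - 2d\, m^{-1} \sum_{j=1}^{m} \log \lambda_j$ of Robinson (1995a). The sketch below minimizes it over the admissible interval $[0.01, 0.99]$ by golden-section search, which is valid because the objective is convex in $d$. This search routine is a stand-in chosen for simplicity and is not the BFGS/DFP procedure used in the study; the function names are assumptions.

```python
import math


def lw_objective(d, freqs, pgram):
    """Local Whittle objective evaluated at memory parameter d."""
    m = len(freqs)
    g_hat = sum(lam ** (2.0 * d) * i_z for lam, i_z in zip(freqs, pgram)) / m
    return math.log(g_hat) - 2.0 * d * sum(math.log(lam) for lam in freqs) / m


def lw_estimate(freqs, pgram, lower=0.01, upper=0.99, tol=1e-6):
    """Golden-section search for the minimizer of the objective on [lower, upper]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    while b - a > tol:
        c = b - ratio * (b - a)
        e = a + ratio * (b - a)
        if lw_objective(c, freqs, pgram) < lw_objective(e, freqs, pgram):
            b = e  # the minimizer lies in [a, e]
        else:
            a = c  # the minimizer lies in [c, b]
    return 0.5 * (a + b)


# Noise-free check: a periodogram exactly proportional to lambda^(-2 * 0.4).
n, m = 1024, 64
freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
pgram = [lam ** (-0.8) for lam in freqs]
d_hat = lw_estimate(freqs, pgram)
```

With a periodogram that is exactly proportional to $\lambda^{-2d_0}$ the minimizer is $d_0$ itself, which makes the routine easy to verify before applying it to data.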
[Table 11 about here]
In Table 11 we present the results for the three variants of the LPWN estimator, i.e. for $(R_y, R_w)$
equal to $(1, 0)$, $(0, 1)$, and $(1, 1)$. First of all, as expected from theory and the simulations above,
it is clear that this estimator does not suffer from the downward bias present in the LW and LPW
estimators. Second, we note that the three different implementations of the estimator agree with each
other for most of the stocks and bandwidth choices. Third, the LPWN estimates are of the same
order of magnitude as the LWN estimates, although a little higher on average.
To emphasize the importance of the polynomial approximation of the signal process $\{y_t\}$ and
the perturbation process $\{x_t + w_t\}$, we also fitted an extended parametric LMSV-ARFIMA$(1, d, 1)$
model, where the extension is that the noise is modeled by an ARMA process. That is, we model the
periodogram of $\log \tilde{r}_t^2$ using the Whittle likelihood framework of Fox & Taqqu (1986) and Breidt et al.
(1998), where the fitted model has spectral density
$$f_z(\lambda) = \frac{\sigma_\eta^2}{2\pi} \left(2 \sin\frac{\lambda}{2}\right)^{-2d} \frac{1 + 2\beta_y \cos\lambda + \beta_y^2}{1 - 2\alpha_y \cos\lambda + \alpha_y^2} + \frac{\sigma_\varepsilon^2}{2\pi} \frac{1 + 2\beta_x \cos\lambda + \beta_x^2}{1 - 2\alpha_x \cos\lambda + \alpha_x^2}. \tag{21}$$
In Table 12 the resulting estimates are reported, where we have removed insignificant ARMA terms
from both the signal and the noise.
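The fitted spectral density (21) is straightforward to evaluate numerically. The sketch below is an illustration under the notation used in this section, with `alpha_*` the AR coefficients and `beta_*` the MA coefficients; the function names are hypothetical.

```python
import math


def arma_power(lam, alpha, beta):
    """|1 + beta e^{i lam}|^2 / |1 - alpha e^{i lam}|^2 for an ARMA(1,1) filter."""
    numerator = 1.0 + 2.0 * beta * math.cos(lam) + beta ** 2
    denominator = 1.0 - 2.0 * alpha * math.cos(lam) + alpha ** 2
    return numerator / denominator


def f_z(lam, d, sigma_eta2, sigma_eps2,
        alpha_y=0.0, beta_y=0.0, alpha_x=0.0, beta_x=0.0):
    """Spectral density (21) of the extended LMSV-ARFIMA(1, d, 1) model, 0 < lam <= pi."""
    signal = (sigma_eta2 / (2.0 * math.pi)
              * (2.0 * math.sin(lam / 2.0)) ** (-2.0 * d)
              * arma_power(lam, alpha_y, beta_y))
    noise = sigma_eps2 / (2.0 * math.pi) * arma_power(lam, alpha_x, beta_x)
    return signal + noise


value_at_pi = f_z(math.pi, d=0.4, sigma_eta2=1.0, sigma_eps2=2.0)
```

In a Whittle likelihood this function would be evaluated at the Fourier frequencies and compared with the periodogram ordinates; the optimization itself is not shown here.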
[Insert Table 12 about here]
The estimated values of $d$ from the parametric results are in line with those from the LWN and
LPWN estimators in Tables 10 and 11. Furthermore, there are significant (at the 10% level) short-run
dynamics in the signal (19 out of 30 cases), in the noise (16 out of 30 cases), and in both the signal
and noise (13 out of 30 cases). The estimated (long-run) nsr's can be calculated from the parameter
estimates as in (20), and are for most of the stocks in the vicinity of 10-30, although there are
cases where the nsr is very high because $\sigma_\eta^2$ is very small and insignificant. Taking the high nsr's
and significant short-run dynamics in both the signal and the noise into consideration stresses the
importance of the LPWN estimators.
6 Concluding remarks
In this paper we have proposed a semiparametric local polynomial Whittle with noise estimator of
the degree of long memory, $d$, in financial volatility time series perturbed by dynamic short-run noise.
The estimator allows the spectrum of the perturbation and that of the short-memory component of
the signal to be modeled as finite even polynomials, instead of constants near the zero frequency. This
is shown to yield a bias reduction depending on the smoothness of the spectra. However, including
the polynomials inflates the asymptotic variance of $\hat{d}$ by a multiplicative constant which depends on
the true long memory parameter, $d$.
We have shown that the estimator is consistent for $d \in (0, 1)$, asymptotically normal for $d \in
(0, 3/4)$, and if the spectral density is sufficiently smooth near frequency zero the rate of convergence
becomes arbitrarily close to the parametric rate, $\sqrt{n}$.
A Monte Carlo study revealed that the proposed local polynomial Whittle with noise estimator
is able to achieve considerable bias reductions in practice compared to standard (e.g., local Whittle
with noise) estimators, especially in cases with short-run dynamics in both the signal and noise
components. In an empirical investigation of the 30 DJIA stocks the local polynomial Whittle with noise
estimator indicated stronger persistence in volatility than standard estimators, and for most of the
stocks produced estimates of $d$ in the nonstationary region.
Appendix A: Proof of Theorem 1
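This proof follows the proofs of Theorem 3.1 and Lemma C.2 of Hurvich et al. (2005). As in the proofs
of Theorem 1 of Robinson (1995a) and Theorem 3.1 of Hurvich et al. (2005), to show consistency of $\hat{d}$
we need to separately prove that $\lim_{n \to \infty} P(\hat{d} \in D_1) = 0$ and that $(\hat{d} - d_0) 1(\hat{d} \in D_2) \xrightarrow{P} 0$, where $1(A)$
is the indicator function of the set $A$, $D_1 = (-\infty, d_0 - 1/2 + \epsilon) \cap D$, $D_2 = [d_0 - 1/2 + \epsilon, +\infty) \cap D$, and
$\epsilon < 1/4$ is a positive real number to be set later.
Let $\alpha_k(d, \theta) = \frac{1 + h_k(d_0, \theta_0)}{1 + h_k(d, \theta)}$. Then the proof that $(\hat{d} - d_0) 1(\hat{d} \in D_2) \xrightarrow{P} 0$ follows as in Hurvich et al.
(2005, pp. 1303-1305) by showing that
$$Z_m = \sum_{k=1}^{m} \frac{k^{2(d-d_0)} \alpha_k(d, \theta)}{\sum_{j=1}^{m} j^{2(d-d_0)} \alpha_j(d, \theta)} \left( \frac{I_z(\lambda_k)}{f_z(\lambda_k)} - 1 \right) = o_P(1) \tag{22}$$
uniformly on $(d, \theta) \in D_2 \times \Theta$ and that
$$R_m(d, \theta) = \log\left( 1 + \frac{\sum_{k=1}^{m} k^{2(d-d_0)} (\alpha_k(d, \theta) - 1)}{\sum_{j=1}^{m} j^{2(d-d_0)}} \right) - \frac{1}{m} \sum_{k=1}^{m} \log\left(1 + (\alpha_k(d, \theta) - 1)\right) = o(1) \tag{23}$$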
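uniformly on $(d, \theta) \in D \times \Theta$.
Note that there exists a constant $C > 0$ such that
$$\sup_{(d,\theta) \in D \times \Theta} \sup_{k=1,\ldots,m} |\alpha_k(d, \theta) - 1| = \sup_{(d,\theta) \in D \times \Theta} \sup_{k=1,\ldots,m} \left| \frac{h_k(d_0, \theta_0) - h_k(d, \theta)}{1 + h_k(d, \theta)} \right| \leq C (m/n)^{2d_1},$$
since $\Theta$ is compact and $d \geq d_1 > 0$, see Lemma 4. Now we use that $\log(1 + x) = x + O(x^2)$ as $x \to 0$
to obtain
$$\sup_{(d,\theta) \in D \times \Theta} |R_m(d, \theta)| \leq C \sup_{(d,\theta) \in D \times \Theta} \sup_{k=1,\ldots,m} |\alpha_k(d, \theta) - 1| \leq C (m/n)^{2d_1} = o(1).$$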
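The step from the logarithmic expansion to the last bound can be spelled out as follows. This is only a sketch of the intermediate reasoning: $\delta_k$ denotes the deviation of the ratio of $1 + h_k(d_0, \theta_0)$ to $1 + h_k(d, \theta)$ from one, $\bar{\delta}$ is its supremum over $k$, and $\omega_k$ are the normalized weights; all three symbols are introduced here purely for illustration.

```latex
\begin{align*}
\omega_k &= \frac{k^{2(d-d_0)}}{\sum_{j=1}^{m} j^{2(d-d_0)}} \geq 0,
  \qquad \sum_{k=1}^{m} \omega_k = 1, \\
\Bigl| \log\Bigl( 1 + \sum_{k=1}^{m} \omega_k \delta_k \Bigr) \Bigr|
  &\leq \Bigl| \sum_{k=1}^{m} \omega_k \delta_k \Bigr| + O(\bar{\delta}^2)
  \leq \bar{\delta} + O(\bar{\delta}^2), \\
\Bigl| \frac{1}{m} \sum_{k=1}^{m} \log(1 + \delta_k) \Bigr|
  &\leq \frac{1}{m} \sum_{k=1}^{m} |\delta_k| + O(\bar{\delta}^2)
  \leq \bar{\delta} + O(\bar{\delta}^2), \\
|R_m(d, \theta)| &\leq 2\bar{\delta} + O(\bar{\delta}^2)
  \leq C (m/n)^{2d_1}.
\end{align*}
```

Both logarithms are thus controlled by the same supremum, because a weighted average with nonnegative weights summing to one cannot exceed the largest term in absolute value.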
To show (22) we apply Proposition A.1 of Hurvich et al. (2005), which holds here since our
Assumptions A1-A6 imply their Assumptions (H1)-(H3) with the exception that we allow serially
correlated perturbation terms. It is, however, easily shown that replacing their Assumption (H2)
with our Assumption A5, their Proposition A.1 still holds. The only other change is that the term
$(k/n)^{\min(\beta, d_0)}$ in their eq. (F.15) should be replaced by $(k/n)^{\varphi_y} + (k/n)^{\varphi_w}$ due to the more accurate
approximation of $f_z(\lambda)$ offered by our function $g(\lambda)$ in (10) due to the included polynomials, see also
Lemma 5 below. Thus, according to their Proposition A.1, letting
$$c_k = \frac{k^{2(d-d_0)} \alpha_k(d, \theta)}{\sum_{j=1}^{m} j^{2(d-d_0)} \alpha_j(d, \theta)},$$
then for $\epsilon \in (0, 1)$, $K \in (0, \infty)$, and all $k \in \{1, \ldots, m-1\}$, we need to show that
$$|c_k - c_{k+1}| \leq K m^{-\epsilon} k^{\epsilon - 2}, \qquad |c_m| \leq K m^{-1}$$
uniformly on $(d, \theta) \in D_2 \times \Theta$, which implies (22).
Note that, uniformly on $(d, \theta) \in D_2 \times \Theta$, we have that $\sum_{j=1}^{m} j^{2(d-d_0)} \alpha_j(d, \theta) \geq C m^{2(d-d_0)+1}$ and
$$\begin{aligned}
&|k^{2(d-d_0)} \alpha_k(d, \theta) - (k+1)^{2(d-d_0)} \alpha_{k+1}(d, \theta)| \\
&\quad \leq |k^{2(d-d_0)} - (k+1)^{2(d-d_0)}| \alpha_k(d, \theta) + (k+1)^{2(d-d_0)} |\alpha_k(d, \theta) - \alpha_{k+1}(d, \theta)| \\
&\quad \leq (k+a)^{2(d-d_0)-1} C + (k+1)^{2(d-d_0)} C (\lambda_{k+1} - \lambda_k) \lambda_{k+a}^{2d-1}, \quad a \in [0, 1], \\
&\quad \leq C k^{2(d-d_0)-1},
\end{aligned}$$
where the first inequality is the triangle inequality and the second follows from the mean value theorem
and Lemma 4. It follows that
$$\sup_{(d,\theta) \in D_2 \times \Theta} \left| \frac{k^{2(d-d_0)} \alpha_k(d, \theta) - (k+1)^{2(d-d_0)} \alpha_{k+1}(d, \theta)}{\sum_{j=1}^{m} j^{2(d-d_0)} \alpha_j(d, \theta)} \right| \leq \sup_{(d,\theta) \in D_2 \times \Theta} C \left| \frac{k^{2(d-d_0)-1}}{m^{2(d-d_0)+1}} \right| \leq C k^{2\epsilon - 2} m^{-2\epsilon},$$
$$\sup_{(d,\theta) \in D_2 \times \Theta} \left| \frac{m^{2(d-d_0)} \alpha_m(d, \theta)}{\sum_{j=1}^{m} j^{2(d-d_0)} \alpha_j(d, \theta)} \right| \leq C m^{-1},$$
which proves (22).
The proof that $\lim_{n \to \infty} P(\hat{d} \in D_1) = 0$ follows exactly as in Hurvich et al. (2005, pp. 1305-1306)
since their Proposition A.1 holds in our case as well. Thus we have shown that $\hat{d} \xrightarrow{P} d_0$. To strengthen
this result to $\hat{d} - d_0 = o_P((\log n)^{-5})$ we use the proof of Lemma C.2 of Hurvich et al. (2005) without
change.
Appendix B: Proof of Theorem 2
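For the proof of Theorem 2 we need the score and Hessian (both multiplied by $m$) of (11):
$$S_n(d, \theta) = \hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) X_j,$$
$$H_n(d, \theta) = H_{1n}(d, \theta) + H_{2n}(d, \theta),$$
$$H_{1n}(d, \theta) = \hat{G}(d, \theta)^{-2} \left( \hat{G}(d, \theta) \sum_{j=1}^{m} \frac{G I_z(\lambda_j)}{g_j(d, \theta)} X_j X_j' - m \left( \frac{1}{m} \sum_{j=1}^{m} \frac{G I_z(\lambda_j)}{g_j(d, \theta)} X_j \right) \left( \frac{1}{m} \sum_{j=1}^{m} \frac{G I_z(\lambda_j)}{g_j(d, \theta)} X_j \right)' \right),$$
$$H_{2n}(d, \theta) = \hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) \frac{\partial X_j}{\partial (d, \theta')},$$
where
$$X_j = (X_{1j}, X_{2j}', X_{3j}')',$$
$$X_{1j} = 2 \log j - \frac{2 h_w(\theta_w, \lambda_j) \lambda_j^{2d} \log \lambda_j}{1 + h_j(d, \theta)},$$
$$X_{2j} = \left( \frac{-\lambda_j^{2}}{1 + h_j(d, \theta)}, \ldots, \frac{-\lambda_j^{2R_y}}{1 + h_j(d, \theta)} \right)',$$
$$X_{3j} = \left( \frac{-\lambda_j^{2d}}{1 + h_j(d, \theta)}, \ldots, \frac{-\lambda_j^{2d+2R_w}}{1 + h_j(d, \theta)} \right)',$$
$h_j(d, \theta) = h(d, \theta, \lambda_j)$, $g_j(d, \theta) = \lambda_j^{-2d} G (1 + h_j(d, \theta))$, and $D_m(\epsilon) = \{d \in D : (\log m)^5 |d - d_0| < \epsilon\}$
for $\epsilon > 0$. Note that $X_j$ is the vector of partial derivatives of $-\log g_j(d, \theta)$. The matrix $H_{2n}(d, \theta)$ is
symmetric and has $(i, l)$'th and $(l, i)$'th elements
$$\hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) (X_j)_i (X_j)_l, \quad i, l = 2, \ldots, R+2;$$
$$-\hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) (X_j)_i \frac{2 h_w(\theta_w, \lambda_j) \lambda_j^{2d} \log \lambda_j}{1 + h_j(d, \theta)}, \quad i = 2, \ldots, R_y + 1; \ l = 1;$$
$$\hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) (X_j)_i \, 2 \log \lambda_j \left( 1 - \frac{h_w(\theta_w, \lambda_j) \lambda_j^{2d}}{1 + h_j(d, \theta)} \right), \quad i = R_y + 2, \ldots, R+2; \ l = 1;$$
$$\hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) (X_j)_{R_y+2} \, 4 h_w(\theta_w, \lambda_j) (\log \lambda_j)^2 \left( 1 - \frac{h_w(\theta_w, \lambda_j) \lambda_j^{2d}}{1 + h_j(d, \theta)} \right), \quad i = l = 1.$$
We also define the matrix
$$J_n = \sum_{j=1}^{m} \left( X_j - \frac{1}{m} \sum_{k=1}^{m} X_k \right) \left( X_j - \frac{1}{m} \sum_{k=1}^{m} X_k \right)'.$$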
We next state a lemma adapted from Andrews & Sun (2004), henceforth abbreviated AS. The
proof is given in the next section.
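Lemma 3 Under the assumptions of Theorem 2 we have, as $n \to \infty$,
(a) $B_n^{-1} J_n B_n^{-1} \to \Omega_{R_y, R_w}$;
(b) $B_n^{-1} (H_{1n}(d_0, \theta_0) - J_n) B_n^{-1} = o_P(1)$ and $B_n^{-1} H_{2n}(d_0, \theta_0) B_n^{-1} = o_P(1)$;
(c) $\sup_{\theta \in \Theta} B_n^{-1} (H_{kn}(d_0, \theta) - H_{kn}(d_0, \theta_0)) B_n^{-1} = o_P(1)$, $k = 1, 2$;
(d) $\sup_{d \in D_m(\epsilon_n), \theta \in \Theta} B_n^{-1} (H_{kn}(d, \theta) - H_{kn}(d_0, \theta)) B_n^{-1} = o_P(1)$, $k = 1, 2$, for all sequences of
constants $\{\epsilon_n\}_{n \geq 1}$ for which $\epsilon_n = o(1)$;
(e) $B_n^{-1} S_n(d_0, \theta_0) \xrightarrow{d} N(0, \Omega_{R_y, R_w})$.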
Since the LPWN likelihood (11) is a continuous function on a compact set the LPWN estimator
exists. From Lemma 3 we know by Lemma 1 of AS that there exists a solution to the first order
conditions with probability tending to one, and that the solution satisfies the convergence result in
Theorem 2, see also Lemmas 1 and 2 of AS. If the (negative) likelihood function is strictly convex and
twice differentiable then the solution to the first order conditions is unique and minimizes (11) and
hence equals the LPWN estimator.
Thus, all that remains is to show that the Hessian is positive definite, which proves convexity. The
positive definiteness of $H_{1n}$ follows as in eq. (5.1) of AS. Compared to AS we have the additional
term $H_{2n}$. For $H_{2n}$ we know that $B_n^{-1} H_{2n}(d, \theta) B_n^{-1} = o_P(1)$ uniformly on $(d, \theta) \in D_m(\epsilon_n) \times \Theta$
by Lemma 3(b)-(d) and the triangle inequality. Since $\hat{d} \in D_m(\epsilon_n)$ with probability tending to one by
Theorem 1, this shows that $H_n$ is positive definite with probability tending to one, which concludes
the proof.
Appendix C: Proof of Lemma 3
We now turn to the proof of Lemma 3, which follows the method of proof for Lemma 2 of AS, with
modifications to allow $d \geq 1/2$ (following Velasco (1999)) and to accommodate the additive noise term
in the spectral density (see Lemma 5), and with an additional proof for each of (b), (c), and (d) of
negligibility of the term $H_{2n}(d, \theta)$.
C.1 Proof of (a)
Part (a) of the lemma follows by approximating sums by integrals, see, e.g., Lemma 2 of Andrews &
Guggenberger (2003).
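C.2 Proof of (b), first statement
The proof roughly follows that of Lemma 2(b) in AS, except now $b$ can be non-integer (equal to $d$ or
$2d$) in their eq. (A.6), which we write a little differently as
$$\tilde{G}_{a,b,c}(d, \theta) = m^{-1} \sum_{j=1}^{m} \frac{\lambda_j^{2d} I_z(\lambda_j)}{(1 + h_j(d, \theta))^{c+1}} \left( 2 \log j - \frac{2 h_w(\theta_w, \lambda_j) \lambda_j^{2d} \log \lambda_j}{1 + h_j(d, \theta)} \right)^{a} \left( \frac{j}{m} \right)^{2b},$$
$$\hat{G}_{a,b}(d, \theta) = m^{-1} \sum_{j=1}^{m} \lambda_j^{2d} I_z(\lambda_j) (2 \log j)^{a} \left( \frac{j}{m} \right)^{2b},$$
$$J_{a,b} = G m^{-1} \sum_{j=1}^{m} (2 \log j)^{a} \left( \frac{j}{m} \right)^{2b},$$
for $a, c = 0, 1, 2$ and $b = 0, 1, \ldots, 2R_y, d, d+1, \ldots, d+R_w+R_y, 2d, 2d+1, \ldots, 2d+2R_w$. The elements
of $B_n^{-1} H_{1n}(d, \theta) B_n^{-1}$ are (omitting the argument for brevity)
$(1, 1)$: $\tilde{G}_{0,0,0}^{-2} \bigl( \tilde{G}_{0,0,0} \tilde{G}_{2,0,0} - \tilde{G}_{1,0,0}^{2} \bigr)$;
$(1, 1+k)$: $\tilde{G}_{0,0,0}^{-2} \bigl( \tilde{G}_{0,0,0} \tilde{G}_{1,k,1} - \tilde{G}_{1,0,0} \tilde{G}_{0,k,1} \bigr)$ for $k = 1, \ldots, R_y$;
$(1, 2+R_y+k)$: $\tilde{G}_{0,0,0}^{-2} \bigl( \tilde{G}_{0,0,0} \tilde{G}_{1,k+d,1} - \tilde{G}_{1,0,0} \tilde{G}_{0,k+d,1} \bigr)$ for $k = 0, \ldots, R_w$;
$(1+i, 1+k)$: $\tilde{G}_{0,0,0}^{-2} \bigl( \tilde{G}_{0,0,0} \tilde{G}_{0,i+k,2} - \tilde{G}_{0,i,1} \tilde{G}_{0,k,1} \bigr)$ for $i, k = 1, \ldots, R_y$;
$(1+i, 2+R_y+k)$: $\tilde{G}_{0,0,0}^{-2} \bigl( \tilde{G}_{0,0,0} \tilde{G}_{0,k+i+d,2} - \tilde{G}_{0,i,1} \tilde{G}_{0,k+d,1} \bigr)$ for $i = 1, \ldots, R_y$; $k = 0, \ldots, R_w$;
$(2+R_y+i, 2+R_y+k)$: $\tilde{G}_{0,0,0}^{-2} \bigl( \tilde{G}_{0,0,0} \tilde{G}_{0,k+i+2d,2} - \tilde{G}_{0,i+d,1} \tilde{G}_{0,k+d,1} \bigr)$ for $i, k = 0, \ldots, R_w$;
and the corresponding elements of $B_n^{-1} J_n(d, \theta) B_n^{-1}$ are given by the same expressions with $\tilde{G}_{a,b,c}$
replaced by $J_{a,b}$. To prove the first statement of Lemma 3(b) it suffices to show that (since $b$ can take
values including $d$, we distinguish between $b$ and $b_0$)
$$\Delta_{a,b_0} = \left| \hat{G}_{a,b_0}(d_0, \theta_0) - J_{a,b_0} \right| = o_P((\log m)^{-2}), \tag{24}$$
$$\tilde{\Delta}_{a,b_0,c} = \left| \tilde{G}_{a,b_0,c}(d_0, \theta_0) - \hat{G}_{a,b_0}(d_0, \theta_0) \right| = o_P((\log m)^{-2}). \tag{25}$$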
In view of Lemma 5 below, the proof of (A.9) in AS pp. 598-599 works also for our eq. (24) where
we �nd that (�k;n(d) is de�ned in Lemma 5)
�a;b0 = OP
�(logm)am�1�m;n(d0) + (logm)
am'yn�'y + (logm)amd0+'wn�d0�'w
+ (logm)am2d0n�2d0 + (logm)a+1m2d0�1n�d0 + (logm)am�1=2�;
which is
OP
�(logm)a+2=3m�2=3 + (logm)am�1=2n�1=4 + (logm)a(m=n)min('y ;d0+'w;2d0)
+(logm)a+1m2d0�1n�d0 + (logm)am�1=2�
in the stationary case and
OP
�(logm)a+2=(5�4d0)m1=(5�4d0)�1 + (logm)a+1m2d0�2 + (logm)am(d0�1)=2n�1=2(log n)5=4
+(logm)a+1=2n�1=4md0�1 + (logm)a(m=n)min('y ;d0+'w;2d0) + (logm)a+1m2d0�1n�d0 + (logm)am�1=2�
in the nonstationary case. Since d0 < d2 < 3=4 and by (15), clearly �a;b0 = oP ((logm)�2) in both
cases.
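To prove (25) we write $\tilde{G}_{a,b_0,c}(d_0, \theta_0) - \hat{G}_{a,b_0}(d_0, \theta_0)$ as
$$\begin{aligned}
& m^{-1} \sum_{j=1}^{m} \lambda_j^{2d_0} I_z(\lambda_j) \left[ \frac{1}{(1 + h_j(d_0, \theta_0))^{c+1}} \left( 2 \log j - \frac{2 h_w(\theta_{w,0}, \lambda_j) \lambda_j^{2d_0} \log \lambda_j}{1 + h_j(d_0, \theta_0)} \right)^{a} - (2 \log j)^{a} \right] \left( \frac{j}{m} \right)^{2b_0} \\
&\quad = m^{-1} \sum_{j=1}^{m} \lambda_j^{2d_0} I_z(\lambda_j) \left[ \frac{1}{1 + O((j/n)^{2d_0})} \left( 2 \log j - \frac{O((j/n)^{2d_0} \log n)}{1 + O((j/n)^{2d_0})} \right)^{a} - (2 \log j)^{a} \right] \left( \frac{j}{m} \right)^{2b_0}
\end{aligned}$$
by Lemma 4(i). This proves (25) for $a = 0$ since
$$\begin{aligned}
\tilde{G}_{0,b_0,c}(d_0, \theta_0) - \hat{G}_{0,b_0}(d_0, \theta_0) &= m^{-1} \sum_{j=1}^{m} \lambda_j^{2d_0} I_z(\lambda_j) \left( \frac{1}{1 + O((j/n)^{2d_0})} - 1 \right) \left( \frac{j}{m} \right)^{2b_0} \\
&= O_P\left( (m/n)^{2d_0} \hat{G}_{0,b_0}(d_0, \theta_0) \right) = O_P\left( (m/n)^{2d_0} \right) = o_P((\log m)^{-2})
\end{aligned}$$
because $d_0$ belongs to the interior of the parameter space and is therefore bounded away from zero.
When $a \geq 1$ we apply the mean value theorem, i.e. $x^a = y^a + (x - y) a \bar{x}^{a-1}$ for $x \leq \bar{x} \leq y$, such that
$$\left( 2 \log j - \frac{2 h_w(\theta_{w,0}, \lambda_j) \lambda_j^{2d_0} \log \lambda_j}{1 + h_j(d_0, \theta_0)} \right)^{a} - (2 \log j)^{a} = a \frac{2 h_w(\theta_{w,0}, \lambda_j) \lambda_j^{2d_0} \log \lambda_j}{1 + h_j(d_0, \theta_0)} O((\log j)^{a-1})$$
uniformly in $j = 1, \ldots, m$. This implies that (25) is
$$\begin{aligned}
m^{-1} \sum_{j=1}^{m} \lambda_j^{2d_0} I_z(\lambda_j) \left[ a \, O((j/n)^{2d_0} \log n) \, O((\log j)^{a-1}) \right] \left( \frac{j}{m} \right)^{2b_0} &= O_P\left( (m/n)^{2d_0} (\log n) (\log m)^{a-1} \hat{G}_{0,b_0}(d_0, \theta_0) \right) \\
&= O_P\left( (m/n)^{2d_0} (\log n)^{a} \right) = o_P((\log m)^{-2}).
\end{aligned}$$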
C.3 Proof of (e)
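We now prove part (e) since it will be useful in the proof of the remaining statements. By (24) and
(25) with $a = b = c = 0$ we get that $\hat{G}(d_0, \theta_0) = G_0 (1 + o_P((\log m)^{-2}))$, so that, apart from smaller
order terms,
$$\begin{aligned}
B_n^{-1} S_n(d_0, \theta_0) &= m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_z(\lambda_j)}{g_j(d_0, \theta_0)} - \frac{1}{m} \sum_{k=1}^{m} \frac{I_z(\lambda_k)}{g_k(d_0, \theta_0)} \right) \tilde{X}_{0,j} \\
&= m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_z(\lambda_j)}{g_j(d_0, \theta_0)} - 1 \right) \left( \tilde{X}_{0,j} - \frac{1}{m} \sum_{k=1}^{m} \tilde{X}_{0,k} \right),
\end{aligned} \tag{26}$$
where
$$\tilde{X}_j = (X_{1,j}, \tilde{X}_{2,j}', \tilde{X}_{3,j}')',$$
$$\tilde{X}_{2,j} = \left( \frac{-(j/m)^{2}}{1 + h_j(d, \theta)}, \ldots, \frac{-(j/m)^{2R_y}}{1 + h_j(d, \theta)} \right)',$$
$$\tilde{X}_{3,j} = \left( \frac{-(j/m)^{2d}}{1 + h_j(d, \theta)}, \ldots, \frac{-(j/m)^{2d+2R_w}}{1 + h_j(d, \theta)} \right)',$$
and $\tilde{X}_{0,j}$ is $\tilde{X}_j$ evaluated at $(d_0, \theta_0)$.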
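As in AS p. 601 we write the right-hand side of (26) as $T_{1,n} + T_{2,n} + T_{3,n} + T_{4,n}$, where
$$T_{1,n} = m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_z(\lambda_j)}{g_j(d_0, \theta_0)} - 2\pi I_\varepsilon(\lambda_j) - E\left( \frac{I_z(\lambda_j)}{g_j(d_0, \theta_0)} - 2\pi I_\varepsilon(\lambda_j) \right) \right) \left( \tilde{X}_{0,j} - \frac{1}{m} \sum_{k=1}^{m} \tilde{X}_{0,k} \right),$$
$$T_{2,n} = m^{-1/2} \sum_{j=1}^{m} \left( \frac{E I_z(\lambda_j)}{f_z(\lambda_j)} - 1 \right) \frac{f_z(\lambda_j)}{g_j(d_0, \theta_0)} \left( \tilde{X}_{0,j} - \frac{1}{m} \sum_{k=1}^{m} \tilde{X}_{0,k} \right),$$
$$T_{3,n} = m^{-1/2} \sum_{j=1}^{m} \left( 2\pi I_\varepsilon(\lambda_j) - 1 \right) \left( \tilde{X}_{0,j} - \frac{1}{m} \sum_{k=1}^{m} \tilde{X}_{0,k} \right),$$
$$T_{4,n} = m^{-1/2} \sum_{j=1}^{m} \left( \frac{f_z(\lambda_j)}{g_j(d_0, \theta_0)} - 1 \right) \left( \tilde{X}_{0,j} - \frac{1}{m} \sum_{k=1}^{m} \tilde{X}_{0,k} \right).$$
Then we show that $T_{3,n} \xrightarrow{d} N(0, \Omega_{R_y, R_w})$ while $T_{i,n} = o_P(1)$ for $i = 1, 2, 4$.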
Clearly the proof for T3;n of AS works here as well. We just have to verify that
1
m
mXj=1
�2j ! �0Ry ;Rw�;
where
�j = �0(~X0;j �
1
m
mXk=1
~X0;k) and Ry ;Rw =
0B@ 4 �0Ry � 0Rw�Ry �Ry 0Ry ;Rw�Rw Rw;Ry Rw
1CA ;which follows from part (a) of the lemma.
To show the result for T1;n we use summation by parts:
T1;n = m�1=2m�1Xk=1
�~X0;k � ~X0;k+1
� kXj=1
�Iz (�j)
gj (d0;�0)� 2�I" (�j)� E
�Iz (�j)
gj (d0;�0)� 2�I" (�j)
��
+
~X0;m �
1
m
mXk=1
~X0;k
!m�1=2
mXj=1
�Iz (�j)
gj (d0;�0)� 2�I" (�j)� E
�Iz (�j)
gj (d0;�0)� 2�I" (�j)
��
= m�1=2m�1Xk=1
O(k�1)OP (�k;n(d0) + k'y+1=2n�'y + k1=2+2d0n�2d0)
+O(1)m�1=2OP (�m;n(d0) +m'y+1=2n�'y +m1=2+2d0n�2d0)
= OP (m�1=2(logm)�m;n(d0) + (m=n)
min('y ;2d0));
where �k;n(d) is de�ned in Lemma 5. The second equality above applies Lemma 5 and that ~X0;k �~X0;k+1 = O(k
�1) uniformly in k = 1; : : : ;m and ~X0;m � 1m
Pmk=1
~X0;k = O(1) (follows from approx-
imating sums by integrals, see also AS p. 602). Thus T1;n = OP ((logm)5=3m�1=6 + (logm)n�1=4 +
(m=n)min('y ;2d0)) in the stationary case and T1;n = OP ((logm)1+2=(5�4d0)m�(3�4d0)=(10�8d0)+(logm)2m2d0�3=2+
(logm)(log n)5=4n�1=2md0=2 + (logm)3=2n�1=4md0�1=2 + (m=n)min('y ;2d0)) in the nonstationary case.
Since d0 belongs to the interior of the parameter space it follows that T1;n = oP (1).
To prove the result for $T_{2,n}$ we use Robinson's (1995b) Theorem 2, i.e., that $E I_y(\lambda_j) / f_y(\lambda_j) =
1 + O(j^{-1} (\log j))$ uniformly in $j = 1, \ldots, m$ in the stationary case, as well as Velasco's (1999) Theorem
1, $E I_y(\lambda_j) / f_y(\lambda_j) = 1 + O(j^{2d_0 - 2} (\log j))$ uniformly in $j = 1, \ldots, m$ in the nonstationary case. Note
that, as in AS, the remainder terms are different from those of Robinson (1995b) and Velasco (1999)
because of the normalization by $f_y(\lambda_j)$ rather than by $G_0 \lambda_j^{-2d_0}$. Thus, as in the proof of Lemma 5
we can write
$$\begin{aligned}
\frac{E I_z(\lambda_j)}{f_z(\lambda_j)} - 1 &= \frac{f_y(\lambda_j) - f_z(\lambda_j)}{f_z(\lambda_j)} \left( \frac{E I_y(\lambda_j)}{f_y(\lambda_j)} - 1 \right) + \left( \frac{E I_y(\lambda_j)}{f_y(\lambda_j)} - 1 \right) \\
&\quad + 2 \frac{\sqrt{f_y(\lambda_j)}}{f_z(\lambda_j)} \frac{E \, \mathrm{Re}(I_{yw}(\lambda_j))}{\sqrt{f_y(\lambda_j)}} + \frac{E I_w(\lambda_j) + f_y(\lambda_j) - f_z(\lambda_j)}{f_z(\lambda_j)}.
\end{aligned}$$
Because $E I_w(\lambda_j) = f_w(\lambda_j) + O(j^{-1} (\log j))$ and $f_z(\lambda_j) - f_y(\lambda_j) = f_w(\lambda_j)$, the last term is
$O(j^{-1} (\log j) \lambda_j^{2d_0})$. By the same reasoning and by independence of $\{y_t\}$ and $\{w_t\}$, the second to last term is
$O_P(\lambda_j^{d_0} j^{-1} (\log j))$ in the stationary case and $O_P(\lambda_j^{d_0} j^{2d_0 - 2} (\log j))$ in the nonstationary case (see also the proof of Lemma
5 below and the second to last equation on p. 108 of Velasco (1999)). We thus obtain the bounds
$E I_z(\lambda_j) / f_z(\lambda_j) - 1 = O(j^{-1} (\log j))$ for the stationary case and $E I_z(\lambda_j) / f_z(\lambda_j) - 1 = O(j^{2d_0 - 2} (\log j))$
for the nonstationary case, for all $j = 1, \ldots, m$. We also have that $f_z(\lambda_j) / g_j(d_0, \theta_0) - 1 = O((j/n)^{\varphi_y} +
(j/n)^{2d_0 + \varphi_w})$ for all $j = 1, \ldots, m$ by (13). Therefore, in the stationary case, $T_{2,n}$ can be bounded similarly
to (A.24) of AS,
$$T_{2,n} = m^{-1/2} \sum_{j=1}^{m} O(j^{-1} (\log j)) O(1) O(\log m) = O\left( m^{-1/2} (\log m) \sum_{j=1}^{m} j^{-1} (\log j) \right) = O((\log m)^{3} m^{-1/2}),$$
using also that $|\tilde{X}_{0,j} - \frac{1}{m} \sum_{k=1}^{m} \tilde{X}_{0,k}| = O(\log m)$ uniformly in $j = 1, \ldots, m$. In the nonstationary case
we find in the same way that
$$T_{2,n} = m^{-1/2} \sum_{j=1}^{m} O(j^{2d_0 - 2} (\log j)) O(1) O(\log m) = O\left( m^{-1/2} (\log m) \sum_{j=1}^{m} j^{2d_0 - 2} (\log j) \right) = O((\log m)^{3} m^{2d_0 - 3/2}).$$
In both the stationary and nonstationary cases, $T_{2,n}$ is $o(1)$ since $d_0 < d_2 < 3/4$.
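The proof for $T_{4,n}$ follows from summation by parts and the approximation $f_z(\lambda_j) / g_j(d_0, \theta_0) - 1 =
O((j/n)^{\varphi_y} + (j/n)^{2d_0 + \varphi_w})$ for all $j = 1, \ldots, m$, which implies that
$$\begin{aligned}
T_{4,n} &= m^{-1/2} \sum_{k=1}^{m-1} \left( \tilde{X}_{0,k} - \tilde{X}_{0,k+1} \right) \sum_{j=1}^{k} \left( \frac{f_z(\lambda_j)}{g_j(d_0, \theta_0)} - 1 \right) + \left( \tilde{X}_{0,m} - \frac{1}{m} \sum_{k=1}^{m} \tilde{X}_{0,k} \right) m^{-1/2} \sum_{j=1}^{m} \left( \frac{f_z(\lambda_j)}{g_j(d_0, \theta_0)} - 1 \right) \\
&= m^{-1/2} \sum_{k=1}^{m-1} O(k^{-1}) \sum_{j=1}^{k} O((j/n)^{\varphi_y} + (j/n)^{2d_0 + \varphi_w}) + O(1) m^{-1/2} \sum_{j=1}^{m} O((j/n)^{\varphi_y} + (j/n)^{2d_0 + \varphi_w}) \\
&= O(m^{1/2 + \varphi_y} n^{-\varphi_y} + m^{1/2 + 2d_0 + \varphi_w} n^{-2d_0 - \varphi_w}).
\end{aligned}$$
Condition (15) shows that this is $o_P(1)$.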
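The summation-by-parts step used for $T_{1,n}$ and $T_{4,n}$ rests on the identity $\sum_{j=1}^{m} a_j (b_j - \bar{b}) = \sum_{k=1}^{m-1} (b_k - b_{k+1}) A_k + (b_m - \bar{b}) A_m$ with $A_k = \sum_{j \leq k} a_j$. The following sketch checks it numerically for scalar sequences; the function names and the test sequences are illustrative assumptions only.

```python
import math


def centered_sum_direct(a, b):
    """Compute sum_j a_j * (b_j - mean(b)) directly."""
    b_bar = sum(b) / len(b)
    return sum(a_j * (b_j - b_bar) for a_j, b_j in zip(a, b))


def centered_sum_by_parts(a, b):
    """Same quantity via summation by parts on the partial sums of a."""
    m = len(a)
    b_bar = sum(b) / m
    partial, total = 0.0, 0.0
    for k in range(m - 1):
        partial += a[k]                       # A_k = a_1 + ... + a_k
        total += (b[k] - b[k + 1]) * partial  # (b_k - b_{k+1}) * A_k
    partial += a[m - 1]                       # A_m
    return total + (b[m - 1] - b_bar) * partial


m = 50
a = [(j / 100.0) ** 0.8 for j in range(1, m + 1)]   # plays the role of f_z / g_j - 1
b = [2.0 * math.log(j) for j in range(1, m + 1)]    # plays the role of a regressor
direct = centered_sum_direct(a, b)
by_parts = centered_sum_by_parts(a, b)
```

The usefulness of the rewritten form in the proof is that the increments $b_k - b_{k+1}$ are $O(k^{-1})$, so bounds on the partial sums $A_k$ translate directly into bounds on the whole expression.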
C.4 Proof of (b), second statement
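To prove the second statement of Lemma 3(b) we have to show that
$$\frac{1}{m} \hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) (\tilde{X}_j)_i (\tilde{X}_j)_l, \quad i, l = 2, \ldots, R+2;$$
$$-\frac{1}{m} \hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) (\tilde{X}_j)_i \frac{2 h_w(\theta_w, \lambda_j) \lambda_j^{2d} (\log \lambda_j)}{1 + h_j(d, \theta)}, \quad i = 2, \ldots, R_y + 1;$$
$$\frac{1}{m} \hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) (\tilde{X}_j)_i \, 2 (\log \lambda_j) \left( 1 - \frac{h_w(\theta_w, \lambda_j) \lambda_j^{2d}}{1 + h_j(d, \theta)} \right), \quad i = R_y + 2, \ldots, R+2;$$
$$\frac{1}{m} \hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) (\tilde{X}_j)_{R_y+2} \, 4 h_w(\theta_w, \lambda_j) (\log \lambda_j)^2 \left( 1 - \frac{h_w(\theta_w, \lambda_j) \lambda_j^{2d}}{1 + h_j(d, \theta)} \right);$$
are all negligible when evaluated at $(d_0, \theta_0)$. Note that it suffices to prove the result for the generic
term
$$V_n(d, \theta) = \frac{1}{m} \hat{G}(d, \theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d, \theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d, \theta)} \right) (\tilde{X}_j)_{R_y+2} \, q_j(d, \theta), \tag{27}$$
where $q_j(d_0, \theta_0)$ depends on $j$ but is at most of order $O(\log n)$ and satisfies $q_{j+1}(d_0, \theta_0) - q_j(d_0, \theta_0) =
O(j^{-1})$ uniformly in $j = 1, \ldots, m$. Summation by parts on $V_n(d_0, \theta_0)$ yields
$$\begin{aligned}
V_n(d_0, \theta_0) &= \frac{1}{m} \hat{G}(d_0, \theta_0)^{-1} q_m(d_0, \theta_0) \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d_0, \theta_0)} - \frac{1}{m} \sum_{s=1}^{m} \frac{G I_z(\lambda_s)}{g_s(d_0, \theta_0)} \right) (\tilde{X}_{0,j})_{R_y+2} \\
&\quad + \frac{1}{m} \hat{G}(d_0, \theta_0)^{-1} \sum_{k=1}^{m-1} \left( q_k(d_0, \theta_0) - q_{k+1}(d_0, \theta_0) \right) \sum_{j=1}^{k} \left( \frac{G I_z(\lambda_j)}{g_j(d_0, \theta_0)} - \frac{1}{m} \sum_{s=1}^{m} \frac{G I_z(\lambda_s)}{g_s(d_0, \theta_0)} \right) (\tilde{X}_{0,j})_{R_y+2} \\
&= m^{-1} q_m(d_0, \theta_0) O_P(m^{1/2}) + m^{-1} \sum_{k=1}^{m-1} \left( q_k(d_0, \theta_0) - q_{k+1}(d_0, \theta_0) \right) O_P(k^{1/2}) \\
&= O_P\left( m^{-1/2} (\log n) + m^{-1/2} \right),
\end{aligned}$$
where the second equality follows from part (e) of the lemma.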
C.5 Proof of (c)
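First we prove the result for $H_{1n}$, where we need to show that
$$\sup_{\theta \in \Theta} \left| \tilde{G}_{a,b_0,c}(d_0, \theta) - \tilde{G}_{a,b_0,c}(d_0, \theta_0) \right| = o_P((\log m)^{-2})$$
for $a, c = 0, 1, 2$ and $b = 0, 1, \ldots, 2R_y, d, d+1, \ldots, d+R_w+R_y, 2d, 2d+1, \ldots, 2d+2R_w$. By the
triangle inequality it suffices to show that
$$\sup_{\theta \in \Theta} \left| \tilde{G}_{a,b_0,c}(d_0, \theta) - \hat{G}_{a,b_0}(d_0, \theta) \right| + \sup_{\theta \in \Theta} \left| \hat{G}_{a,b_0}(d_0, \theta) - \hat{G}_{a,b_0}(d_0, \theta_0) \right| + \tilde{\Delta}_{a,b_0,c} = o_P((\log m)^{-2}). \tag{28}$$
We showed in (25) that $\tilde{\Delta}_{a,b_0,c} = o_P((\log m)^{-2})$.
Following the proof of (25), the first term on the left-hand side of (28) is
$$\begin{aligned}
& \sup_{\theta \in \Theta} \left| m^{-1} \sum_{j=1}^{m} \lambda_j^{2d_0} I_z(\lambda_j) \left[ \frac{1}{(1 + h_j(d_0, \theta))^{c+1}} \left( 2 \log j - \frac{2 h_w(\theta_w, \lambda_j) \lambda_j^{2d_0} \log \lambda_j}{1 + h_j(d_0, \theta)} \right)^{a} - (2 \log j)^{a} \right] \left( \frac{j}{m} \right)^{2b_0} \right| \\
&\quad = m^{-1} \sum_{j=1}^{m} \lambda_j^{2d_0} I_z(\lambda_j) \left[ \frac{1}{1 + O((j/n)^{2d_0})} \left( 2 \log j - \frac{O((j/n)^{2d_0} \log n)}{1 + O((j/n)^{2d_0})} \right)^{a} - (2 \log j)^{a} \right] \left( \frac{j}{m} \right)^{2b_0}
\end{aligned}$$
by Lemma 4(i), which proves the result for the first term of (28) by the same arguments as those
applied to (25).
For $n$ sufficiently large, $g_j(d_0, \theta_0) > 0$ for all $j = 1, \ldots, m$, and then the second term on the
left-hand side of (28) is
$$\begin{aligned}
& \sup_{\theta \in \Theta} \left| m^{-1} \sum_{j=1}^{m} \frac{I_z(\lambda_j)}{g_j(d_0, \theta_0)} \left( \frac{g_j(d_0, \theta_0)}{g_j(d_0, \theta)} - 1 \right) (2 \log j)^{a} \left( \frac{j}{m} \right)^{2b_0} \right| \\
&\quad = \sup_{\theta \in \Theta, \, j=1,\ldots,m} \left| \frac{g_j(d_0, \theta_0)}{g_j(d_0, \theta)} - 1 \right| m^{-1} \sum_{j=1}^{m} \frac{I_z(\lambda_j)}{g_j(d_0, \theta_0)} (2 \log j)^{a} \left( \frac{j}{m} \right)^{2b_0} \\
&\quad = \hat{G}_{a,b_0}(d_0, \theta_0) \sup_{\theta \in \Theta, \, j=1,\ldots,m} \left| \frac{h_j(d_0, \theta_0) - h_j(d_0, \theta)}{1 + h_j(d_0, \theta)} \right|,
\end{aligned}$$
noting that all the terms inside the summation on the right-hand side of the second equality are
positive. From Lemma 4(ii) and the fact that $\hat{G}_{a,b_0}(d_0, \theta_0) = O_P((\log m)^{a})$ by (24), it thus follows
that the second term on the left-hand side of (28) is $O_P((\log m)^{a} (1 + o(1))^{-1} \lambda_m^{2d_0})$, which proves (28).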
Next we prove the result for H2n. Again, it su¢ ces to show the result for the generic term Vn (d;�)
de�ned in (27), i.e. we must show that sup�2� jVn(d0;�)� Vn(d0;�0)j = oP (1). By (24) and (28) wehave that
sup�2�
G(d0;�) = G(1 + oP ((logm)�2)); (29)
and sup�2� jVn(d0;�)� Vn(d0;�0)j is, apart from a term that is negligible uniformly in �,
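\begin{align}
&\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\left(\frac{I_z(\lambda_j)}{g_j(d_0,\theta)}-\frac{1}{m}\sum_{k=1}^{m}\frac{I_z(\lambda_k)}{g_k(d_0,\theta)}\right)\frac{(j/m)^{2d_0}}{1+h_j(d_0,\theta)}q_j(d_0,\theta) \notag\\
&\qquad-\frac{1}{m}\sum_{j=1}^{m}\left(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-\frac{1}{m}\sum_{k=1}^{m}\frac{I_z(\lambda_k)}{g_k(d_0,\theta_0)}\right)\frac{(j/m)^{2d_0}}{1+h_j(d_0,\theta_0)}q_j(d_0,\theta_0)\Bigg| \notag\\
&\quad\le \sup_{\theta\in\Theta}\left|\frac{1}{m}\sum_{j=1}^{m}\left(\frac{I_z(\lambda_j)}{g_j(d_0,\theta)}\frac{(j/m)^{2d_0}}{1+h_j(d_0,\theta)}q_j(d_0,\theta)-\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}\frac{(j/m)^{2d_0}}{1+h_j(d_0,\theta_0)}q_j(d_0,\theta_0)\right)\right| \tag{30}\\
&\qquad+\sup_{\theta\in\Theta}\left|\frac{1}{m}\sum_{j=1}^{m}\frac{1}{m}\sum_{k=1}^{m}\left(\frac{I_z(\lambda_k)}{g_k(d_0,\theta)}\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}-\frac{I_z(\lambda_k)}{g_k(d_0,\theta_0)}\frac{q_j(d_0,\theta_0)}{1+h_j(d_0,\theta_0)}\right)\left(\frac{j}{m}\right)^{2d_0}\right|. \tag{31}
\end{align}
By the triangle inequality, (30) is bounded by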
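\begin{align}
&\sup_{\theta\in\Theta}\left|\frac{1}{m}\sum_{j=1}^{m}\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}\left(\frac{j}{m}\right)^{2d_0}\left(\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}-\frac{q_j(d_0,\theta_0)}{1+h_j(d_0,\theta_0)}\right)\right| \tag{32}\\
&\quad+\sup_{\theta\in\Theta}\left|\frac{1}{m}\sum_{j=1}^{m}\left(\frac{j}{m}\right)^{2d_0}\left(\frac{I_z(\lambda_j)}{g_j(d_0,\theta)}-\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}\right)\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\right|. \tag{33}
\end{align}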
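Note that, by inspection of the definition of $q_j(d,\theta)$ in (27), we have two prototypical expressions for
the difference appearing in (32),
\begin{align}
\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}-\frac{q_j(d_0,\theta_0)}{1+h_j(d_0,\theta_0)}=
\begin{cases}
O\left(\dfrac{(j/m)^{2d_0}}{1+h_j(d_0,\theta)}-\dfrac{(j/m)^{2d_0}}{1+h_j(d_0,\theta_0)}\right),\\[2ex]
O\left(4\left(h_w(\theta_w,\lambda_j)-h_w(\theta_{w,0},\lambda_j)\right)(m/n)^{2d_0}(\log\lambda_j)^2\right).
\end{cases}
\tag{34}
\end{align}
Inserting the first term of (34) into (32) we obtain, since for $n$ sufficiently large $g_j(d_0,\theta_0) > 0$ for all
$j = 1,\ldots,m$,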
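\begin{align*}
(32) &= O_P\left(\sup_{\theta\in\Theta}\left|\frac{1}{m}\sum_{j=1}^{m}\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}\left(\frac{j}{m}\right)^{4d_0}\left(\frac{(1+h_j(d_0,\theta_0))-(1+h_j(d_0,\theta))}{(1+h_j(d_0,\theta))(1+h_j(d_0,\theta_0))}\right)\right|\right)\\
&= O_P\left(G_{0,2d_0}(d_0,\theta_0)\sup_{\theta\in\Theta,\,j=1,\ldots,m}\left|\frac{h_j(d_0,\theta_0)-h_j(d_0,\theta)}{(1+h_j(d_0,\theta))(1+h_j(d_0,\theta_0))}\right|\right),
\end{align*}
and by Lemma 4(ii) it follows that (32) is $O_P(\lambda_m^{2d_0})$ in this case. Inserting the second term of (34)
into (32) we obtain
\begin{align*}
(32) &= O_P\left(G_{0,d_0}(d_0,\theta_0)\sup_{\theta\in\Theta,\,j=1,\ldots,m}\left|\left(h_w(\theta_w,\lambda_j)-h_w(\theta_{w,0},\lambda_j)\right)(m/n)^{2d_0}(\log\lambda_j)^2\right|\right)\\
&= O_P\left((m/n)^{2d_0}(\log n)^2\right)
\end{align*}
by compactness of $\Theta$. It follows that (32) is $o_P(1)$. Applying summation by parts to (33) we get the
bound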
sup�2�
������ qm(d0;�)
(1 + hm(d0;�))
1
m
mXj=1
�j
m
�2d0 � Iz (�j)
gj (d0;�)� Iz (�j)
gj (d0;�0)
�������+ sup�2�
������ 1mm�1Xk=1
�qk(d0;�)
(1 + hk(d0;�))� qk+1(d0;�)
(1 + hk+1(d0;�))
� kXj=1
�j
m
�2d0 � Iz (�j)
gj (d0;�)� Iz (�j)
gj (d0;�0)
������� ;where the �rst term is
sup�2�
���� qm(d0;�)
(1 + hm(d0;�))
1
G
�G0;d0(d0;�)�G0;d0(d0;�0)
����� = oP �(log n)(logm)�2�by (24), (28), and sup�2� qm(d0;�) = O(log n), and the second term is
OP
sup�2�
����� 1mm�1Xk=2
�qk(d0;�)
(1 + hk(d0;�))� qk+1(d0;�)
(1 + hk+1(d0;�))
��k
m
�2d0k(log k)�2
�����!
= OP
sup�2�
����� 1mm�1Xk=2
�qk(d0;�)� qk+1(d0;�)
(1 + hk(d0;�))
��k
m
�2d0k(log k)�2
�����!
+OP
sup�2�
����� 1mm�1Xk=2
qk+1(d0;�)
�1
(1 + hk(d0;�))� 1
(1 + hk+1(d0;�))
��k
m
�2d0k(log k)�2
�����!;
which, using qk(d0;�)� qk+1(d0;�) = O(k�1) and qk+1(d0;�) = O(log n) for any �, is
OP
1
m
m�1Xk=2
�k
m
�2d0(log k)�2 +
1
m
m�1Xk=1
(log n)��2d0k+1 � �
2d0k
�� km
�2d0k(log k)�2
!
= OP
1
m
m�1Xk=2
�k
m
�2d0(log k)�2 + (log n)�2d0m
1
m
m�1Xk=1
�k
m
�2d0(log k)�2
!= OP
�(logm)�2 + (log n)(logm)�2(m=n)2d0
�:
Thus both terms of (33) are oP (1) under (15).
Along the same lines we rewrite (31) as
sup�2�
������ 1mmXj=1
�qj(d0;�)
(1 + hj(d0;�))� qj(d0;�0)
(1 + hj(d0;�0))
��j
m
�2d0 1m
mXk=1
Iz (�k)
gk (d0;�0)
������+ sup�2�
������ 1mmXj=1
qj(d0;�)
(1 + hj(d0;�))
�j
m
�2d0 1m
mXk=1
�Iz (�k)
gk (d0;�)� Iz (�k)
gk (d0;�0)
�������92
and, using the de�nition of Ga;b(d;�), this is equal to
sup�2�
������G0;0(d0;�0)G
1
m
mXj=1
�qj(d0;�)
(1 + hj(d0;�))� qj(d0;�0)
(1 + hj(d0;�0))
��j
m
�2d0������+ sup�2�
������ 1G�G0;0(d0;�)� G0;0(d0;�0)
� 1m
mXj=1
qj(d0;�)
(1 + hj(d0;�))
�j
m
�2d0������= OP
0@sup�2�
������ 1mmXj=1
�qj(d0;�)
(1 + hj(d0;�))� qj(d0;�0)
(1 + hj(d0;�0))
��j
m
�2d0������1A
+oP
0@sup�2�
������(logm)�2 1mmXj=1
qj(d0;�)
(1 + hj(d0;�))
�j
m
�2d0������1A ;
where the second term is easily seen to be oP ((logm)�2(log n)) = oP (1). By (34), the �rst term is
OP
0@sup�2�
������ 1mmXj=1
�(j=m)4d0
(1 + hj(d0;�))� (j=m)4d0
(1 + hj(d0;�0))
�������1A
+OP
0@sup�2�
������ 1mmXj=1
(hw(�w; �j)� hw(�w;0; �j)) (m=n)2d0 (log �j)2�j
m
�2d0������1A
= OP
0@ 1
m
mXj=1
�j
m
�4d0sup�2�
���� hj(d0;�0)� hj(d0;�)(1 + hj(d0;�)) (1 + hj(d0;�0))
����1A
+OP
0@(m=n)2d0 1m
mXj=1
(log �j)2
�j
m
�2d01A= OP
0@ 1
m
mXj=1
�j
m
�4d0�2d0j + (m=n)2d0(log n)2
1
m
mXj=1
�j
m
�2d01A= OP
�(m=n)2d0(log n)2
�;
using compactness of � and Lemma 4. This is oP (1) which proves part (c).
C.6 Proof of (d)
Again, we first prove the result for $H_{1n}$, which follows if
supd2Dm(�n);�2�
��� ~Ga;b;c(d;�)� ~Ga;b0;c(d0;�)��� = oP ((logm)�2) (35)
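for $a, c = 0, 1, 2$ and $b = 0, 1, \ldots, 2R_y, d, d+1, \ldots, d+R_w+R_y, 2d, 2d+1, \ldots, 2d+2R_w$. Defining
\begin{align*}
\tilde E_{a,b,c}(d,\theta) &= \frac{1}{m}\sum_{j=1}^{m}\frac{j^{2d}I_z(\lambda_j)}{(1+h_j(d,\theta))^{c+1}}\left(2\log j-\frac{2h_w(\theta_w,\lambda_j)\lambda_j^{2d}\log\lambda_j}{1+h_j(d,\theta)}\right)^{a}\left(\frac{j}{m}\right)^{2b},\\
E_{a,b}(d,\theta) &= \frac{1}{m}\sum_{j=1}^{m}\frac{j^{2d}I_z(\lambda_j)}{1+h_j(d,\theta)}(2\log j)^{a}\left(\frac{j}{m}\right)^{2b},
\end{align*}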
we need to show that, for all $a, c = 0, 1, 2$ and $b = 0, 1, \ldots, 2R_y, d, d+1, \ldots, d+R_w+R_y, 2d, 2d+1, \ldots, 2d+2R_w$,
Za;b;c (�n) := supd2Dm(�n);�2�
��� ~Ea;b;c(d;�)� ~Ea;b0;c(d0;�)��� = oP (n2d0(logm)�2);
see also AS p. 600. Note that since $b$ can take values including $d$, we distinguish between $b$ and $b_0$,
which are obviously the same in case $b = 0, 1, \ldots, 2R_y$. By the triangle inequality it is sufficient to
show that
supd2Dm(�n);�2�
��� ~Ea;b;c(d;�)� Ea;b(d;�)���+ supd2Dm(�n);�2�
���Ea;b(d;�)� Ea;b0(d0;�)���+ sup�2�
���Ea;b0(d0;�)� ~Ea;b0;c(d0;�)���
=: Z1;a;b;c(�n) + Z2;a;b;c(�n) + Z3;a;b0;c(�n) = oP (n2d0(logm)�2):
The result for Z3;a;b0;c(�n) follows from part (c) of the lemma since it does not depend on d.
For Z2;a;b;c(�n) we �nd that
Z2;a;b;c(�n)
= supd2Dm(�n);�2�
������ 1mmXj=1
"�j2dIz(�j)
1 + hj(d;�)
��j
m
�2b��j2d0Iz(�j)
1 + hj(d0;�)
��j
m
�2b0#(2 log j)a
������= sup
d2Dm(�n);�2�
������ 1mmXj=1
�j2d � j2d0
�Iz(�j)
1
1 + hj(d;�)
�j
m
�2b(2 log j)a
������+ supd2Dm(�n);�2�
������ 1mmXj=1
j2d0Iz(�j)
1 + hj(d;�)
1 + hj(d0;�)
�j
m
�2b0�2b� 1!
1
1 + hj(d;�)
�j
m
�2b(2 log j)a
������ :Since � is compact and 0 < d1 � d � d2 <1, for n su¢ ciently large it holds that
infd2[d1;d2];�2�j=1;:::;m
j1 + hj(d;�)j � c > 0; supb�0;j=1;:::;m
jj=mj2b = 1: (36)
Thus, the �rst term of Z2;a;b;c(�n) is bounded by
supd2Dm(�n)
������c�1 1mmXj=1
j2d0Iz(�j)(2 log j)a���j2d�2d0 � 1���
������ ;which is oP (n2d0(logm)�2) as in (A.18) of AS.
The second term of Z2;a;b;c(�n) is bounded by
supd2Dm(�n);�2�
������ 1mmXj=1
j2d0Iz(�j)
�1 + hj(d;�)
1 + hj(d0;�)� 1�
1
1 + hj(d;�)
�j
m
�2b(2 log j)a
������ (37)
+ supd2Dm(�n);�2�
������ 1mmXj=1
j2d0Iz(�j)1 + hj(d;�)
1 + hj(d0;�)
�j
m
�2b0��j
m
�2b! 1
1 + hj(d;�)(2 log j)a
������ ;(38)94
and using (36) and Lemma 4(iii) we �nd that (37) is
oP
0@ 1
m
mXj=1
j2d0Iz(�j)�2d1m (logm)a
1A = oP
0@ 1
m
mXj=1
�2d0j Iz(�j)� n2�
�2d0�2d1m (logm)a
1A :Noting thatm�1Pm
j=1 �2d0j Iz(�j) = G0;0(d0;0) = G(1+oP ((logm)
�2)); (37) is equal to oP (n2d0�2d1m (logm)a) =
oP (n2d0(logm)�2). By the mean value theorem, xa = xb + (a � b)xa�(log x) for a � a� � b which
implies that
supd2Dm(�n)
������j
m
�2b0��j
m
�2b����� = O
supd2Dm(�n)
(b0 � b) (logm)!= O
�(logm)�5�n(logm)
�:
Thus, applying also (36) and Lemma 4(iii), (38) is
OP
0@ 1
m
mXj=1
j2d0Iz(�j)(logm)a�4�n
1A = OP
��nn
2d0(logm)a�4G0;0(d0;0)�
= OP
��nn
2d0(logm)a�4�
= oP
�n2d0(logm)�2
�since �n = o (1) and a � 2.
Next, Z1;a;b;c(�n) is
supd2Dm(�n)�2�
������ 1mmXj=1
j2d0Iz(�j)
(1 + hj(d;�))j2d�2d0
"1
(1 + hj(d;�))c
2 log j �
2hw(�w; �j)�2dj log �j
(1 + hj(d;�))
!a� (2 log j)a
#�j
m
�2b������ :If a = 0 the result follows by (36), Lemma 4(iv), supd2Dm(�n);j=1;:::;m j
2d�2d0 = O(1), and 1m
Pmj=1 j
2d0Iz(�j) =
G0;0(d0;0)(2�=n)�2d0 = OP (n
2d0). When a � 1 we apply the mean value theorem as in the proof of
(25) such that 2 log j �
2hw(�w; �j)�2dj log �j
(1 + hj(d;�))
!a� (2 log j)a = O
�(log j)a�1�2dj (log n)
�(39)
uniformly in � 2 �. We then bound Z1;a;b;c(�n) as
supd2Dm(�n)�2�
������ 1mmXj=1
j2d0Iz(�j)
(1 + hj(d;�))j2d�2d0
" 2 log j �
2hw(�w; �j)�2dj log �j
(1 + hj(d;�))
!a� (2 log j)a
#�j
m
�2b������+ supd2Dm(�n)�2�
������ 1mmXj=1
j2d0Iz(�j)
(1 + hj(d;�))j2d�2d0
�1
(1 + hj(d;�))c� 1�
2 log j �2hw(�w; �j)�
2dj log �j
(1 + hj(d;�))
!a�j
m
�2b������ ;where the �rst term is G0;0(d0;0)(2�=n)�2d0OP ((logm)a�1�2d1m (log n)) = oP (n
2d0(logm)�2) by (36)
and (39) and the second term is G0;0(d0;0)(2�=n)�2d0oP (�2d1m (logm)a) = oP (n2d0(logm)�2) by (36),
(39), and Lemma 4(iv).
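The mean value theorem step used above for (38) can be checked numerically. The sketch below is only an illustration, not part of the proof: it verifies that $|(j/m)^{2b_0} - (j/m)^{2b}| \le 2|b_0 - b|\log m$ for $j = 1,\ldots,m$. The function name `mvt_gap` and the values of `m`, `b0`, and `b` are hypothetical choices made for this example.

```python
import numpy as np

def mvt_gap(m, b, b0):
    """Largest value of |(j/m)^(2*b0) - (j/m)^(2*b)| over j = 1, ..., m."""
    x = np.arange(1, m + 1) / m
    return np.max(np.abs(x ** (2 * b0) - x ** (2 * b)))

m, b0 = 500, 0.4                      # hypothetical bandwidth and reference exponent
for b in (0.35, 0.39, 0.41, 0.45):    # hypothetical exponents near b0
    gap = mvt_gap(m, b, b0)
    bound = 2 * abs(b0 - b) * np.log(m)
    print(f"b={b:.2f}  gap={gap:.5f}  bound={bound:.5f}")
```

The bound holds because $x^{2b_0} - x^{2b} = 2(b_0 - b)x^{2b^*}\log x$ for some $b^*$ between $b$ and $b_0$, with $x^{2b^*} \le 1$ and $|\log x| \le \log m$ on $[1/m, 1]$.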
We proceed to show that supd2Dm(�n);�2�B�1n kH2n(d;�)�H2n(d0;�)kB�1n = oP (1) or equiva-
lently that supd2Dm(�n);�2� jVn(d;�)� Vn(d0;�)j = oP (1). Since we have shown (35) we have that
G(d;�)P! G uniformly in � 2 �; d 2 Dm(�n), so we need to show that the following is oP (1) :
supd2Dm(�n);�2�
������ 1mmXj=1
Iz (�j)
gj (d;�)� 1
m
mXk=1
Iz (�k)
gk (d;�)
!(j=m)2d
(1 + hj(d;�))qj(d;�)
� 1m
mXj=1
Iz (�j)
gj (d0;�)� 1
m
mXk=1
Iz (�k)
gk (d0;�)
!(j=m)2d0
(1 + hj(d0;�))qj(d0;�)
������� sup
d2Dm(�n);�2�
������ 1mmXj=1
Iz (�j)
gj (d;�)
qj(d;�)
(1 + hj(d;�))
�j
m
�2d� Iz (�j)
gj (d0;�)
qj(d0;�)
(1 + hj(d0;�))
�j
m
�2d0!������ (40)
+ supd2Dm(�n);�2�
������ 1mmXj=1
1
m
mXk=1
Iz (�k)
gk (d0;�)
qj(d0;�)
(1 + hj(d0;�))
�j
m
�2d0� Iz (�k)
gk (d;�)
qj(d;�)
(1 + hj(d;�))
�j
m
�2d!������ :(41)
By the triangle inequality we get the bounds
(40) � supd2Dm(�n);�2�
������ 1mmXj=1
Iz (�j)
gj (d0;�)
�j
m
�2d��j
m
�2d0! qj(d0;�)
(1 + hj(d0;�))
������ (42)
+ supd2Dm(�n);�2�
������ 1mmXj=1
Iz (�j)
gj (d0;�)
�j
m
�2d�gj (d0;�)gj (d;�)
� 1�
qj(d0;�)
(1 + hj(d0;�))
������ (43)
+ supd2Dm(�n);�2�
������ 1mmXj=1
Iz (�j)
gj (d;�)
�j
m
�2d� qj(d;�)
(1 + hj(d;�))� qj(d0;�)
(1 + hj(d0;�))
������� (44)
and
(41) � supd2Dm(�n);�2�
������ 1mmXj=1
�j
m
�2d0��j
m
�2d! qj(d0;�)
(1 + hj(d0;�))
1
m
mXk=1
Iz (�k)
gk (d0;�)
������ (45)
+ supd2Dm(�n);�2�
������ 1mmXj=1
�j
m
�2d qj(d0;�)
(1 + hj(d0;�))
1
m
mXk=1
Iz (�k)
gk (d0;�)
�1� gk (d0;�)
gk (d;�)
������� (46)
+ supd2Dm(�n);�2�
������ 1mmXj=1
�j
m
�2d� qj(d0;�)
(1 + hj(d0;�))� qj(d;�)
(1 + hj(d;�))
�1
m
mXk=1
Iz (�j)
gk (d;�)
������ :(47)The required results for (42) and (45) follow using the mean value theorem as in (38), whereas the
results for (43) and (46) follow as in (37). For (44) and (47) we note that, by inspection of the
de�nition of qj(d;�) in (27), c.f. (34), it is su¢ cient to show the result for
qj(d;�)
(1 + hj(d;�))� qj(d0;�)
(1 + hj(d0;�))=
(j=m)2d
(1 + hj(d;�))� (j=m)2d0
(1 + hj(d0;�))
=
�j
m
�2d� 1
(1 + hj(d;�))� 1
(1 + hj(d0;�))
�+
1
(1 + hj(d0;�))
�j
m
�2d��j
m
�2d0!:
Inserting this into (44) ((47) follows the same way) we get the bound
(44) � supd2Dm(�n);�2�
������ 1mmXj=1
Iz (�j)
gj (d;�)
�j
m
�4d� 1
(1 + hj(d;�))� 1
(1 + hj(d0;�))
�������+ supd2Dm(�n);�2�
������ 1mmXj=1
Iz (�j)
gj (d;�)
�j
m
�2d 1
(1 + hj(d0;�))
�j
m
�2d��j
m
�2d0!������ ;which we can handle similarly to (37) respectively (38).
Appendix D: Auxiliary lemmas
We now state two useful lemmas, which are used in the proofs of the main theorems. The first is stated
without proof and gathers some properties of the function $h_j(d,\theta)$, which all follow by compactness
of $\Theta$.
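Lemma 4 Let $h_j(d,\theta) = h(d,\theta,\lambda_j) = \sum_{r=1}^{R_y}\theta_{y,r}\lambda_j^{2r} + \lambda_j^{2d}\sum_{r=0}^{R_w}\theta_{w,r}\lambda_j^{2r}$, $0 < d_1 < d_2 < 1$, and let $\Theta$
be compact. Then, as $n\to\infty$ and for $c = 0, 1, 2$,
(i) $\sup_{\theta\in\Theta}|(1+h_j(d_0,\theta))^{c+1}-1| = O(\sup_{\theta\in\Theta}h_j(d_0,\theta)) = O((j/n)^{2d_0})$;
(ii) $\inf_{d\in[d_1,d_2],\,\theta\in\Theta,\,j=1,\ldots,m}|1+h_j(d,\theta)| = 1+o(1)$ and $\sup_{\theta\in\Theta,\,j=1,\ldots,m}|h_j(d_0,\theta_0)-h_j(d_0,\theta)| = O(\lambda_m^{2d_0})$;
(iii) $\sup_{d\in[d_1,d_2],\,\theta\in\Theta,\,j=1,\ldots,m}\left|\dfrac{1+h_j(d,\theta)}{1+h_j(d_0,\theta)}-1\right| = O\left(\sup_{d\in[d_1,d_2],\,\theta\in\Theta,\,j=1,\ldots,m}\left|\dfrac{\theta_{r+1}(\lambda_j^{2d}-\lambda_j^{2d_0})}{1+h_j(d_0,\theta)}\right|\right) = O(\lambda_m^{2d_1})$;
(iv) $\sup_{d\in[d_1,d_2],\,\theta\in\Theta,\,j=1,\ldots,m}|(1+h_j(d,\theta))^{c}-1| = O(\sup_{d\in[d_1,d_2],\,\theta\in\Theta,\,j=1,\ldots,m}h_j(d,\theta)) = O(\lambda_m^{2d_1})$.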
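To make the function in Lemma 4 concrete, the following sketch evaluates $h(d,\theta,\lambda_j)$ at the first $m$ Fourier frequencies and compares its size with $\lambda_m^{2d}$. It is an illustration under assumed values: the coefficients `theta_y` and `theta_w`, the sample size `n`, the bandwidth `m`, and the memory parameter `d` are all hypothetical and are not taken from the text.

```python
import numpy as np

def h(d, theta_y, theta_w, lam):
    """h(d, theta, lam) = sum_{r=1}^{Ry} theta_y[r-1] * lam^(2r)
                          + lam^(2d) * sum_{r=0}^{Rw} theta_w[r] * lam^(2r)."""
    lam = np.asarray(lam, dtype=float)
    y_part = sum(t * lam ** (2 * r) for r, t in enumerate(theta_y, start=1))
    w_part = sum(t * lam ** (2 * r) for r, t in enumerate(theta_w, start=0))
    return y_part + lam ** (2 * d) * w_part

n, m = 4096, 337                          # hypothetical sample size and bandwidth
lam = 2 * np.pi * np.arange(1, m + 1) / n # Fourier frequencies lambda_1, ..., lambda_m
theta_y, theta_w = [0.5], [1.0, -0.3]     # hypothetical coefficients (Ry = 1, Rw = 1)
d = 0.4                                   # hypothetical memory parameter in (0, 1)

hj = h(d, theta_y, theta_w, lam)
# Crude bound valid when lambda_m <= 1 and 0 < d <= 1:
# |h_j| <= (sum |theta_y| + sum |theta_w|) * lambda_m^(2d)
bound = (sum(map(abs, theta_y)) + sum(map(abs, theta_w))) * lam[-1] ** (2 * d)
print(np.max(np.abs(hj)), bound, np.min(1 + hj))
```

With these values $\lambda_m < 1$, so every power $\lambda_j^{2r}$ with $r \ge 1$ is dominated by $\lambda_m^{2d}$, which is the order of magnitude appearing in parts (ii) to (iv) of the lemma.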
The next lemma provides approximations of the periodogram of $z_t$ by that of $\varepsilon_t$, following well
known results from, e.g., Robinson (1995a), Velasco (1999), AS, and Hurvich et al. (2005).
Lemma 5 Let Assumptions A1-A6 hold. Then, as $n\to\infty$ and for all $k = 1, \ldots, m$,
kXj=1
�Iz (�j)
gj (d0;�0)� 2�I" (�j)
�= OP
��k;n(d0) + k
'y+1n�'y + kd0+'w+1n�d0�'w + k1+2d0n�2d0 + k2d0n�d0(log k)�
and
kXj=1
�Iz (�j)
gj (d0;�0)� 2�I" (�j)� E
�Iz (�j)
gj (d0;�0)� 2�I" (�j)
��= OP
��k;n(d0) + k
'y+1=2n�'y + k1=2+2d0n�2d0�;
where
�k;n(d) = k1=3(log k)2=3 + k1=2n�1=4
in the stationary case and
�k;n(d) = k1=(5�4d)(log k)2=(5�4d) + k2d�1(log k) + n�1=2k(1+d)=2(log n)5=4 + n�1=4kd(log k)1=2
in the nonstationary case.
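Proof. Note that, in the nonstationary case, Hurvich et al. (2005) examine the difference between
the normalized periodograms of $z_t$ and $\Delta y_t$ (in our notation), whereas we examine the difference
between the normalized periodograms of $z_t$ and $y_t$ itself in both the stationary and nonstationary
cases.
Define $\tilde g_j(d,\theta) = \lambda_j^{-2d}G_0(1+h_y(\theta_y,\lambda_j))$ and write
\begin{align}
\sum_{j=1}^{k}\left(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\right)
&= \sum_{j=1}^{k}\left(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}\right) \tag{48}\\
&\quad+\sum_{j=1}^{k}\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\right). \tag{49}
\end{align}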
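In the stationary case (49) is $O_P(k^{1/3}(\log k)^{2/3} + k^{\varphi_y+1}n^{-\varphi_y} + k^{1/2}n^{-1/4})$ by (A.13)(i) of AS, and in the
nonstationary case (49) is $O_P(k^{1/(5-4d_0)}(\log k)^{2/(5-4d_0)} + k^{\varphi_y+1}n^{-\varphi_y} + k^{2d_0-1}(\log k) + n^{-1/2}k^{(1+d_0)/2}(\log n)^{5/4} + n^{-1/4}k^{d_0}(\log k)^{1/2})$
by slight modification of Lemma 1 of Velasco (1999) to account for the better approximation
of $f_y(\lambda_j)$ by $\tilde g_j(d_0,\theta_0)$ due to our polynomial appearing in $\tilde g_j(d_0,\theta_0)$ (the required modification
is the same as that used by AS to modify (4.8) of Robinson (1995a) to obtain their (A.13)(i)).
The term (48) is
\begin{align}
\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}
&= \frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-1\right) \tag{50}\\
&\quad+\frac{2\sqrt{h_w(\theta_w,\lambda_j)}\sqrt{\tilde g_j(d_0,\theta_0)}}{g_j(d_0,\theta_0)}\,\frac{\mathrm{Re}(I_{yw}(\lambda_j))}{\sqrt{\tilde g_j(d_0,\theta_0)}\sqrt{h_w(\theta_w,\lambda_j)}} \tag{51}\\
&\quad+\frac{I_w(\lambda_j)+\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}, \tag{52}
\end{align}
where $I_{ab}(\lambda) = \frac{1}{2\pi n}\sum_{t=1}^{n}\sum_{s=1}^{n}a_t b_s e^{i(s-t)\lambda}$ denotes the cross-periodogram between the two series $a_t$
and $b_t$. Using summation by parts on (50) we find that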
\begin{align*}
&\sum_{j=1}^{k}\frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-1\right)\\
&\quad=\sum_{j=1}^{k-1}\left(\frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}-\frac{\tilde g_{j+1}(d_0,\theta_0)-g_{j+1}(d_0,\theta_0)}{g_{j+1}(d_0,\theta_0)}\right)\sum_{l=1}^{j}\left(\frac{I_y(\lambda_l)}{\tilde g_l(d_0,\theta_0)}-1\right)\\
&\qquad+\frac{\tilde g_k(d_0,\theta_0)-g_k(d_0,\theta_0)}{g_k(d_0,\theta_0)}\sum_{j=1}^{k}\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-1\right),
\end{align*}
which is $O_P((k/n)^{2d_0}(k^{1/3}(\log k)^{2/3} + k^{\varphi_y+1}n^{-\varphi_y} + k^{1/2}n^{-1/4} + k^{1/2}))$ in the stationary case whereas
it is $O_P((k/n)^{2d_0}(k^{1/(5-4d_0)}(\log k)^{2/(5-4d_0)} + k^{\varphi_y+1}n^{-\varphi_y} + k^{2d_0-1}(\log k) + n^{-1/2}k^{(1+d_0)/2}(\log n)^{5/4} + n^{-1/4}k^{d_0}(\log k)^{1/2} + k^{1/2}))$
in the nonstationary case, by the same methods as applied previously
and using also (4.9) of Robinson (1995a) and that $|\tilde g_j(d_0,\theta_0)/g_j(d_0,\theta_0)-1| \le C(j/n)^{2d_0}$. Next,
(52) is easily seen to be $O_P((j/n)^{2d_0})$ because $E|I_w(\lambda_j)| = O_P(1)$ uniformly in $j = 1, \ldots, m$. Since
$\{y_t\}$ and $\{w_t\}$ are independent, (51) is $O_P((j/n)^{d_0}(j^{-1}(\log j) + (j/n)^{\min(\varphi_y,\varphi_w)}))$ in the stationary
case by Theorem 2 of Robinson (1995b), yielding a contribution to (48) of $O_P((k/n)^{d_0}((\log k) + k^{1+\min(\varphi_y,\varphi_w)}n^{-\min(\varphi_y,\varphi_w)}))$.
In the nonstationary case we use Theorem 1 of Velasco (1999), which
shows that $\mathrm{Re}(I_{yw}(\lambda_j))\,|\tilde g_j(d_0,\theta_0)|^{-1/2}|h_w(\theta_w,\lambda_j)|^{-1/2} = O_P(j^{2d_0-2}(\log j) + (j/n)^{\min(\varphi_y,\varphi_w)})$, yielding
a contribution to (48) of $O_P((k/n)^{d_0}(k^{d_0}(\log k) + k^{1+\min(\varphi_y,\varphi_w)}n^{-\min(\varphi_y,\varphi_w)}))$ (Velasco's result has
to be modified to accommodate multivariate time series, but the modification is simple by comparing
e.g. his equation (A.1) with equation (4.3) of Robinson (1995b), see also the second to last equation
on p. 108 of Velasco (1999)). The difference in the remainder terms relative to Robinson (1995b) and
Velasco (1999) is due to the different remainder term in the approximation of $f_y(\lambda_j)$ by $\tilde g_j(d_0,\theta_0)$ due
to our polynomial appearing in $\tilde g_j(d_0,\theta_0)$.
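The cross-periodogram used in this proof, $I_{ab}(\lambda) = (2\pi n)^{-1}\sum_t\sum_s a_t b_s e^{i(s-t)\lambda}$, can be computed at the Fourier frequencies with a fast Fourier transform. The sketch below is an illustration only: the function names and the simulated white-noise series are hypothetical. It checks the FFT form against the double sum and confirms the decomposition $I_z = I_y + I_w + 2\,\mathrm{Re}(I_{yw})$ for $z_t = y_t + w_t$, which underlies the split into (50) to (52).

```python
import numpy as np

def cross_periodogram(a, b):
    """I_ab(lambda_j) for j = 0, ..., n-1 at lambda_j = 2*pi*j/n, via the FFT."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = a.size
    # sum_t a_t e^{-i t lam} times sum_s b_s e^{+i s lam}; index offsets cancel
    return np.fft.fft(a) * np.conj(np.fft.fft(b)) / (2 * np.pi * n)

def cross_periodogram_direct(a, b, lam):
    """Direct evaluation of the double sum at a single frequency lam."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = a.size
    t = np.arange(1, n + 1)
    phase = np.exp(1j * (t[None, :] - t[:, None]) * lam)  # entry (t, s): e^{i(s-t)lam}
    return (a[:, None] * b[None, :] * phase).sum() / (2 * np.pi * n)

rng = np.random.default_rng(0)
n, j = 64, 5                       # hypothetical sample size and frequency index
y = rng.standard_normal(n)         # placeholder series, not the processes in the text
w = rng.standard_normal(n)
z = y + w
lam_j = 2 * np.pi * j / n

I_yw_fft = cross_periodogram(y, w)[j]
I_yw_dir = cross_periodogram_direct(y, w, lam_j)
I_z = cross_periodogram(z, z).real
I_sum = (cross_periodogram(y, y) + cross_periodogram(w, w)
         + 2 * cross_periodogram(y, w)).real
print(abs(I_yw_fft - I_yw_dir), np.max(np.abs(I_z - I_sum)))
```

Setting `a` equal to `b` recovers the ordinary periodogram, which is real and non-negative.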
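To prove the second result we write
\begin{align}
&\sum_{j=1}^{k}\left(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)-E\left(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\right)\right) \notag\\
&\quad=\sum_{j=1}^{k}\left(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-E\left(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}\right)\right) \tag{53}\\
&\qquad+\sum_{j=1}^{k}\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)-E\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\right)\right). \tag{54}
\end{align}
By (A.21) of AS, (54) is $O_P(k^{1/3}(\log k)^{2/3} + k^{\varphi_y+1/2}n^{-\varphi_y} + k^{1/2}n^{-1/4})$ in the stationary case, and by
(slight modification of) Lemma 1 of Velasco (1999), (54) is $O_P(k^{1/(5-4d_0)}(\log k)^{2/(5-4d_0)} + k^{\varphi_y+1/2}n^{-\varphi_y} + k^{2d_0-1}(\log k) + n^{-1/2}k^{(1+d_0)/2}(\log n)^{5/4} + n^{-1/4}k^{d_0}(\log k)^{1/2})$
in the nonstationary case. For eq. (53)
we write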
Iz (�j)
gj (d0;�0)� Iy (�j)
~gj(d0;�0)� E
�Iz (�j)
gj (d0;�0)� Iy (�j)
~gj(d0;�0)
�=
~gj(d0;�0)� gj (d0;�0)gj (d0;�0)
��Iy (�j)
~gj(d0;�0)� 2�I" (�j)
�� E
�Iy (�j)
~gj(d0;�0)� 2�I" (�j)
��(55)
+~gj(d0;�0)� gj (d0;�0)
gj (d0;�0)(2�I" (�j)� 1) (56)
+2p~gj(d0;�0)
gj (d0;�0)
Re (Iyw(�j)� EIyw(�j))p~gj(d0;�0)
(57)
+hw(�w;0; �j)
gj (d0;�0)
��Iw (�j)
hw(�w;0; �j)� 2�I�(�j)
�� E
�Iw (�j)
hw(�w;0; �j)� 2�I�(�j)
��(58)
+hw(�w;0; �j)
gj (d0;�0)(2�I�(�j)� 1) ; (59)
using also that hw(�w;0; �j) = gj (d0;�0)� ~gj(d0;�0).
Using summation by parts we �nd that (59) is
kXj=1
hw(�w;0; �j)
gj (d0;�0)(2�I�(�j)� 1) =
k�1Xj=1
�hw(�w;0; �j)
gj (d0;�0)� hw(�w;0; �j+1)
gj+1 (d0;�0)
� jXl=1
(2�I�(�l)� 1)
+hw(�w;0; �k)
gk (d0;�0)
kXj=1
(2�I�(�j)� 1)
=
k�1Xj=1
����hw(�w;0; �j)gj+1 (d0;�0)� hw(�w;0; �j+1)gj (d0;�0)gj (d0;�0) gj+1 (d0;�0)
����OP (j1=2)+hw(�w;0; �k)
gk (d0;�0)OP (k
1=2)
= OP
0@k�1Xj=1
j2d0�1=2n�2d0
1A+OP (k1=2+2d0n�2d0)= OP (k
1=2+2d0n�2d0);
using (4.9) of Robinson (1995a) for the second equality. The term (56) is handled in exactly the same
way yielding the same contribution. For the term (57) we can split it up in the same way as (58) and
(59), and the contribution is the same.
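The summation by parts (Abel summation) identity applied repeatedly in this proof is $\sum_{j=1}^{k}a_j b_j = \sum_{j=1}^{k-1}(a_j - a_{j+1})S_j + a_k S_k$ with $S_j = \sum_{l=1}^{j}b_l$. The sketch below checks the identity numerically; the function name and the random inputs are hypothetical and serve only as an illustration.

```python
import numpy as np

def summation_by_parts(a, b):
    """Evaluate sum_j a_j b_j as sum_{j<k} (a_j - a_{j+1}) S_j + a_k S_k."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    S = np.cumsum(b)                               # partial sums S_j = b_1 + ... + b_j
    return np.sum((a[:-1] - a[1:]) * S[:-1]) + a[-1] * S[-1]

rng = np.random.default_rng(1)
a = rng.standard_normal(50)    # plays the role of the smooth weights, e.g. h_w / g_j
b = rng.standard_normal(50)    # plays the role of the centred periodogram terms
print(summation_by_parts(a, b), np.dot(a, b))
```

The identity is useful here because the weights change slowly in $j$, so the differences $a_j - a_{j+1}$ are small, while the partial sums $S_j$ are controlled by existing bounds such as (4.9) of Robinson (1995a).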
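Using summation by parts on (55) we find that, in the stationary case,
\begin{align*}
&\sum_{j=1}^{k}\frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}\left[\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\right)-E\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\right)\right]\\
&\quad=\sum_{j=1}^{k-1}\left(\frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}-\frac{\tilde g_{j+1}(d_0,\theta_0)-g_{j+1}(d_0,\theta_0)}{g_{j+1}(d_0,\theta_0)}\right)\\
&\qquad\times\sum_{l=1}^{j}\left[\left(\frac{I_y(\lambda_l)}{\tilde g_l(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_l)\right)-E\left(\frac{I_y(\lambda_l)}{\tilde g_l(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_l)\right)\right]\\
&\qquad+\frac{\tilde g_k(d_0,\theta_0)-g_k(d_0,\theta_0)}{g_k(d_0,\theta_0)}\sum_{j=1}^{k}\left[\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\right)-E\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\right)\right]\\
&\quad=O_P\left((k/n)^{2d_0}(k^{1/3}(\log k)^{2/3}+k^{\varphi_y+1/2}n^{-\varphi_y}+k^{1/2}n^{-1/4})\right)
\end{align*}
using (A.21) of AS. In the nonstationary case we use Lemma 1 of Velasco (1999) and get
\begin{align*}
&\sum_{j=1}^{k}\frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}\left[\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\right)-E\left(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\right)\right]\\
&\quad=O_P\big((k/n)^{2d_0}(k^{1/(5-4d_0)}(\log k)^{2/(5-4d_0)}+k^{\varphi_y+1/2}n^{-\varphi_y}\\
&\qquad\qquad+k^{2d_0-1}(\log k)+n^{-1/2}k^{(1+d_0)/2}(\log n)^{5/4}+n^{-1/4}k^{d_0}(\log k)^{1/2})\big).
\end{align*}
Finally the term (58) is handled in exactly the same way as the stationary case of (55), yielding the
contribution $O_P\left((k/n)^{2d_0}(k^{1/3}(\log k)^{2/3}+k^{\varphi_w+1/2}n^{-\varphi_w}+k^{1/2}n^{-1/4})\right)$.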
References
Andersen, T. G., Bollerslev, T., Diebold, F. X. & Ebens, H. (2001), ‘The distribution of realized stock
return volatility’, Journal of Financial Economics 61, 43–76.
Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2001), ‘The distribution of realized
exchange rate volatility’, Journal of the American Statistical Association 96, 42–55.
Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2003), ‘Modelling and forecasting realized
volatility’, Econometrica 71, 579–625.
Andrews, D. W. K. & Guggenberger, P. (2003), ‘A bias-reduced log-periodogram regression estimator
for the long memory parameter’, Econometrica 71, 675–712.
Andrews, D. W. K. & Sun, Y. (2004), ‘Adaptive local polynomial Whittle estimation of long-range
dependence’, Econometrica 72, 569–614.
Arteche, J. (2004), ‘Gaussian semiparametric estimation in long memory in stochastic volatility and
signal plus noise models’, Journal of Econometrics 119, 131–154.
Baillie, R. T. (1996), ‘Long memory processes and fractional integration in econometrics’, Journal of
Econometrics 73, 5–59.
Baillie, R. T., Bollerslev, T. & Mikkelsen, H. O. (1996), ‘Fractionally integrated generalized
autoregressive conditional heteroscedasticity’, Journal of Econometrics 74, 3–30.
Beran, J. (1994), Statistics for Long-Memory Processes, Chapman-Hall, New York.
Bollerslev, T. & Jubinski, D. (1999), ‘Equity trading volume and volatility: Latent information arrivals
and common long-run dependencies’, Journal of Business and Economic Statistics 17, 9–21.
Breidt, F. J., Crato, N. & de Lima, P. (1998), ‘The detection and estimation of long memory in
stochastic volatility’, Journal of Econometrics 83, 325–348.
Comte, F. & Renault, E. (1998), ‘Long memory in continuous-time stochastic volatility models’,
Mathematical Finance 8, 291–323.
Davies, R. B. & Harte, D. S. (1987), ‘Tests for Hurst effects’, Biometrika 74, 95–102.
Deo, R. S. & Hurvich, C. M. (2001), ‘On the log periodogram regression estimator of the memory
parameter in long memory stochastic volatility models’, Econometric Theory 17, 686–710.
Ding, Z., Granger, C. W. J. & Engle, R. F. (1993), ‘A long memory property of stock returns and a
new model’, Journal of Empirical Finance 1, 83–106.
Fox, R. & Taqqu, M. S. (1986), ‘Large-sample properties of parameter estimates for strongly dependent
stationary Gaussian time series’, The Annals of Statistics 14, 517–532.
Frederiksen, P. H. & Nielsen, M. Ø. (2008), ‘Bias-reduced estimation of long memory stochastic
volatility’, Forthcoming in Journal of Financial Econometrics.
Fuller, W. A. (1996), Introduction to statistical time series, Wiley, New York.
Geweke, J. & Porter-Hudak, S. (1983), ‘The estimation and application of long-memory time series
models’, Journal of Time Series Analysis 4, 221–238.
Harvey, A. (1989), Forecasting, structural time series models and the Kalman filter, Cambridge
University Press, Cambridge.
Harvey, A. (1998), Long memory in stochastic volatility, in J. Knight & S. Satchell, eds, ‘Forecasting
Volatility in Financial Markets’, Butterworth-Heinemann, London, pp. 307–320.
Henry, M. & Zaffaroni, P. (2003), The long range paradigm for macroeconomics and finance, in
P. Doukhan, G. Oppenheim & M. S. Taqqu, eds, ‘Theory and Applications of Long-Range
Dependence’, Birkhäuser, Boston, pp. 417–438.
Hurvich, C. M., Moulines, E. & Soulier, P. (2005), ‘Estimating long memory in volatility’,
Econometrica 73, 1283–1328.
Hurvich, C. M. & Ray, B. K. (1995), ‘Estimation of the memory parameter for nonstationary or
noninvertible fractionally integrated processes’, Journal of Time Series Analysis 16, 17–41.
Hurvich, C. M. & Ray, B. K. (2003), ‘The local Whittle estimator of long-memory stochastic volatility’,
Journal of Financial Econometrics 1, 445–470.
Künsch, H. R. (1987), Statistical aspects of self-similar processes, in Y. Prokhorov & V. V. Sazanov,
eds, ‘Proceedings of the First World Congress of the Bernoulli Society’, VNU Science Press,
Utrecht, pp. 67–74.
Liesenfeld, R. (2001), ‘A generalized bivariate mixture model for stock price volatility and trading
volume’, Journal of Econometrics 104, 141–178.
Marinucci, D. & Robinson, P. M. (1999), ‘Alternative forms of fractional Brownian motion’, Journal
of Statistical Planning and Inference 80, 111–122.
Ray, B. K. & Tsay, R. (2000), ‘Long-range dependence in daily stock volatility’, Journal of Business
and Economic Statistics 18, 254–262.
Robinson, P. M. (1995a), ‘Gaussian semiparametric estimation of long range dependence’, The Annals
of Statistics 23, 1630–1661.
Robinson, P. M. (1995b), ‘Log-periodogram regression of time series with long range dependence’, The
Annals of Statistics 23, 1048–1072.
Robinson, P. M. (2003), Long-memory time series, in P. M. Robinson, ed., ‘Time Series With Long
Memory’, Oxford University Press, Oxford, pp. 4–32.
Sun, Y. & Phillips, P. C. B. (2003), ‘Nonlinear log-periodogram regression for perturbed fractional
processes’, Journal of Econometrics 115, 355–389.
Velasco, C. (1999), ‘Gaussian semiparametric estimation of non-stationary time series’, Journal of
Time Series Analysis 20, 87–127.
Wright, J. H. (2002), ‘Log periodogram estimation of long memory volatility dependencies with
conditionally heavy tailed returns’, Econometric Reviews 21, 397–417.
Table 1: Simulation results for Model I

nsr  n          LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
              Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: $m = \lfloor n^{0.7} \rfloor$
5    2048   0.0109   0.2037   0.0184   0.2707   0.0124   0.2606   0.0135   0.2917
     4096  -0.0040   0.1280   0.0135   0.2093  -0.0029   0.1791   0.0085   0.2204
     8192   0.0061   0.0911   0.0128   0.1479   0.0026   0.1205   0.0103   0.1553
10   2048   0.0140   0.2639   0.0166   0.3106   0.0264   0.3123   0.0207   0.3271
     4096   0.0035   0.1793   0.0146   0.2462   0.0020   0.2268   0.0212   0.2606
     8192   0.0019   0.1194   0.0032   0.1570   0.0191   0.1901   0.0161   0.1947
20   2048   0.0004   0.3373  -0.0348   0.3391  -0.0097   0.3567  -0.0253   0.3524
     4096  -0.0005   0.2474   0.0003   0.2922  -0.0001   0.2840   0.0026   0.3023
     8192  -0.0047   0.2175  -0.0009   0.2392   0.0003   0.2380  -0.0001   0.2419
Panel B: $m = \lfloor n^{0.8} \rfloor$
5    2048  -0.0002   0.1567   0.0002   0.2154  -0.0081   0.1953  -0.0139   0.2279
     4096   0.0015   0.0966   0.0076   0.1506  -0.0020   0.1233   0.0054   0.1759
     8192   0.0054   0.0706   0.0075   0.1025   0.0044   0.0907   0.0082   0.1244
10   2048   0.0057   0.2276   0.0056   0.2777   0.0094   0.2738  -0.0224   0.2735
     4096   0.0078   0.1399   0.0155   0.1930   0.0089   0.1774  -0.0095   0.1992
     8192   0.0047   0.0917   0.0125   0.1410   0.0034   0.1177   0.0002   0.1385
20   2048  -0.0152   0.3011  -0.0549   0.3058  -0.0294   0.3212  -0.1062   0.2997
     4096   0.0002   0.2201  -0.0055   0.2531  -0.0020   0.2518  -0.0445   0.2366
     8192   0.0073   0.1361   0.0146   0.1768   0.0073   0.1629  -0.0230   0.1629
Note: The polynomial approximation used under the heading "LPWN($R_y$, $R_w$)" is ($R_y$, $R_w$).
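For readers who want to reproduce the layout of these tables, the sketch below shows how the two reported summary statistics and the bandwidths could be computed. It is a hedged illustration, not the simulation code behind the tables: the function names are hypothetical, the example estimates are made up, and reading the bracket in the panel headings as the integer part (floor) is an assumption.

```python
import math
import numpy as np

def bandwidth(n, power):
    """Bandwidth m = floor(n**power); the floor reading of the bracket is assumed."""
    return math.floor(n ** power)

def bias_rmse(estimates, true_value):
    """Monte Carlo bias and root mean squared error of a set of estimates."""
    err = np.asarray(estimates, dtype=float) - true_value
    return err.mean(), math.sqrt(np.mean(err ** 2))

# Bandwidths for the sample sizes and powers used in the panels
for n in (2048, 4096, 8192):
    print(n, bandwidth(n, 0.7), bandwidth(n, 0.8))

# Made-up estimates of a memory parameter whose true value is taken to be 0.4
bias, rmse = bias_rmse([0.3, 0.5], 0.4)
print(bias, rmse)
```

Since RMSE squared equals bias squared plus variance, an RMSE that is large relative to the bias, as in most cells of the tables, indicates that variance dominates.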
Table 2: Simulation results for Model II with (�y, �y) = (0, 0) and (�x, �x) = (0.5, 0)

nsr  n          LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
              Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: $m = \lfloor n^{0.7} \rfloor$
5    2048  -0.0620   0.1977   0.0136   0.2678  -0.0045   0.2510   0.0042   0.2885
     4096  -0.0528   0.1346   0.0134   0.2116  -0.0131   0.1738   0.0080   0.2182
     8192  -0.0228   0.0931   0.0174   0.1523   0.0005   0.1250   0.0126   0.1595
10   2048  -0.0937   0.2484   0.0041   0.2968   0.0044   0.3052   0.0167   0.3263
     4096  -0.0661   0.1818   0.0170   0.2449  -0.0040   0.2235   0.0235   0.2653
     8192  -0.0416   0.1191   0.0144   0.1873  -0.0048   0.1600   0.0093   0.1906
20   2048  -0.1102   0.3165  -0.0479   0.3250  -0.0186   0.3525  -0.0271   0.3566
     4096  -0.1043   0.2482  -0.0071   0.2815  -0.0126   0.2819   0.0056   0.3004
     8192  -0.0613   0.1686   0.0102   0.2222  -0.0120   0.1966   0.0108   0.2248
Panel B: $m = \lfloor n^{0.8} \rfloor$
5    2048  -0.1486   0.1869  -0.0524   0.2245  -0.0896   0.1881  -0.0381   0.2089
     4096  -0.1275   0.1519  -0.0116   0.1787  -0.0612   0.1319  -0.0023   0.1715
     8192  -0.1027   0.1206  -0.0049   0.1175  -0.0371   0.0944   0.0174   0.1309
10   2048  -0.2208   0.2570  -0.0709   0.2732  -0.1102   0.2537  -0.0693   0.2575
     4096  -0.1870   0.2138  -0.0149   0.2220  -0.0848   0.1736  -0.0260   0.1983
     8192  -0.1610   0.1806  -0.0100   0.1617  -0.0649   0.1284   0.0005   0.1525
20   2048  -0.2748   0.3104  -0.1417   0.2999  -0.1656   0.3127  -0.1423   0.2913
     4096  -0.2640   0.2919  -0.0583   0.2633  -0.1195   0.2556  -0.0728   0.2427
     8192  -0.2349   0.2554  -0.0155   0.2028  -0.0894   0.1800  -0.0252   0.1826
Note: The polynomial approximation used under the heading "LPWN($R_y$, $R_w$)" is ($R_y$, $R_w$).
Table 3: Simulation results for Model II with (�y, �y) = (0, 0) and (�x, �x) = (-0.8, 0)

nsr  n          LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
              Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: $m = \lfloor n^{0.7} \rfloor$
5    2048   0.0133   0.2036   0.0205   0.2735   0.0138   0.2645   0.0126   0.2916
     4096  -0.0012   0.1285   0.0155   0.2072   0.0018   0.1825   0.0076   0.2189
     8192   0.0066   0.0904   0.0131   0.1501   0.0024   0.1214   0.0088   0.1564
10   2048   0.0149   0.2689   0.0156   0.3092   0.0241   0.3120   0.0179   0.3313
     4096   0.0046   0.1797   0.0144   0.2445   0.0012   0.2234   0.0227   0.2597
     8192   0.0037   0.1141   0.0076   0.1837  -0.0029   0.1575   0.0047   0.1899
20   2048  -0.0001   0.3340  -0.0329   0.3351  -0.0157   0.3509  -0.0213   0.3520
     4096   0.0032   0.2510  -0.0049   0.2858  -0.0025   0.2759   0.0048   0.2998
     8192   0.0067   0.1619   0.0095   0.2223  -0.0086   0.1973   0.0062   0.2258
Panel B: $m = \lfloor n^{0.8} \rfloor$
5    2048   0.0139   0.1595  -0.0034   0.2189  -0.0082   0.1978  -0.0137   0.2357
     4096   0.0114   0.0972   0.0055   0.1509  -0.0019   0.1256   0.0070   0.1782
     8192   0.0116   0.0716   0.0074   0.1028   0.0053   0.0903   0.0086   0.1257
10   2048   0.0291   0.2328   0.0065   0.2787   0.0092   0.2746  -0.0285   0.2753
     4096   0.0210   0.1404   0.0116   0.1936   0.0092   0.1799  -0.0048   0.2012
     8192   0.0046   0.0939  -0.0093   0.1324  -0.0053   0.1152  -0.0083   0.1398
20   2048   0.0060   0.3059  -0.0564   0.3084  -0.0331   0.3239  -0.1075   0.2998
     4096   0.0168   0.2232  -0.0132   0.2476  -0.0049   0.2478  -0.0493   0.2358
     8192   0.0119   0.1428   0.0045   0.1805   0.0005   0.1688  -0.0222   0.1704
Note: The polynomial approximation used under the heading "LPWN($R_y$, $R_w$)" is ($R_y$, $R_w$).
Table 4: Simulation results for Model III with (�y, �y) = (0, 0) and (�x, �x) = (0, 0.8)

nsr  n          LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
              Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: $m = \lfloor n^{0.7} \rfloor$
5    2048  -0.0035   0.1535   0.0247   0.2290   0.0059   0.2064   0.0103   0.2503
     4096  -0.0083   0.0989   0.0095   0.1718  -0.0056   0.1410   0.0012   0.1858
     8192   0.0025   0.0719   0.0106   0.1180   0.0036   0.0990   0.0068   0.1340
10   2048  -0.0078   0.1848   0.0257   0.2653   0.0103   0.2462   0.0196   0.2828
     4096  -0.0064   0.1248   0.0114   0.1958   0.0008   0.1727   0.0097   0.2123
     8192  -0.0027   0.0867   0.0111   0.1453   0.0010   0.1200   0.0082   0.1564
20   2048   0.0006   0.2585   0.0118   0.3011   0.0196   0.3071   0.0194   0.3277
     4096  -0.0164   0.1666   0.0132   0.2412  -0.0000   0.2184   0.0123   0.2552
     8192  -0.0094   0.1141   0.0108   0.1776  -0.0034   0.1480   0.0093   0.1835
Panel B: $m = \lfloor n^{0.8} \rfloor$
5    2048  -0.0396   0.1124   0.0051   0.1665  -0.0115   0.1366  -0.0084   0.1765
     4096  -0.0266   0.0758   0.0110   0.1151  -0.0017   0.0936   0.0056   0.1435
     8192  -0.0141   0.0547   0.0101   0.0808   0.0035   0.0701   0.0081   0.1042
10   2048  -0.0670   0.1585   0.0146   0.2235  -0.0071   0.1932  -0.0064   0.2174
     4096  -0.0371   0.0967   0.0226   0.1489   0.0004   0.1118   0.0020   0.1641
     8192  -0.0246   0.0695   0.0127   0.0998   0.0016   0.0830   0.0069   0.1217
20   2048  -0.0890   0.2183  -0.0099   0.2542  -0.0136   0.2503  -0.0406   0.2533
     4096  -0.0597   0.1479   0.0233   0.1955   0.0018   0.1701  -0.0029   0.2008
     8192  -0.0434   0.0935   0.0174   0.1341  -0.0022   0.1077   0.0032   0.1431
Note: The polynomial approximation used under the heading "LPWN($R_y$, $R_w$)" is ($R_y$, $R_w$).
Table 5: Simulation results for Model III with (�y, �y) = (0, 0) and (�x, �x) = (0, -0.8)

nsr  n          LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
              Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: $m = \lfloor n^{0.7} \rfloor$
5    2048   0.0144   0.1427   0.0188   0.2184   0.0045   0.1921   0.0095   0.2345
     4096   0.0043   0.0917   0.0083   0.1621  -0.0036   0.1329   0.0034   0.1820
     8192   0.0079   0.0673   0.0085   0.1068   0.0035   0.0913   0.0074   0.1306
10   2048   0.0125   0.1713   0.0257   0.2513   0.0120   0.2291   0.0135   0.2626
     4096   0.0083   0.1159   0.0075   0.1831  -0.0031   0.1584   0.0056   0.1944
     8192   0.0065   0.0821   0.0085   0.1329   0.0010   0.1118   0.0046   0.1467
20   2048   0.0333   0.2349   0.0220   0.2936   0.0204   0.2831   0.0199   0.3066
     4096   0.0057   0.1470   0.0128   0.2208   0.0002   0.1940   0.0116   0.2380
     8192   0.0045   0.1041   0.0018   0.1508  -0.0044   0.1293  -0.0029   0.1585
Panel B: $m = \lfloor n^{0.8} \rfloor$
5    2048   0.0536   0.1144  -0.0099   0.1469  -0.0062   0.1293   0.0045   0.1891
     4096   0.0409   0.0785  -0.0031   0.0971   0.0028   0.0891   0.0067   0.1415
     8192   0.0319   0.0591   0.0029   0.0719   0.0074   0.0656   0.0074   0.0982
10   2048   0.0780   0.1555  -0.0036   0.1958   0.0005   0.1727   0.0042   0.2222
     4096   0.0602   0.1003  -0.0024   0.1208   0.0048   0.1044   0.0099   0.1582
     8192   0.0415   0.0715  -0.0044   0.0838   0.0024   0.0765   0.0064   0.1124
20   2048   0.1203   0.2265  -0.0163   0.2287   0.0076   0.2305  -0.0094   0.2436
     4096   0.0839   0.1455  -0.0021   0.1643   0.0092   0.1522   0.0088   0.1853
     8192   0.0529   0.0913  -0.0078   0.1031   0.0013   0.0952  -0.0003   0.1331
Note: The polynomial approximation used under the heading "LPWN($R_y$, $R_w$)" is ($R_y$, $R_w$).
Table 6: Simulation results for Model IV with (�y, �y) = (-0.8, 0) and (�x, �x) = (0.5, 0)

nsr  n          LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
              Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: $m = \lfloor n^{0.7} \rfloor$
5    2048  -0.0579   0.1987   0.0167   0.2710  -0.0015   0.2515   0.0077   0.2867
     4096  -0.0498   0.1344   0.0150   0.2139  -0.0117   0.1766   0.0086   0.2186
     8192  -0.0207   0.0929   0.0177   0.1535   0.0007   0.1250   0.0067   0.1928
10   2048  -0.0900   0.2488   0.0029   0.2985   0.0051   0.3061   0.0175   0.3282
     4096  -0.0635   0.1812   0.0191   0.2492  -0.0021   0.2281   0.0242   0.2647
     8192  -0.0400   0.1186   0.0144   0.1859  -0.0047   0.1605   0.0106   0.1893
20   2048  -0.1076   0.3173  -0.0469   0.3274  -0.0186   0.3515  -0.0240   0.3569
     4096  -0.1017   0.2469  -0.0080   0.2827  -0.0108   0.2814   0.0038   0.3040
     8192  -0.0600   0.1685   0.0117   0.2239  -0.0121   0.1966  -0.0090   0.2250
Panel B: $m = \lfloor n^{0.8} \rfloor$
5    2048  -0.1302   0.1759  -0.0543   0.2222  -0.0871   0.1894  -0.0439   0.2122
     4096  -0.1135   0.1414  -0.0153   0.1745  -0.0592   0.1324  -0.0006   0.1724
     8192  -0.0927   0.1125  -0.0062   0.1171  -0.0353   0.0942   0.0182   0.1316
10   2048  -0.2081   0.2491  -0.0619   0.2745  -0.1078   0.2542  -0.0682   0.2559
     4096  -0.1767   0.2052  -0.0157   0.2236  -0.0820   0.1731  -0.0259   0.1974
     8192  -0.1540   0.1747  -0.0110   0.1623  -0.0638   0.1275  -0.0001   0.1538
20   2048  -0.2699   0.3086  -0.1392   0.3017  -0.1656   0.3142  -0.1452   0.2919
     4096  -0.2587   0.2882  -0.0602   0.2637  -0.1175   0.2540  -0.0716   0.2416
     8192  -0.2297   0.2507  -0.0188   0.2041  -0.0873   0.1820  -0.0225   0.1838
Note: The polynomial approximation used under the heading "LPWN($R_y$, $R_w$)" is ($R_y$, $R_w$).
Table 7: Simulation results for Model IV with (�y, �y) = (-0.8, 0) and (�x, �x) = (-0.8, 0)

nsr  n          LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
              Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: $m = \lfloor n^{0.7} \rfloor$
5    2048   0.0175   0.2051   0.0226   0.2725   0.0156   0.2640   0.0122   0.2922
     4096   0.0019   0.1289   0.0158   0.2095   0.0006   0.1838   0.0073   0.2222
     8192   0.0086   0.0911   0.0112   0.1455   0.0033   0.1206   0.0112   0.1568
10   2048   0.0184   0.2689   0.0156   0.3090   0.0243   0.3121   0.0188   0.3292
     4096   0.0068   0.1803   0.0149   0.2459   0.0025   0.2209   0.0232   0.2617
     8192   0.0052   0.1444   0.0090   0.1866  -0.0024   0.1582   0.0068   0.1874
20   2048   0.0021   0.3350  -0.0341   0.3342  -0.0170   0.3487  -0.0242   0.3523
     4096   0.0052   0.2520  -0.0047   0.2861   0.0016   0.2797   0.0051   0.3011
     8192   0.0077   0.1622   0.0046   0.2261  -0.0093   0.1947   0.0095   0.2292
Panel B: $m = \lfloor n^{0.8} \rfloor$
5    2048   0.0329   0.1646  -0.0054   0.2180  -0.0058   0.2001  -0.0116   0.2314
     4096   0.0248   0.1004   0.0032   0.1486  -0.0006   0.1259   0.0090   0.1805
     8192   0.0205   0.0740   0.0069   0.1028   0.0063   0.0904   0.0091   0.1262
10   2048   0.0439   0.2369   0.0062   0.2805   0.0105   0.2754  -0.0237   0.2722
     4096   0.0308   0.1431   0.0098   0.1916   0.0093   0.1790  -0.0036   0.1974
     8192   0.0133   0.0946  -0.0012   0.1327  -0.0043   0.1168  -0.0063   0.1415
20   2048   0.0116   0.3081  -0.0567   0.3087  -0.0321   0.3241  -0.1091   0.2989
     4096   0.0218   0.2246  -0.0142   0.2478  -0.0053   0.2485  -0.0472   0.2377
     8192   0.0170   0.1435   0.0031   0.1710   0.0029   0.1714  -0.0221   0.1710
Note: The polynomial approximation used under the heading "LPWN($R_y$, $R_w$)" is ($R_y$, $R_w$).
Table 8: Simulation results for Model V with $(\phi_y, \theta_y) = (0, -0.8)$ and $(\phi_x, \theta_x) = (0, 0.8)$
                LWN                LPWN(1,0)        LPWN(0,1)        LPWN(1,1)
nsr  n       Bias     RMSE      Bias    RMSE     Bias    RMSE     Bias    RMSE
Panel A: $m = \lfloor n^{0.7} \rfloor$
5    2048   0.3067   0.3790    0.0179  0.2211   0.0883  0.2694   0.0270  0.2585
     4096   0.2261   0.2608    0.0037  0.1550   0.0492  0.1735   0.0133  0.1955
     8192   0.1655   0.1864    0.0130  0.1087   0.0561  0.1255   0.0117  0.1382
10   2048   0.2710   0.3675    0.0130  0.2572   0.0701  0.2928   0.0250  0.2833
     4096   0.2040   0.2525    0.0119  0.1846   0.0470  0.1894   0.0183  0.2168
     8192   0.1357   0.1681    0.0113  0.1307   0.0406  0.1319   0.0110  0.1542
20   2048   0.1837   0.3620    0.0203  0.3127   0.0539  0.3298   0.0355  0.3263
     4096   0.1635   0.2571    0.0219  0.2473   0.0408  0.2378   0.0286  0.2658
     8192   0.1057   0.1636    0.0074  0.1640   0.0222  0.1541   0.0151  0.1790
Panel B: $m = \lfloor n^{0.8} \rfloor$
5    2048  -0.2596   0.4143   -0.0444  0.1989   0.1301  0.3058   0.0208  0.2102
     4096   0.2995*  0.4583*   0.0112  0.1083   0.1513  0.2153   0.0200  0.1476
     8192   0.3611   0.4097    0.0151  0.0686   0.1221  0.1514   0.0124  0.1089
10   2048  -0.2123   0.4204   -0.0190  0.1988   0.1460  0.3243   0.0265  0.2330
     4096   0.2645*  0.4359*   0.0167  0.1044   0.1464  0.2112   0.0208  0.1529
     8192   0.3123   0.3751    0.0136  0.0784   0.1095  0.1473   0.0156  0.1140
20   2048  -0.1827   0.4107   -0.0063  0.2145   0.1267  0.3235  -0.0173  0.2569
     4096   0.1994*  0.4026*   0.0168  0.1477   0.1245  0.2348   0.0164  0.1942
     8192   0.2554   0.3329    0.0175  0.1033   0.0990  0.1659   0.0215  0.1405
Note: The polynomial approximation used under the heading "LPWN($R_y$, $R_w$)" is $(R_y, R_w)$.
Table 9: Simulation results for Model V with $(\phi_y, \theta_y) = (0, -0.8)$ and $(\phi_x, \theta_x) = (0, -0.8)$
                LWN                LPWN(1,0)        LPWN(0,1)        LPWN(1,1)
nsr  n       Bias     RMSE      Bias    RMSE     Bias    RMSE     Bias    RMSE
Panel A: $m = \lfloor n^{0.7} \rfloor$
5    2048   0.3311   0.3899    0.0103  0.2054   0.0860  0.2616   0.0248  0.2486
     4096   0.2452   0.2754    0.0074  0.1487   0.0597  0.1721   0.0108  0.1887
     8192   0.1687   0.1876    0.0086  0.1016   0.0574  0.1213   0.0059  0.1305
10   2048   0.2993   0.3815    0.0073  0.2410   0.0736  0.2789   0.0266  0.2744
     4096   0.2229   0.2628    0.0036  0.1714   0.0440  0.1835   0.0209  0.2062
     8192   0.1579   0.1822    0.0144  0.1186   0.0504  0.1278   0.0167  0.1410
20   2048   0.2224   0.3741    0.0138  0.2809   0.0684  0.3174   0.0321  0.3066
     4096   0.1989   0.2618    0.0214  0.2085   0.0532  0.2115   0.0298  0.2339
     8192   0.1388   0.1764    0.0142  0.1524   0.0380  0.1461   0.0186  0.1655
Panel B: $m = \lfloor n^{0.8} \rfloor$
5    2048  -0.2917   0.4077   -0.0846  0.2273   0.0918  0.3189   0.0174  0.2027
     4096   0.2863*  0.4677*  -0.0021  0.1165   0.1491  0.2215   0.0212  0.1431
     8192   0.3540   0.4284    0.0080  0.0645   0.1234  0.1510   0.0081  0.1032
10   2048  -0.2717   0.4115   -0.0677  0.2208   0.1099  0.3330   0.0313  0.2311
     4096   0.2509*  0.4533*  -0.0015  0.1216   0.1394  0.2185   0.0174  0.1553
     8192   0.3026   0.4056    0.0066  0.0696   0.1139  0.1471   0.0098  0.1123
20   2048  -0.2992   0.4025   -0.0528  0.2180   0.1012  0.3235   0.0004  0.2427
     4096   0.1423*  0.4261*  -0.0104  0.1397   0.1020  0.2170  -0.0166  0.1587
     8192   0.2319   0.3805    0.0014  0.0813   0.1010  0.1509   0.0136  0.1201
Note: The polynomial approximation used under the heading "LPWN($R_y$, $R_w$)" is $(R_y, R_w)$.
Table 10: Local Whittle estimation of long memory in volatility of DJIA stocks
        $m = \lfloor n^{0.6} \rfloor$                        $m = \lfloor n^{0.7} \rfloor$                        $m = \lfloor n^{0.8} \rfloor$
Ticker  LW               LPW              LWN              LW               LPW              LWN              LW               LPW              LWN
AA      0.3002 (0.0395)  0.3718 (0.0592)  0.5292 (0.0768)  0.2019 (0.0258)  0.2956 (0.0387)  0.5916 (0.0476)  0.1379 (0.0169)  0.1977 (0.0253)  0.6063 (0.0308)
AIG     0.3696 (0.0395)  0.5017 (0.0592)  0.6793 (0.0686)  0.2938 (0.0258)  0.3941 (0.0387)  0.6202 (0.0466)  0.2042 (0.0169)  0.2990 (0.0253)  0.6471 (0.0299)
AXP     0.3691 (0.0395)  0.4860 (0.0592)  0.8225 (0.0635)  0.3260 (0.0258)  0.3928 (0.0387)  0.5552 (0.0490)  0.2115 (0.0169)  0.3088 (0.0253)  0.6514 (0.0298)
BA      0.2593 (0.0395)  0.4087 (0.0592)  0.6809 (0.0685)  0.2094 (0.0258)  0.2484 (0.0387)  0.5870 (0.0478)  0.1509 (0.0169)  0.2065 (0.0253)  0.5336 (0.0327)
C       0.3853 (0.0395)  0.4789 (0.0592)  0.7273 (0.0666)  0.2908 (0.0258)  0.3855 (0.0387)  0.6677 (0.0451)  0.2141 (0.0169)  0.2992 (0.0253)  0.6309 (0.0302)
CAT     0.2477 (0.0395)  0.3364 (0.0592)  0.6247 (0.0711)  0.1915 (0.0258)  0.3053 (0.0387)  0.5522 (0.0491)  0.1280 (0.0169)  0.2022 (0.0253)  0.5788 (0.0315)
DD      0.1425 (0.0395)  0.1738 (0.0592)  0.4238 (0.0861)  0.1008 (0.0258)  0.1292 (0.0387)  0.4195 (0.0565)  0.0810 (0.0169)  0.0956 (0.0253)  0.3366 (0.0420)
DIS     0.3033 (0.0395)  0.4448 (0.0592)  0.9074 (0.0613)  0.2361 (0.0258)  0.3155 (0.0387)  0.7582 (0.0428)  0.1744 (0.0169)  0.2134 (0.0253)  0.6824 (0.0292)
GE      0.3615 (0.0395)  0.5044 (0.0592)  0.7528 (0.0657)  0.2497 (0.0258)  0.3659 (0.0387)  0.7796 (0.0423)  0.1807 (0.0169)  0.2580 (0.0253)  0.7546 (0.0281)
GM      0.2567 (0.0483)  0.3489 (0.0592)  0.4949 (0.0794)  0.1987 (0.0258)  0.2606 (0.0387)  0.4890 (0.0522)  0.1603 (0.0169)  0.2027 (0.0253)  0.4091 (0.0375)
HD      0.3703 (0.0395)  0.4581 (0.0592)  0.6782 (0.0686)  0.2489 (0.0258)  0.3670 (0.0387)  0.7321 (0.0434)  0.1723 (0.0169)  0.2432 (0.0253)  0.7401 (0.0283)
HON     0.2614 (0.0395)  0.3354 (0.0592)  0.9898 (0.0594)  0.2323 (0.0258)  0.2447 (0.0387)  0.5859 (0.0478)  0.1787 (0.0169)  0.2253 (0.0253)  0.4242 (0.0368)
HPQ     0.3503 (0.0395)  0.4688 (0.0592)  0.8591 (0.0625)  0.2366 (0.0258)  0.3049 (0.0387)  0.9061 (0.0400)  0.1845 (0.0169)  0.2290 (0.0253)  0.7583 (0.0280)
IBM     0.3417 (0.0395)  0.4778 (0.0592)  0.7626 (0.0654)  0.2653 (0.0258)  0.3295 (0.0387)  0.6922 (0.0444)  0.1931 (0.0169)  0.2638 (0.0253)  0.6359 (0.0301)
INTC    0.3467 (0.0395)  0.4755 (0.0592)  0.7436 (0.0661)  0.2396 (0.0258)  0.3325 (0.0387)  0.7685 (0.0426)  0.1807 (0.0169)  0.2532 (0.0253)  0.6894 (0.0291)
JNJ     0.3734 (0.0395)  0.4400 (0.0592)  0.6639 (0.0692)  0.2601 (0.0258)  0.3750 (0.0387)  0.6850 (0.0446)  0.1940 (0.0169)  0.2641 (0.0253)  0.6394 (0.0301)
JPM     0.3603 (0.0395)  0.5424 (0.0592)  0.7173 (0.0670)  0.3032 (0.0258)  0.3741 (0.0387)  0.6029 (0.0472)  0.2174 (0.0169)  0.2865 (0.0253)  0.6058 (0.0308)
KO      0.3677 (0.0395)  0.5028 (0.0592)  0.8104 (0.0639)  0.2584 (0.0258)  0.3653 (0.0387)  0.8065 (0.0418)  0.1833 (0.0169)  0.2506 (0.0253)  0.7923 (0.0275)
MCD     0.2640 (0.0395)  0.4591 (0.0592)  0.6632 (0.0693)  0.1798 (0.0258)  0.2513 (0.0387)  0.6936 (0.0444)  0.1170 (0.0169)  0.1701 (0.0253)  0.7116 (0.0287)
MMM     0.2635 (0.0395)  0.3792 (0.0592)  0.9891 (0.0595)  0.2016 (0.0258)  0.2744 (0.0387)  0.9899 (0.0388)  0.1430 (0.0169)  0.1944 (0.0253)  0.8712 (0.0266)
MO      0.3041 (0.0395)  0.4106 (0.0592)  0.7409 (0.0662)  0.2531 (0.0258)  0.3152 (0.0387)  0.5484 (0.0493)  0.1879 (0.0169)  0.2505 (0.0253)  0.5163 (0.0332)
MRK     0.2612 (0.0395)  0.3504 (0.0592)  0.5687 (0.0742)  0.2063 (0.0258)  0.2535 (0.0387)  0.5034 (0.0514)  0.1540 (0.0169)  0.1930 (0.0253)  0.4599 (0.0352)
MSFT    0.3421 (0.0395)  0.4756 (0.0592)  0.8192 (0.0636)  0.2908 (0.0258)  0.3507 (0.0387)  0.6156 (0.0467)  0.2023 (0.0169)  0.2883 (0.0253)  0.6223 (0.0304)
PFE     0.3354 (0.0395)  0.3740 (0.0592)  0.6473 (0.0700)  0.2407 (0.0258)  0.3168 (0.0387)  0.6237 (0.0465)  0.1644 (0.0169)  0.2407 (0.0253)  0.6324 (0.0302)
PG      0.3262 (0.0395)  0.4378 (0.0592)  0.7656 (0.0653)  0.2274 (0.0258)  0.3433 (0.0387)  0.7514 (0.0429)  0.1944 (0.0169)  0.2435 (0.0253)  0.5525 (0.0321)
SBC     0.3411 (0.0395)  0.4017 (0.0592)  0.7310 (0.0665)  0.2692 (0.0258)  0.3410 (0.0387)  0.5545 (0.0491)  0.1866 (0.0169)  0.2581 (0.0253)  0.5784 (0.0315)
UTX     0.3435 (0.0395)  0.4700 (0.0592)  0.6515 (0.0698)  0.2413 (0.0258)  0.3426 (0.0387)  0.6751 (0.0449)  0.1650 (0.0169)  0.2417 (0.0253)  0.6892 (0.0291)
VZ      0.3317 (0.0395)  0.4262 (0.0592)  0.8357 (0.0631)  0.2578 (0.0258)  0.3458 (0.0387)  0.6661 (0.0452)  0.1866 (0.0169)  0.2642 (0.0253)  0.6138 (0.0306)
WMT     0.3728 (0.0395)  0.4847 (0.0592)  0.7582 (0.0655)  0.2570 (0.0258)  0.3477 (0.0387)  0.7860 (0.0422)  0.1668 (0.0169)  0.2573 (0.0253)  0.8247 (0.0271)
XOM     0.2498 (0.0395)  0.3889 (0.0592)  0.6534 (0.0697)  0.2271 (0.0258)  0.2390 (0.0387)  0.4419 (0.0550)  0.1507 (0.0169)  0.2254 (0.0253)  0.5013 (0.0337)
Note: Asymptotic standard errors in parentheses.
Table 11: LPWN estimation of long memory in volatility of DJIA stocks
        $m = \lfloor n^{0.6} \rfloor$                        $m = \lfloor n^{0.7} \rfloor$                        $m = \lfloor n^{0.8} \rfloor$
Ticker  (1,0)            (0,1)            (1,1)            (1,0)            (0,1)            (1,1)            (1,0)            (0,1)            (1,1)
AA      0.5427 (0.1139)  0.5525 (0.1538)  0.5448 (0.2315)  0.5196 (0.0759)  0.5394 (0.1012)  0.5247 (0.1528)  0.5869 (0.0469)  0.6027 (0.0649)  0.5115 (0.1007)
AIG     0.5872 (0.1097)  0.6785 (0.1487)  0.5887 (0.2279)  0.6156 (0.0701)  0.6068 (0.0990)  0.6107 (0.1483)  0.5833 (0.0470)  0.6006 (0.0650)  0.6209 (0.0969)
AXP     0.9538 (0.0903)  0.8216 (0.1465)  0.9612 (0.2198)  0.9719 (0.0586)  0.6755 (0.0975)  0.8239 (0.1441)  0.5877 (0.0469)  0.6057 (0.0649)  0.7991 (0.0946)
BA      0.5071 (0.1177)  0.5606 (0.1534)  0.5096 (0.2351)  0.7064 (0.0661)  0.6994 (0.0971)  0.7474 (0.1448)  0.6396 (0.0451)  0.5770 (0.0654)  0.7329 (0.0951)
C       0.8962 (0.0923)  0.8305 (0.1465)  0.8982 (0.2195)  0.7761 (0.0636)  0.7036 (0.0970)  0.7396 (0.1449)  0.6577 (0.0446)  0.6307 (0.0645)  0.8016 (0.0946)
CAT     0.6275 (0.1065)  0.6235 (0.1504)  0.6073 (0.2266)  0.3601 (0.0924)  0.3838 (0.1118)  0.6125 (0.1482)  0.4861 (0.0514)  0.5134 (0.0671)  0.4698 (0.1030)
DD      0.4049 (0.1325)  0.4667 (0.1604)  0.4548 (0.2424)  0.4507 (0.0816)  0.4513 (0.1060)  0.4495 (0.1592)  0.4479 (0.0536)  0.4397 (0.0700)  0.3991 (0.1084)
DIS     0.8803 (0.0929)  0.9059 (0.1463)  0.8788 (0.2195)  0.9092 (0.0600)  0.9898 (0.0963)  0.9131 (0.1441)  0.7966 (0.0412)  0.8451 (0.0630)  0.7870 (0.0947)
GE      0.7370 (0.0995)  0.7521 (0.1472)  0.6984 (0.2223)  0.7141 (0.0658)  0.7467 (0.0965)  0.7184 (0.1453)  0.9888 (0.0381)  0.7671 (0.0632)  0.7187 (0.0952)
GM      0.3761 (0.1381)  0.4049 (0.1677)  0.3799 (0.2574)  0.4737 (0.0796)  0.4881 (0.1037)  0.4790 (0.1564)  0.4820 (0.0516)  0.4459 (0.0697)  0.5354 (0.0997)
HD      0.7040 (0.1013)  0.6777 (0.1487)  0.7078 (0.2220)  0.6591 (0.0681)  0.6800 (0.0974)  0.6633 (0.1465)  0.9888 (0.0381)  0.7769 (0.0631)  0.7015 (0.0954)
HON     0.9893 (0.0892)  0.9898 (0.1467)  0.9890 (0.2201)  0.7424 (0.0648)  0.9854 (0.0963)  0.9841 (0.1445)  0.9418 (0.0388)  0.5515 (0.0660)  0.7446 (0.0950)
HPQ     0.8452 (0.0943)  0.8583 (0.1464)  0.9214 (0.2196)  0.9882 (0.0583)  0.8807 (0.0960)  0.9890 (0.1445)  0.8530 (0.0402)  0.9114 (0.0630)  0.9206 (0.0945)
IBM     0.7089 (0.1011)  0.7619 (0.1471)  0.7099 (0.2219)  0.7851 (0.0633)  0.7875 (0.0962)  0.8044 (0.1442)  0.7361 (0.0425)  0.6765 (0.0639)  0.7777 (0.0947)
INTC    0.7274 (0.1000)  0.7428 (0.1473)  0.7266 (0.2215)  0.7571 (0.0643)  0.7736 (0.0963)  0.7606 (0.1447)  0.9879 (0.0381)  0.7136 (0.0635)  0.7686 (0.0948)
JNJ     0.8592 (0.0937)  0.8002 (0.1467)  0.8639 (0.2195)  0.6528 (0.0683)  0.6438 (0.0981)  0.7825 (0.1444)  0.6879 (0.0437)  0.6691 (0.0639)  0.6793 (0.0958)
JPM     0.4847 (0.1204)  0.5400 (0.1546)  0.4904 (0.2374)  0.7293 (0.0652)  0.6681 (0.0976)  0.7001 (0.1456)  0.6627 (0.0444)  0.6469 (0.0642)  0.6157 (0.0971)
KO      0.9505 (0.0904)  0.8090 (0.1466)  0.9498 (0.2198)  0.8450 (0.0616)  0.8158 (0.0961)  0.8256 (0.1441)  0.9896 (0.0381)  0.8388 (0.0630)  0.7714 (0.0948)
MCD     0.3414 (0.1461)  0.4715 (0.1599)  0.3472 (0.2665)  0.6579 (0.0681)  0.6890 (0.0972)  0.6678 (0.1464)  0.9896 (0.0381)  0.7150 (0.0635)  0.6721 (0.0959)
MMM     0.9874 (0.0893)  0.9893 (0.1467)  0.9879 (0.2201)  0.9880 (0.0583)  0.9008 (0.0960)  0.9894 (0.1445)  0.9845 (0.0382)  0.9869 (0.0632)  0.9647 (0.0947)
MO      0.9526 (0.0904)  0.7394 (0.1474)  0.9335 (0.2197)  0.9565 (0.0589)  0.6506 (0.0979)  0.7847 (0.1444)  0.5946 (0.0466)  0.5469 (0.0661)  0.7824 (0.0947)
MRK     0.5558 (0.1126)  0.5643 (0.1532)  0.5546 (0.2306)  0.6002 (0.0709)  0.5832 (0.0997)  0.5917 (0.1491)  0.5732 (0.0474)  0.5396 (0.0663)  0.5713 (0.0983)
MSFT    0.8010 (0.0963)  0.8184 (0.1465)  0.8003 (0.2200)  0.9278 (0.0596)  0.7643 (0.0964)  0.8381 (0.1441)  0.6497 (0.0448)  0.6093 (0.0648)  0.7937 (0.0946)
PFE     0.8072 (0.0960)  0.9896 (0.1467)  0.8145 (0.2199)  0.7372 (0.0649)  0.6778 (0.0974)  0.7131 (0.1454)  0.6184 (0.0458)  0.6149 (0.0647)  0.7040 (0.0954)
PG      0.9260 (0.0913)  0.7646 (0.1470)  0.9320 (0.2196)  0.6881 (0.0668)  0.7028 (0.0970)  0.6932 (0.1458)  0.9086 (0.0393)  0.6644 (0.0640)  0.7960 (0.0946)
SBC     0.9711 (0.0898)  0.8535 (0.1464)  0.9734 (0.2200)  0.9480 (0.0591)  0.6398 (0.0982)  0.8381 (0.1441)  0.6196 (0.0458)  0.5861 (0.0652)  0.7954 (0.0946)
UTX     0.5862 (0.1098)  0.5983 (0.1515)  0.5865 (0.2281)  0.6373 (0.0691)  0.6438 (0.0981)  0.6395 (0.1473)  0.6574 (0.0446)  0.6684 (0.0640)  0.6621 (0.0961)
VZ      0.9228 (0.0914)  0.8350 (0.1464)  0.8787 (0.2195)  0.8688 (0.0610)  0.7483 (0.0965)  0.8166 (0.1442)  0.7044 (0.0433)  0.6157 (0.0647)  0.7801 (0.0947)
WMT     0.8249 (0.0952)  0.7575 (0.1471)  0.8242 (0.2198)  0.8152 (0.0624)  0.8149 (0.0961)  0.8149 (0.1442)  0.8994 (0.0394)  0.7736 (0.0631)  0.9865 (0.0948)
XOM     0.4872 (0.1201)  0.6519 (0.1494)  0.4900 (0.2374)  0.8676 (0.0610)  0.6669 (0.0976)  0.7009 (0.1456)  0.4411 (0.0540)  0.4415 (0.0699)  0.6900 (0.0956)
Note: The heading "$(R_y, R_w)$" indicates the LPWN($R_y$, $R_w$) estimator. Asymptotic standard errors in parentheses.
Table 12: Parametric Whittle estimation of long memory in volatility of DJIA stocks
Ticker  $d$              $\phi_y$          $\theta_y$        $\sigma_\eta^2$  $\phi_x$          $\theta_x$        $\sigma_\varepsilon^2$
AA      0.5777 (0.0943)  -0.5431 (0.1864)  -                 0.1634 (0.1122)  0.0362 (0.0206)   -                 3.3266 (0.1360)
AIG     0.6377 (0.0748)  -0.7946 (0.1447)  -                 0.2771 (0.1629)  -0.8357 (0.0925)  0.9185 (0.3044)   2.8112 (0.7084)
AXP     0.5915 (0.0695)  -                 -0.7416 (0.0863)  1.9245 (0.2775)  0.2959 (0.1167)   -                 1.2799 (0.3486)
BA      0.5532 (0.1019)  -0.9275 (0.2558)  0.6585 (0.2779)   0.1152 (0.0844)  -                 0.0584 (0.0215)   3.2629 (0.1125)
C       0.6201 (0.0671)  -                 -                 0.0913 (0.0482)  -                 -                 3.1170 (0.0866)
CAT     0.4444 (0.1982)  -                 -                 0.2211 (0.4432)  0.6814 (0.1389)   -0.7236 (0.1823)  3.2299 (0.5543)
DD      0.2478 (0.0978)  -0.7859 (0.3492)  0.8867 (0.3360)   0.7682 (0.8560)  -                 -                 4.1923 (0.7411)
DIS     0.7555 (0.1331)  -                 -                 0.0125 (0.0161)  -                 0.0604 (0.0160)   3.3340 (0.0758)
GE      0.7509 (0.1177)  0.6409 (0.2936)   -0.8718 (0.0901)  0.1393 (0.1774)  -                 -                 3.1386 (0.1740)
GM      0.5040 (0.1438)  -0.9730 (0.0440)  0.9371 (0.1043)   0.1624 (0.2038)  0.6010 (0.2992)   -0.5739 (0.2931)  3.2918 (0.2419)
HD      0.6211 (0.1200)  0.4356 (0.0676)   -0.8670 (0.0453)  1.5042 (0.5819)  -                 -                 1.8557 (0.5603)
HON     0.4166 (0.0672)  -0.3490 (0.3161)  -                 0.6309 (0.4581)  -0.6202 (0.3682)  0.6475 (0.3093)   2.7333 (0.4812)
HPQ     0.9298 (0.1684)  -0.9010 (0.1339)  -                 0.0085 (0.0133)  0.6844 (0.1604)   -0.6567 (0.1633)  3.3724 (0.0762)
IBM     0.6775 (0.0978)  -0.6652 (0.1972)  -                 0.1063 (0.0743)  0.0325 (0.0196)   -                 3.1645 (0.1037)
INTC    0.7168 (0.0865)  -0.8816 (0.1020)  -                 0.0712 (0.0557)  -0.9341 (0.0319)  0.9686 (0.0836)   3.0632 (0.3243)
JNJ     0.5824 (0.1038)  0.3924 (0.0799)   -0.7923 (0.0630)  0.9809 (0.7210)  -                 -                 2.4001 (0.6923)
JPM     0.5798 (0.0624)  -                 -                 0.1461 (0.0688)  -                 -                 3.1921 (0.1005)
KO      0.8234 (0.1214)  -0.7461 (0.2712)  -                 0.0249 (0.0267)  -                 0.0423 (0.0166)   3.2834 (0.0802)
MCD     0.6211 (0.1290)  0.5379 (0.1005)   -0.8949 (0.0414)  0.8105 (0.4203)  -                 -                 2.6296 (0.4025)
MMM     0.7032 (0.1748)  -                 -                 0.0128 (0.0236)  -                 -                 3.5654 (0.0871)
MO      0.5410 (0.0760)  0.6217 (0.3652)   -                 0.0185 (0.0339)  -                 0.0349 (0.0194)   3.2044 (0.0876)
MRK     0.4903 (0.0764)  -                 -                 0.1430 (0.0873)  -                 -                 3.2380 (0.1142)
MSFT    0.5987 (0.0725)  -0.7656 (0.1451)  -                 0.2990 (0.1846)  -0.8105 (0.0887)  0.8940 (0.2491)   2.8675 (0.6008)
PFE     0.6093 (0.0850)  -                 -                 0.0777 (0.0428)  -                 -                 3.3014 (0.0872)
PG      0.5724 (0.0741)  -                 -                 0.0901 (0.0565)  -                 -                 3.1889 (0.0942)
SBC     0.5518 (0.0657)  -                 -                 0.1294 (0.0681)  -                 -                 3.2578 (0.1018)
UTX     0.6159 (0.1041)  0.5142 (0.0943)   -0.8598 (0.0475)  0.8469 (0.4559)  -                 -                 2.5056 (0.4314)
VZ      0.6778 (0.1022)  -0.5928 (0.3156)  -                 0.1039 (0.0747)  -                 0.0714 (0.0240)   3.2410 (0.1074)
WMT     0.8328 (0.1229)  -                 -                 0.0078 (0.0087)  -                 0.0363 (0.0154)   3.4027 (0.0733)
XOM     0.4962 (0.0698)  -                 -                 0.1532 (0.0839)  -                 -                 3.2217 (0.1109)
Note: Asymptotic standard errors (evaluated as the inverse of the negative Hessian) in parentheses. A lone "-" marks a parameter that is not included in the model.
Chapter 4
Long-run dependencies in return volatility and trading volume*
Frank S. Nielsen†
University of Aarhus and CREATES
March 20, 2009
Abstract
By using the log-periodogram estimator of Geweke & Porter-Hudak (1983), Bollerslev & Jubinski (1999)
show that volatility and trading volume of the S&P100 common stocks have a similar degree of fractional
integration. This evidence of pairwise correspondence between estimates of long memory across the
volatility-volume series supports a long-run view of the Mixture of Distributions Hypothesis; i.e. both
processes are driven by a slowly mean-reverting fractionally integrated latent information process.
Instead of using the log-periodogram estimator, which is biased in the context of perturbed fractional
processes (e.g. long memory stochastic volatility models), we use a semiparametric estimator that is
robust to time series perturbed by a noise term which may be serially correlated. We find evidence that
volatility in general displays a higher degree of long memory than trading volume for the S&P100 common
stocks. Additionally, volatility displays evidence of being governed by a long memory stochastic
volatility model, whereas this is not generally the case for trading volume. Furthermore, we find little
evidence of there being a cointegrating relation between volatility and trading volume.
Keywords: Fractional integration, local Whittle estimation, long memory, mixture of distributions
hypothesis, perturbed fractional process, return volatility, stochastic volatility, trading volume.
JEL Classification: C22
*I am grateful to Tim Bollerslev, Jörg Breitung, Niels Haldrup, and Morten Ørregaard Nielsen for valuable
suggestions and comments. I would also like to thank Birgitte Højklint Nielsen for proofreading the
manuscript. I gratefully acknowledge financial support from the Danish Social Sciences Research Council
(grant no. FSE275-05-0199) and the Center for Research in Econometric Analysis of Time Series (CREATES),
funded by the Danish National Research Foundation.
†CREATES, School of Economics and Management, University of Aarhus, Building 1322, DK-8000 Aarhus C,
Denmark, email: [email protected].
1 Introduction
In this paper, we are interested in characterizing the long-run joint volatility-volume relationship in
the context of the Mixture of Distributions Hypothesis (MDH), put forward by Clark (1973), Epps &
Epps (1976), and Tauchen & Pitts (1983). The MDH asserts that returns and trading volume are jointly
dependent on the same underlying latent information arrival process. As we focus on the long-run
relationship, we use semiparametric techniques, as these give a convenient way of avoiding the
difficulties of modeling the short-run dynamics. Other specifications that have been used to analyze
volatility, e.g. autoregressive conditionally heteroskedastic (ARCH) and generalized ARCH (GARCH) type
models originally introduced by Engle (1982) and Bollerslev (1986), stochastic volatility (SV) models
proposed e.g. by Taylor (1994), and the long memory SV (LMSV) model put forward by Breidt, Crato &
de Lima (1998) and Harvey (1998), give us little or no information about the economic sources of the
persistence. Structural models, which build on theoretical market microstructure models where
asymmetrically informed rational agents interact strategically, can tell us about these potential
sources by linking volatility and trading volume, thereby providing information regarding the
subordinated process driving the volatility process. But in general they do not tell us about the
long-run relationship between volatility and trading volume. These subordination model approaches
include Tauchen & Pitts (1983), who extend the subordination model of Clark (1973) within the
Grossman & Stiglitz (1980) noisy rational expectations framework. Further, Lamoureux & Lastrapes (1994)
extend the Tauchen & Pitts (1983) bivariate mixture model by assuming a dependent directing process,
and they find no evidence that volatility persistence is driven by trading volume. Andersen (1996),
building upon the Glosten & Milgrom (1985) market microstructure model, defines the modified MDH with
Poisson distributed trading volume that explicitly allows for non-informational trading and common
information arrivals. This modified MDH is empirically shown to fit better than the classical MDH, but
the results remain mixed in the sense that Liesenfeld (1998) tests the model of Andersen (1996) assuming
a dependent directing process, and he finds no evidence that volatility persistence is driven by trading
volume. More recently, Liesenfeld (2001) and Watanabe (2000, 2003) showed that the assumption that the
joint distribution of volume and volatility is a mixture of normals remains valid when incorporating
more than one type of information arrival process with different implications for the volume and
volatility persistence. Bollerslev & Jubinski (1999) contribute to the literature by looking at the
long-run relationship using semiparametric techniques, thereby not having to specify the short-run
dynamics governing the latent information process driving volatility and trading volume. The potential
long memory feature of the directing process may be a factor in the rejection of the earlier models.
Bollerslev & Jubinski (1999) find that volatility and volume have a similar degree of memory (0.41 and
0.40, respectively). Furthermore, for only 8 of the 100 firms in the S&P100 index can they reject that
volatility and volume share a common fractional integration order. This leads Bollerslev & Jubinski
(1999) to argue in favor of a long-run view of the MDH in which the aggregate latent information arrival
process displays long memory characteristics.
We contribute to the literature in several ways. As opposed to Bollerslev & Jubinski (1999), who use
the log-periodogram (LP) estimator, we adopt the local Whittle (LW) setting. Robinson (1995a) showed
that the standard LW estimator dominates the LP estimator in several ways: it is asymptotically more
efficient and it does not assume Gaussianity. Both the LP and LW estimators are semiparametric, using
the approximation
$f_z(\lambda) \sim G\lambda^{-2d}$ as $\lambda \to 0^+$, (1)
where $G$ is a constant and "$\sim$" means that the ratio of the left- and right-hand sides tends to one
in the limit. Thus, the estimators are robust against short-run dynamics since they only use information
from the periodogram ordinates in the vicinity of the origin. But as shown by e.g. Andrews &
Guggenberger (2003) and Andrews & Sun (2004), if the signal is contaminated by persistent short-run
components the estimate of $d$ will be biased. We can reduce the asymptotic bias by modeling the
spectral density of the short-run component by a polynomial instead of a constant in the vicinity of
the origin. Additionally, if the process that we try to model is a perturbed fractional process
$z_t = y_t + w_t$, (2)
where the signal $y_t$ is a long memory process which is perturbed by the additive term $w_t$ (e.g. short
memory measurement error), we potentially introduce bias into the estimate of the long memory parameter
if we do not take the perturbation into account. The validity of the LP and LW estimators is to some
extent diminished when the series of interest follows e.g. an LMSV model, as the rate of convergence can
be considerably slower than in a pure long memory setting. Although the LP and LW estimators preserve
consistency and asymptotic normality when applied to the LMSV model, several papers, e.g. Deo & Hurvich
(2001), Arteche (2004b), and Frederiksen, Nielsen & Nielsen (2008), show both theoretically and via
simulations that they are heavily biased in that case.
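As an illustration of how approximation (1) translates into an estimator, the following minimal sketch implements the standard LW estimator by minimising the concentrated objective $R(d) = \log\{m^{-1}\sum_{j=1}^{m}\lambda_j^{2d} I(\lambda_j)\} - 2d\, m^{-1}\sum_{j=1}^{m}\log\lambda_j$ over $d$. The function names, the search interval $[-0.5, 1]$, and the golden-section search are illustrative assumptions and not the implementation used for the results in this thesis.

```python
import cmath
import math

def fourier_frequencies(n, m):
    """First m Fourier frequencies lambda_j = 2*pi*j/n, j = 1, ..., m."""
    return [2.0 * math.pi * j / n for j in range(1, m + 1)]

def periodogram(x, m):
    """Periodogram of the demeaned series at the first m Fourier frequencies (direct DFT)."""
    n = len(x)
    mean = sum(x) / n
    out = []
    for lam in fourier_frequencies(n, m):
        dft = sum((x_t - mean) * cmath.exp(-1j * lam * t) for t, x_t in enumerate(x))
        out.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return out

def local_whittle_objective(d, freqs, pgram):
    """Concentrated local Whittle objective R(d); the constant G is profiled out."""
    m = len(freqs)
    g_hat = sum(lam ** (2.0 * d) * i_lam for lam, i_lam in zip(freqs, pgram)) / m
    return math.log(g_hat) - 2.0 * d * sum(math.log(lam) for lam in freqs) / m

def local_whittle(freqs, pgram, lo=-0.5, hi=1.0, tol=1e-8):
    """Minimise R(d) on [lo, hi] by golden-section search (R is convex in d)."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - ratio * (b - a)
    e = a + ratio * (b - a)
    while b - a > tol:
        if local_whittle_objective(c, freqs, pgram) < local_whittle_objective(e, freqs, pgram):
            b, e = e, c          # minimum lies in [a, e]
            c = b - ratio * (b - a)
        else:
            a, c = c, e          # minimum lies in [c, b]
            e = a + ratio * (b - a)
    return 0.5 * (a + b)

# Sanity check on an exact power-law "periodogram" G * lambda^(-2d) with d = 0.3.
n, m, d0 = 4096, 128, 0.3
freqs = fourier_frequencies(n, m)
exact = [2.0 * lam ** (-2.0 * d0) for lam in freqs]
d_hat = local_whittle(freqs, exact)
```

With data in hand, one would call `local_whittle(fourier_frequencies(n, m), periodogram(x, m))` for a bandwidth such as $m = \lfloor n^{0.7} \rfloor$; the direct DFT is kept only for transparency and is slow for long series.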
Further motivation for the perturbed process is the version of the LMSV model for financial returns
indirectly proposed by Bollerslev & Jubinski (1999),
$r_t = \sigma\sqrt{e^{y_t + x_t}}\,u_t$, (3)
where $r_t$ denotes the return, $y_t$ the long memory component of the log-volatility of the returns,
$x_t$ is a short-memory process, and $y_t$, $x_t$, and $u_t$ are independent to satisfy the requirement
that $E(r_t) = 0$. This is an extension of the LMSV model of Breidt et al. (1998) and Harvey (1998),
$r_t = \sigma\sqrt{e^{y_t}}\,u_t$. (4)
It is clear from the formulation in (3) that we allow for different short-lived news impacts while
imposing a common long memory component. Therefore, the volatility is allowed to be affected by both
long- and short-lived news impacts, which is consistent with the findings of Liesenfeld (2001). This may
provide a better characterization of the joint volatility-volume relationship in the context of the MDH.
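To make the link between (3) and the signal-plus-noise form (2) explicit, the following short derivation takes logs of squared returns. The centring constant $\mu$ and the split of $w_t$ into two parts are notation introduced here for illustration only; they assume that $E(\log u_t^2)$ exists.

```latex
\begin{align*}
\log r_t^2 &= \log\sigma^2 + y_t + x_t + \log u_t^2 \\
           &= \mu + y_t + w_t,
\qquad \mu = \log\sigma^2 + E(\log u_t^2),
\qquad w_t = x_t + \bigl(\log u_t^2 - E(\log u_t^2)\bigr).
\end{align*}
```

Hence $z_t = \log r_t^2 - \mu$ has the form (2), with $w_t$ collecting the short-memory process $x_t$ and the centred log of the squared return shock; in model (4), $x_t$ is absent and $w_t$ reduces to the second term.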
By considering the perturbed fractional process (2), i.e. the signal-plus-noise model, the LMSV models
(3) and (4) imply that the log squared returns become a long memory signal-plus-noise process as given
by (2), where $y_t$ is the long memory component (log-volatility part) and $w_t$ is the short-memory
part (additive noise term) of the original series $z_t$. If we assume that the log-volatility process
$\{y_t\}$ and the noise process $\{w_t\}$ are independent, the spectral density of $z_t$ can be written as
$f_z(\lambda) = \lambda^{-2d}\phi_y(\lambda) + \phi_w(\lambda) = \lambda^{-2d} G \left( \frac{\phi_y(\lambda)}{\phi_y(0)} + \lambda^{2d} \frac{\phi_w(\lambda)}{\phi_y(0)} \right)$, (5)
where $f_y(\lambda) = \lambda^{-2d}\phi_y(\lambda)$ is the spectrum of the signal, $G = \phi_y(0)$,
$\phi_w(\lambda)$ is the spectrum of the noise term $w_t$, and $d$ is the degree of long memory in $y_t$
(and $z_t$).1
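The consequence of (5) for estimation can be illustrated numerically. The sketch below evaluates the concentrated Whittle objective on an exact spectrum of the form $\lambda^{-2d}(1 + \theta\lambda^{2d})$, i.e. (5) with both $\phi_y$ and $\phi_w$ taken to be constant and $\theta = \phi_w(0)/\phi_y(0)$ playing the role of a noise-to-signal ratio. Fitting the pure power law (1) then understates $d$, whereas adding the noise term recovers it. The grid search, the parameter values, and all function names are illustrative assumptions; this is a stylised constant-spectrum case, not the LPWN implementation of Frederiksen et al. (2008).

```python
import math

def whittle_objective(shape, pgram):
    """Concentrated Whittle objective for a spectral shape g_j (scale G profiled out)."""
    m = len(pgram)
    g_hat = sum(i_j / g_j for i_j, g_j in zip(pgram, shape)) / m
    return math.log(g_hat) + sum(math.log(g_j) for g_j in shape) / m

def shape_lw(d, freqs):
    """Pure power-law shape lambda^(-2d), as in approximation (1)."""
    return [lam ** (-2.0 * d) for lam in freqs]

def shape_lwn(d, theta, freqs):
    """Power law plus constant noise spectrum: lambda^(-2d) * (1 + theta * lambda^(2d))."""
    return [lam ** (-2.0 * d) * (1.0 + theta * lam ** (2.0 * d)) for lam in freqs]

# Exact perturbed spectrum with d = 0.4 and theta = 5 (hypothetical values).
n, m, d0, theta0 = 4096, 150, 0.4, 5.0
freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
pgram = shape_lwn(d0, theta0, freqs)

d_grid = [i / 50 for i in range(1, 50)]        # 0.02, 0.04, ..., 0.98
theta_grid = [k / 2 for k in range(0, 21)]     # 0.0, 0.5, ..., 10.0

# Fit ignoring the perturbation: the estimate of d is pulled below d0.
d_lw = min(d_grid, key=lambda d: whittle_objective(shape_lw(d, freqs), pgram))

# Fit with the noise term: the objective is minimised at the true pair.
d_lwn, theta_lwn = min(
    ((d, th) for d in d_grid for th in theta_grid),
    key=lambda p: whittle_objective(shape_lwn(p[0], p[1], freqs), pgram),
)
```

The downward pull on `d_lw` arises because the noise term flattens the log-log slope of the spectrum away from the origin, which is the mechanism behind the bias discussed above.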
By using a semiparametric estimator that allows both the spectrum of the short-memory component of the
signal and the spectrum of the perturbation, i.e. $\phi_y(\lambda)$ and $\phi_w(\lambda)$, to be
approximated by polynomials of finite and even orders near the zero frequency, instead of constants, we
obtain a bias reduction depending on the smoothness of $\phi_y(\lambda)$ and $\phi_w(\lambda)$ near the
origin. The estimator employed is the local polynomial Whittle with noise (LPWN) estimator derived by
Frederiksen et al. (2008), which explicitly models both of these spectra and thereby mitigates this
bias. In this paper, we measure return volatility based on daily returns. This measure is clearly
noisier than realized volatility based on intraday returns, see e.g. Fleming & Kirby (2006). However,
semiparametric methods such as the LPWN estimator mitigate this noise, see eqns. (2) and (3), and
thereby deal with it appropriately.
In addition to estimating the univariate long-run persistence of volatility and trading volume, we also
employ tests for long memory vs. spurious long memory, where we use the notion of sample splitting (and
temporal aggregation) and test whether there is parameter constancy. Regarding sample splitting, we
extend the results of Shimotsu (2006) to include robustness to perturbed fractional processes.
Exploiting the long memory parameter's invariance to temporal aggregation, Frederiksen & Nielsen (2008)
derive the joint distributional properties of different semiparametric estimators for fractionally
integrated processes, potentially perturbed and non-stationary, applied to different aggregation levels
of the original series, and introduce a formal test of true long memory.
We further model return volatility and trading volume in the context of fractional cointegration. The
setup used is the semiparametric cointegration rank determination procedure of Robinson & Yajima (2002)
and Nielsen & Shimotsu (2007). Nielsen & Shimotsu (2007) extend the work of Robinson & Yajima (2002) to
accommodate both (asymptotically) stationary and nonstationary fractionally integrated processes and
cointegrating errors by applying the exact local Whittle (ELW) setting of Shimotsu & Phillips (2005). In
the analysis of the joint volume-volatility relationship, we examine the rank of the spectral density
matrix around the origin, where the fractional integration order, $d$, is estimated by the LW estimator.
Using the LW setup, we face the same problem as with other semiparametric estimators that do not model
the potential perturbation by noise. That is, we are subject to size distortions in the rank test as we
do not take the potential bias of the LW estimator into account.
1 By using the independence assumption between $y_t$ and $w_t$, we exclude the so-called leverage effect.
The independence assumption is common in the literature concerning LMSV, see Breidt et al. (1998), Deo &
Hurvich (2001) and Arteche (2004b) among others. This independence assumption could be relaxed; see
Frederiksen et al. (2008) for a discussion of the possibilities in that direction.
Our results show that, using semiparametric estimators that are robust to potential contamination by an
additive noise term, there is evidence that volatility and trading volume are more persistent in terms
of memory than other studies have shown. Additionally, volatility displays evidence of being governed by
a perturbed fractional process, more specifically an LMSV model, whereas this is not generally the case
for trading volume. In the cases where we cannot reject that the memory parameters are equal, we
furthermore find weak evidence of there being a cointegrating relation between volatility and trading
volume. This is in line with other studies showing that although volatility and volume might share a
common fractional integration order, they do not move together over time.
The remainder of the paper is organized as follows. Section 2 discusses volatility and trading volume in
the context of the MDH and how long memory naturally relates to this subject. Section 3 introduces the
data used in the analysis. In section 4, we set up the methodology used in the analysis of volatility
and trading volume, and section 5 presents the empirical results. Section 6 concludes.
2 Volatility, trading volume, and the Mixture of Distributions Hypothesis
In this section, we give a short overview of the Mixture of Distributions Hypothesis (MDH) literature.
Tauchen & Pitts (1983) analyzed a bivariate mixture model (the standard mixture model) where the
movement from one equilibrium to another is motivated by the arrival of new information to the market.
At the $i$th equilibrium the $j$th trader's desired position in the security is
$q_{ij}^* = \alpha(P_{ij}^* - P_i)$, where $P_{ij}^*$ is the trader's reservation price, $P_i$ is the
current market price, and $\alpha$ is a positive constant. In equilibrium the market clears and the
market price is $(1/J)\sum_{j=1}^{J} P_{ij}^*$. Hence, the market price only changes when new
information arrives, and the return is given by
$R_i \equiv dP_i = (1/J)\sum_{j=1}^{J} dP_{ij}^*$, (6)
where it is assumed that the changes in traders' reservation prices follow
$dP_{ij}^* = \phi_i + \psi_{ij}$, $\phi_i \sim$ i.i.d. $N(0, \sigma_\phi^2)$ and $\psi_{ij} \sim$ i.i.d. $N(0, \sigma_\psi^2)$. (7)
Here $\phi_i$ and $\psi_{ij}$ represent a common (market-wide) component and an idiosyncratic component
specific to the $j$th trader, respectively. The parameters $\sigma_\phi^2$ and $\sigma_\psi^2$ measure
the sensitivity of traders' reservation prices with respect to new information.
Assuming that the number of traders $J$ (assumed to be large) and the variances $\sigma_\phi^2$ and
$\sigma_\psi^2$ are constant over time, the daily return, $R_t$, and volume, $V_t$, are jointly normal
and mutually and serially independent conditional on the number of information arrivals,
$R_t \mid I_t \sim N(0, \sigma_{ri}^2 I_t)$, (8)
$V_t \mid I_t \sim N(\mu_0 + \mu_{vi} I_t, \sigma_{vi}^2 I_t)$, (9)
where $I_t$ is the daily number of information arrivals, $R_t$ and $V_t$ are given by the sums of the
intradaily returns, $R_i$, and trading volumes, $V_i$, respectively, and $\mu_0$ captures the part of
trading volume generated by liquidity traders (a noise component that is not influenced by the
information flow). As seen from (8) and (9), the dynamics of the volatility of returns and the dynamics
of trading volume depend on the time-series behavior of $I_t$. Hence, if we allow for strong serial
correlation in $I_t$, the model predicts persistence and clustering in return volatility, which is
empirically well founded, see e.g. Engle (1982), Engle & Bollerslev (1986), Bollerslev (1986). Since
unanticipated news often tends to be followed by several explanations related to the initial news, it is
sensible to assume some degree of persistence in the information arrival process, and if the news impact
even lasts for a random number of subsequent days, it follows from Parke (1996) (see Bollerslev &
Jubinski (1999)) that the resulting latent aggregated information arrival process will be fractionally
integrated.
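The mechanism in (8) and (9) can be illustrated with a small simulation. In the sketch below the latent process is generated as $I_t = \exp(h_t)$ with $h_t$ a Gaussian AR(1); this choice, the parameter values, and the function names are assumptions made purely for illustration (in particular, an AR(1) is short-memory and is not the fractionally integrated process discussed above). The point is only that a persistent $I_t$ induces both autocorrelation in squared returns and positive comovement between squared returns and volume.

```python
import math
import random

def simulate_mdh(n, rho=0.95, sd_h=0.2, sigma_r=1.0, mu_0=1.0, mu_v=1.0, sigma_v=0.5, seed=0):
    """Draw (R_t, V_t) from (8)-(9) given I_t = exp(h_t), h_t a Gaussian AR(1) (assumed)."""
    rng = random.Random(seed)
    h = 0.0
    returns, volumes = [], []
    for _ in range(n):
        h = rho * h + rng.gauss(0.0, sd_h)
        i_t = math.exp(h)
        returns.append(rng.gauss(0.0, sigma_r * math.sqrt(i_t)))                  # eq. (8)
        volumes.append(rng.gauss(mu_0 + mu_v * i_t, sigma_v * math.sqrt(i_t)))    # eq. (9)
    return returns, volumes

def correlation(x, y):
    """Sample correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def autocorrelation(x, lag):
    """Sample correlation between x_t and x_{t-lag}."""
    return correlation(x[lag:], x[:-lag])

returns, volumes = simulate_mdh(20000)
squared = [r * r for r in returns]
vol_volume_corr = correlation(squared, volumes)
vol_persistence = autocorrelation(squared, 1)
```

Setting `rho=0.0` in this sketch leaves the contemporaneous volatility-volume correlation in place but removes the persistence in squared returns, which is the sense in which the dynamics of both series are inherited from $I_t$.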
Several extensions to the standard MDH have been proposed. In contrast to the Tauchen & Pitts (1983)
model, where all traders act non-strategically, Andersen (1996) showed that by modeling the strategic
interaction of informed investors and liquidity traders with a risk-neutral market maker, i.e. the
setting of Glosten & Milgrom (1985), the daily trading volume conditional on some latent distribution
process $I_t$ is approximated by the Poisson distribution, i.e.
$V_t \mid I_t \sim \mathrm{Po}(m_0 + m_1 I_t)$, (10)
where $m_0$ captures the part of trading volume generated by liquidity traders (noise component),
whereas $m_1$ is the component related to the information flow governed by the latent distribution
process $I_t$. By deriving the trading volume conditional on the information flow, the nonnegativity
constraint is always ensured, and by allowing for a nonzero constant $m_0$ the parameterization is more
flexible, so that a nonproportional relationship is allowed for. Andersen (1996) showed that his
modified mixture model is superior to the standard mixture model of Tauchen & Pitts (1983), but the
bivariate system implied a significantly lower volatility persistence than empirically observed. This
led to the conclusion that more than one type of information arrival process, with different
implications for the volume and return volatility persistence, should be incorporated.
This problem is to some extent alleviated in the generalized mixture model of Liesenfeld (2001),
who allowed for time dependence in the variances $\sigma_\phi^2$ and $\sigma_\psi^2$ in the Tauchen & Pitts (1983)
model, see (7), which are directed by a common latent random process, $\omega_t$. Furthermore, as $\omega_t$ is
very persistent, the generalized mixture model alleviates the shortcomings of the modified model by
Andersen (1996). However, as the volume specification is very close to that of the standard model, it
is not surprising that the generalized mixture model does not remove the autocorrelation in the (squared)
standardized volume residuals found in the standard MDH.
A similar modification is proposed by Watanabe (2000, 2003), where the volume dynamics are as in
the standard MDH, setting $\mu_0 = 0$, and the return dynamics are influenced by two latent variables, $I_t$ and $J_t$,
$$R_t \mid I_t, J_t \sim N\left(0, \sigma_{ri}^2 I_t J_t\right), \tag{11}$$
where $\log(I_t)$ and $\log(J_t)$ both follow a zero-mean Gaussian AR(1) process. Similar to Liesenfeld
(2001), Watanabe (2000, 2003) finds that the latent variable that only influences the volatility is
more persistent than the common latent factor but less volatile. However, the model suffers from the same
drawbacks as the generalized mixture model of Liesenfeld (2001) regarding the volume specification.$^{2}$
Overall, and as noted by Bollerslev & Jubinski (1999), by forcing the same short-run dependence for
the two series, the long-run persistence may be underestimated. As we use semiparametric techniques
and therefore do not model the short-run dynamics, we can disregard this issue. In
this paper, we are particularly interested in the persistence of, and the long-run relationship between,
volatility and trading volume. This is motivated by the arguments given in Andersen (1996), Andersen
& Bollerslev (1997a, 1997b), Bollerslev & Jubinski (1999), among others, i.e. a long-run proposition
for the MDH where we have the following long-run link for large $\tau$,
$$\mathrm{corr}\left(|R_t|, |R_{t-\tau}|\right) \sim \tau^{2d-1}, \tag{12}$$
$$\mathrm{corr}\left(V_t, V_{t-\tau}\right) \sim \tau^{2d-1},$$
$$\mathrm{corr}\left(|R_t|, |R_{t-\tau}|\right) \sim \mathrm{corr}\left(V_t, V_{t-\tau}\right) \sim \tau^{2d-1},$$
given some distributional assumptions.$^{3}$
3 Data
We consider common stock return volatility and trading volume of the S&P 100 shares included
in the index on December 1, 2006. Share prices, trading volume, number of outstanding shares, and
adjustment coefficients to adjust share prices for dividends and splits were downloaded from the Center
for Research in Security Prices (CRSP) database. The sample period runs from July 2, 1962 through
December 31, 2006. Following Bollerslev & Jubinski (1999), among others, we remove the three weeks
following the October 1987 crash.$^{4}$ We consider only the 45 shares for which the full sample of daily
observations is available, i.e. $n = 11{,}187$.
Returns (for the $j$th common stock) are measured as daily continuously compounded rates, $R_{jt} =
\log(P_{jt}/P_{j,t-1})$, and were corrected for the effects of stock splits and dividends using the correction
factor in the CRSP database. To avoid the problem of taking the logarithm of zero, we base the analysis
on adjusted log-squared/absolute returns using the method of Fuller (1996), i.e. we analyze
$$\tilde r_{jt}^2 = \log \tilde R_{jt}^2 = \log\left(R_{jt}^2 + \eta\right) - \frac{\eta}{R_{jt}^2 + \eta},$$
$^{2}$ In ongoing work, Frederiksen & Nielsen (2009) use maximum likelihood (using Efficient Importance Sampling) to
estimate a three-factor bivariate mixture model where the common information arrival process is assumed to possess long
memory according to the Gaussian ARFIMA$(1, d, 0)$ process and the distinct dynamics of the volume and volatility
follow Gaussian AR(1) processes. It is necessary to apply at least two-factor bivariate models like Liesenfeld (2001) and
Watanabe (2000, 2003), as one-factor models do not alleviate the problem that the persistence parameter of the (common)
information arrival process is too small compared to univariate empirical findings, see Andersen (1996), Liesenfeld (2001)
and Watanabe (2000, 2003).
$^{3}$ For more on the definition of fractional processes and the assumptions needed in linking this to the hyperbolic decay
of the autocorrelations, $\tau^{2d-1}$, see e.g. Beran (1994) and Robinson (2003).
$^{4}$ We also carried out the analysis on the full sample and on further reduced samples, e.g. removing all observations
between December 23 and January 2, as done in Andersen (1996). The results were qualitatively similar.
where $\eta = \frac{0.02}{n} \sum_{t=1}^{n} R_{jt}^2$, and
$$|\tilde r_{jt}| = \log|\tilde R_{jt}| = \log\left(|R_{jt}| + \kappa\right) - \frac{\kappa}{|R_{jt}| + \kappa},$$
where $\kappa = \frac{0.02}{n} \sum_{t=1}^{n} |R_{jt}|$. Furthermore, following Bollerslev & Jubinski (1999), the daily trading volume for
share $j$, $V_{jt}$, is measured by the turnover ratio$^{5}$
$$V_{jt} = \frac{S_{jt}}{N_{jt}},$$
where $S_{jt}$ and $N_{jt}$ are the common stock volume and total number of outstanding shares for common
stock $j$ at time $t$, respectively. Instead of working with $V_{jt}$, we consider $v_{jt} = \log(V_{jt})$. We need to deal
with the deterministic trends in the volume measure before moving on to the analysis. Firstly, trading
volume is an increasing function of time, at least until mid-1997, after which it stabilizes
somewhat. Secondly, there is significantly lower trading between Christmas and New Year's Eve, see
Andersen (1996).
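The transformations described above can be sketched as follows. This is a minimal illustration of the Fuller (1996) adjustment and the log turnover ratio as described in the text; the function names and the `scale` argument are hypothetical, and the offset is taken to be 0.02 times the sample mean of the squared (or absolute) returns.

```python
import math


def fuller_log_squared(returns, scale=0.02):
    """Adjusted log-squared returns: log(R^2 + c) - c / (R^2 + c), c = scale * mean(R^2)."""
    c = scale * sum(r * r for r in returns) / len(returns)
    return [math.log(r * r + c) - c / (r * r + c) for r in returns]


def fuller_log_absolute(returns, scale=0.02):
    """Adjusted log-absolute returns: log(|R| + c) - c / (|R| + c), c = scale * mean(|R|)."""
    c = scale * sum(abs(r) for r in returns) / len(returns)
    return [math.log(abs(r) + c) - c / (abs(r) + c) for r in returns]


def log_turnover(volume, shares_outstanding):
    """Log turnover ratio v_t = log(S_t / N_t)."""
    return [math.log(s / n_out) for s, n_out in zip(volume, shares_outstanding)]


# A zero return no longer causes log(0): the offset keeps the argument positive.
example = fuller_log_squared([0.0, 0.01, -0.02])
```

The offset only matters for returns close to zero; for large returns the adjusted series is practically identical to the plain log-squared (or log-absolute) return.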
By log-transforming the raw series, its trend becomes approximately linear, and instead of employing
a linear detrending procedure as in Bollerslev & Jubinski (1999), we regressed the log-transformed
turnover ratio, $v_{jt}$, on a constant, a time trend, a squared time trend, and a dummy equal to 1 if
$t \in [\text{Christmas}, \text{January 1}]$. Sensitivity analysis using other detrending methods, i.e. linear detrending
(excluding the dummy variable) or including a polynomial trend up to cubic order, did not change the
conclusions in a qualitative sense.
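The detrending regression can be sketched as an ordinary least squares fit whose residuals form the detrended series. This is a hedged illustration only: the helper names `solve` and `detrend` are hypothetical, the holiday indicator is assumed to be supplied by the user, and time is rescaled to the unit interval purely for numerical stability (the residuals are unaffected by this rescaling).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def detrend(v, holiday):
    """Residuals from regressing v_t on a constant, t, t^2 and a holiday dummy."""
    n = len(v)
    rows = [[1.0, t / n, (t / n) ** 2, float(holiday[t])] for t in range(n)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, v)) for i in range(k)]
    beta = solve(xtx, xty)  # OLS coefficients from the normal equations
    return [y - sum(b * x for b, x in zip(beta, r)) for r, y in zip(rows, v)]
```

Dropping the dummy column or adding a cubic term gives the alternative detrending schemes mentioned in the sensitivity analysis.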
Summary statistics of return data and the detrended log turnover ratio for the Minnesota Mining
and Manufacturing Co. (3M), International Business Machines Corporation (IBM), and Aluminum
Company of America (AA) are given in Tables 1 and 2, respectively. For the return data, the
mean is not statistically different from zero, the sample standard deviation is considerably larger than
the mean for all three shares, and they all exhibit excess kurtosis and non-normality. Looking at the
respective subperiods for the returns, the mean is again not significantly different from zero, and the
sample standard deviation is considerably larger than the mean and quite stable over subperiods for
all three shares. There is considerable kurtosis, especially in subperiod 5 (October 1984 through
April 1990) for the shares 3M and AA. Furthermore, there is a high degree of own serial correlation for
the shares 3M and AA, whereas it is less pronounced for the IBM share. Turning next to the detrended
log turnover ratio and comparing it to the return series, the excess kurtosis is less pronounced, even
in subperiod 5. What is apparent is the considerable amount of own serial correlation, which is much
higher than for the return series.
Insert Tables 1-2 about here.
Looking at Figure 1, where we have plotted the autocorrelation functions of the return, adjusted
log-squared return, adjusted log-absolute return, and detrended log turnover ratio, there is clearly more persistence in
the detrended log turnover ratio than in the adjusted log-squared and adjusted log-absolute returns.
This is consistent with the descriptive statistics in Tables 1 and 2.
Insert Figure 1 about here.
$^{5}$ There are several other measures of trading activity which might be of interest, for example the number of trades per
period, share volume, value of shares traded, relative dollar volume, and dollar turnover.
If we were to plot the cross-correlation between return volatility and our measure of trading
volume, we would see a significant positive relationship, which is consistent with Karpoff (1987),
Andersen (1996), among others.
4 Methodology
This section contains the methodology used in our empirical investigation. We first briefly describe
local Whittle estimation of perturbed processes and the estimators proposed by Frederiksen et al. (2008),
which are especially suited for this setting. Secondly, we discuss the subject of long memory versus
spurious long memory, where we focus on a Wald test of parameter constancy derived by Shimotsu
(2006) in a sample splitting context and by Ohanissian, Russell & Tsay (2008) and Frederiksen &
Nielsen (2008) in a temporal aggregation context. Finally, we describe the semiparametric fractional
cointegration methodology derived in Robinson & Yajima (2002) and Nielsen & Shimotsu (2007),
which is used in analyzing the long-run relation between volatility and trading volume.
4.1 Local Whittle estimation of perturbed fractional processes
For perturbed fractional processes, we have the spectral representation (5) rather than (1). There
are two main consequences: first, the extra additive term in (5) needs to be taken into account to
avoid serious asymptotic bias, as mentioned in the introduction, and second, the rate of convergence
of the estimators is reduced if the extra term is not modeled. The latter follows because the choice of
bandwidth parameter is severely constrained for perturbed fractional processes when the perturbation
term in (5) is not modeled. Thus, for non-perturbed processes the bandwidth requirement is typically
$m = o(n^{4/5})$, whereas for perturbed processes it is $m = o(n^{2d/(1+2d)})$ (apart from logarithmic terms).
Since $d < 1$ and the estimator is $\sqrt{m}$-consistent, this is a serious constraint because it further
restricts the number of periodogram ordinates that can be used while still yielding consistent estimates
of the long memory parameter.
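Frederiksen et al. (2008) propose to approximate (5) locally near the zero frequency by
$$g(\lambda) = \lambda^{-2d} G \left(1 + h_y(\theta_y, \lambda) + \lambda^{2d} h_w(\theta_w, \lambda)\right), \tag{13}$$
where $h_y(\theta_y, \lambda) = \sum_{r=1}^{R_y} \theta_{y,r} \lambda^{2r}$ and $h_w(\theta_w, \lambda) = \sum_{r=0}^{R_w} \theta_{w,r} \lambda^{2r}$.
We note that $\phi_y(\lambda)$ and $\phi_w(\lambda)$ are symmetric around $\lambda = 0$ and are therefore approximated
by even polynomials. If $R_y = 0$, set $h_y(\theta_y, \lambda) = 0$.
Defining also the polynomial $h(d, \theta, \lambda) = h_y(\theta_y, \lambda) + \lambda^{2d} h_w(\theta_w, \lambda)$ with $\theta = (\theta_y', \theta_w')'$, this yields the
(concentrated) likelihood
$$Q(d, \theta) = \log \hat G(d, \theta) + \frac{1}{m} \sum_{j=1}^{m} \log\left(\lambda_j^{-2d} \left(1 + h(d, \theta, \lambda_j)\right)\right), \tag{14}$$
$$\hat G(d, \theta) = \frac{1}{m} \sum_{j=1}^{m} \frac{\lambda_j^{2d} I_z(\lambda_j)}{1 + h(d, \theta, \lambda_j)}. \tag{15}$$
Thus, minimize (14) over the admissible set $D \times \Theta$,
$$(\hat d, \hat\theta) = \arg\min_{(d, \theta) \in D \times \Theta} Q(d, \theta), \tag{16}$$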
where $\Theta$ is a compact and convex set in $\mathbb{R}^{R+1}$, $R = R_y + R_w$, and $D = [d_1, d_2]$ with $0 < d_1 < d_2 < 1$.
This estimator is called the local polynomial Whittle with noise (LPWN) estimator, and it is shown
in Frederiksen et al. (2008) that under some regularity conditions the estimator is consistent for
$d \in (0, 1)$ and asymptotically normal for $d \in (0, 3/4)$.
A few remarks are in order. Note that if $h(d, \theta, \lambda) = 0$, we have the standard local Whittle specification
of Robinson (1995a), which does not explicitly account for the perturbation. For $(R_y, R_w) = (0, 0)$
the polynomials reduce to the single constant $\theta_{w,0}$, so that $h(d, \theta, \lambda) = \lambda^{2d} \theta_{w,0}$ and
$\phi_y(\lambda)$ and $\phi_w(\lambda)$ in (5) are both modeled locally by constants. This is the
local Whittle with noise (LWN) estimator of Hurvich & Ray (2003) and Hurvich, Moulines & Soulier
(2005). Thus, the parameterization in (14) includes the standard LW estimator and the LWN estimator
as special cases. Furthermore, the use of the polynomials $h_y(\theta_y, \lambda)$ and $h_w(\theta_w, \lambda)$ increases the
asymptotic variance of $\hat d$ by a multiplicative constant compared to the LW estimator, see Frederiksen
et al. (2008) for more details.
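As a concrete illustration of the objective in (14) and (15), the sketch below evaluates it for the simplest case $(R_y, R_w) = (0, 0)$, so that a zero noise coefficient gives the LW objective and a positive one gives the LWN objective. It is not the authors' implementation (the text reports Matlab with BFGS); the function names, the plain grid search over $d$, and the pure-Python periodogram are assumptions made to keep the example self-contained.

```python
import cmath
import math
import random


def periodogram(x, m):
    """Periodogram I(lambda_j) at the Fourier frequencies lambda_j = 2*pi*j/n, j = 1..m."""
    n = len(x)
    lam = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    per = []
    for l in lam:
        w = sum(xt * cmath.exp(1j * l * t) for t, xt in enumerate(x))
        per.append(abs(w) ** 2 / (2.0 * math.pi * n))
    return lam, per


def q_objective(d, theta_w0, lam, per):
    """Concentrated objective (14)-(15) with (R_y, R_w) = (0, 0).

    theta_w0 = 0 gives the standard LW objective; theta_w0 > 0 gives LWN.
    """
    m = len(lam)
    h = [theta_w0 * l ** (2.0 * d) for l in lam]
    g_hat = sum(l ** (2.0 * d) * i / (1.0 + hj) for l, i, hj in zip(lam, per, h)) / m
    penalty = sum(math.log(l ** (-2.0 * d) * (1.0 + hj)) for l, hj in zip(lam, h)) / m
    return math.log(g_hat) + penalty


def lw_estimate(lam, per, grid=None):
    """Grid-search LW estimate of d over D = [0.01, 0.99] (an illustrative choice of optimizer)."""
    grid = grid or [0.01 + 0.01 * k for k in range(99)]
    return min(grid, key=lambda d: q_objective(d, 0.0, lam, per))
```

For the LWN and LPWN estimators the same objective would be minimized jointly over $d$ and the non-negative polynomial coefficients, which requires a numerical optimizer rather than a one-dimensional grid.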
4.2 Testing for long memory versus structural breaks
Recent literature has dealt with the presence of structural breaks and regime switching when analyzing
long memory in data. Granger & Terasvirta (1999) show that the number of regime switches affects
the long memory parameter. Diebold & Inoue (2001), Granger & Hyung (2004) and Haldrup &
Nielsen (2007) discuss that if series display breaks, particularly in their deterministic components,
these processes will give the impression of persistence. That is, we can mistakenly conclude that a
process displays long memory when the apparent persistence is in fact due to a structural break in the series. In particular,
the empirical literature on long memory in volatility is vast, and it can roughly be divided into two
opposing groups: one in favor of the long memory hypothesis and one against it. To the first group
belong the papers of Ding, Granger & Engle (1993), Ding & Granger (1996), Bollerslev & Mikkelsen
(1996), Andersen & Bollerslev (1997a, 1997b), Breidt et al. (1998), Lobato & Savin (1998), among
others. To the second group belong the papers by Mikosch & Starica (2000) and Granger & Hyung (2004),
among others.
To investigate this further in our context, we apply the method of sample splitting (and temporal
aggregation) and test whether or not we have parameter constancy.
4.2.1 Sample splitting
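Following Shimotsu (2006), we are interested in testing the hypothesis of true $I(d)$ versus spurious
$I(d)$, i.e. $H_0 : d_0 = d_0^{(1)} = \dots = d_0^{(K)}$, where $d_0^{(i)}$, $i = 1, \dots, K$, is the true long memory parameter of
the $i$th block and where we have split the sample into $K \in \mathbb{N}_+$ blocks, so that each block consists
of $n/K \in \mathbb{N}_+$ observations. Define the $(K + 1)$-vector $\hat{\mathbf{d}} = (\hat d, \hat d^{(1)}, \dots, \hat d^{(K)})'$ of long memory
estimates, where $\hat d$ is the full-sample estimate and $\hat d^{(i)}$ is the estimate from the $i$th block, and the $K \times (K + 1)$ matrix
$$A = \begin{pmatrix} 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & \cdots & -1 \end{pmatrix}.$$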
Then we can set up the Wald statistic for testing the null that the original series is a long memory process,
$$W_S = \left(A \hat{\mathbf{d}}\right)' \left(A \Omega A'\right)^{-1} \left(A \hat{\mathbf{d}}\right) \xrightarrow{d} \chi^2_{K-1}, \tag{17}$$
where $\mathrm{var}(\hat{\mathbf{d}}) = \Omega$ is the finite sample covariance matrix of the estimates. This test statistic accommodates
several semiparametric estimators, as we only need the estimates from the individual $K$ samples and the
limiting distributional result for the semiparametric estimator of interest. Shimotsu (2006) uses a test
statistic that covers non-stationary values of $d$, i.e. one based on the 2-step feasible exact local Whittle (ELW)
estimator.$^{6}$ We, on the other hand, want to focus on estimators that are specifically applicable to
potentially perturbed fractional processes, and therefore we use the LPWN framework.
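The mechanics of the Wald statistic in (17) can be sketched as follows. The covariance matrix is taken as given by the user, and the example assumes that the quadratic-form matrix is nonsingular so that an ordinary inverse exists; the names `solve` and `wald_sample_split` are hypothetical and the numbers in the usage line are made up for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def wald_sample_split(d_full, d_blocks, omega):
    """W = (A d)' (A Omega A')^{-1} (A d), with A = [iota, -I_K] as in the text.

    omega is the (K+1) x (K+1) covariance matrix of (d_full, d_blocks...),
    assumed here to make A Omega A' nonsingular.
    """
    k = len(d_blocks)
    d = [d_full] + list(d_blocks)
    a = [[1.0] + [-1.0 if j == i else 0.0 for j in range(k)] for i in range(k)]
    ad = [sum(a[i][j] * d[j] for j in range(k + 1)) for i in range(k)]
    a_om = [[sum(a[i][l] * omega[l][j] for l in range(k + 1)) for j in range(k + 1)]
            for i in range(k)]
    v = [[sum(a_om[i][l] * a[j][l] for l in range(k + 1)) for j in range(k)]
         for i in range(k)]
    x = solve(v, ad)  # x = (A Omega A')^{-1} (A d)
    return sum(p * q for p, q in zip(ad, x))


# Hypothetical inputs: full-sample estimate 0.5, two block estimates, diagonal covariance.
w = wald_sample_split(0.5, [0.4, 0.7], [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]])
```

Each element of the contrast vector is the difference between the full-sample estimate and one block estimate, so the statistic is zero when all estimates coincide and grows as the block estimates drift apart.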
We further analyze potential spurious long memory in the context of temporal aggregation, where
we exploit the long memory parameter's invariance to temporal aggregation. Ohanissian et al. (2008)
derived the joint distributional properties of the log-periodogram regression (LPR) estimator of Geweke &
Porter-Hudak (1983) and Robinson (1995b) applied to different aggregation levels of the original series,
and introduced a formal test of true long memory. Furthermore, Ohanissian et al. (2008) extend
their test to long memory signal plus added noise models (perturbed fractional processes). However,
especially in small samples, it should be noted that the LPR estimator may be substantially downward
biased when the noise-to-signal ratio is high, and that the bias increases as the bandwidth $m$ used in
the estimator increases, see Deo & Hurvich (2001, 2002). Therefore, we use the LPWN estimation
framework, where the limiting distributional results given different aggregation levels are derived in
Frederiksen & Nielsen (2008).
4.3 Fractional cointegration
We analyze potential fractional cointegration in the semiparametric setting of Robinson & Yajima
(2002) and Nielsen & Shimotsu (2007).$^{7}$ As the theory of modeling the perturbation and short-run
dynamics by polynomials is only applicable in the univariate setting (at least using the theory put
forward by Frederiksen et al. (2008)), we proceed as in Robinson & Yajima (2002). We
should note that Nielsen & Shimotsu (2007) extend the results of Robinson & Yajima (2002) to cover
nonstationary values of $d$ by using the ELW estimator proposed by Shimotsu & Phillips (2005). The
limiting distribution will not change if we consider the LW estimator instead of the ELW estimator used in Nielsen
& Shimotsu (2007) (for $d \in (-1/2, 1)$ and mean equal to zero when $d \geq 1/2$). We do, however, need
to modify some of the assumptions in Robinson & Yajima (2002); e.g. we need to account for
nonstationary values of $d$, i.e. $d \in [1/2, 1)$ in the Type I fractional framework, see e.g. Velasco (1999).
Furthermore, $G$ is estimated differently under the ELW setting.
Robinson & Yajima (2002) generalize the univariate results of Robinson (1995a) to the multivariate
case, where consistency and asymptotic normality are established under both the presence and absence
of cointegration.
$^{6}$ The feasible ELW estimator extends the ELW estimator to the case with an unknown mean and trend. Furthermore,
the 2-step feasible ELW uses the tapered estimator of Velasco (1999) in the first step, and in the second step $r$
Newton-Raphson type iterations, i.e. $\hat d_j = \hat d_{j-1} - H_n(\hat d_{j-1})^{-1} S_n(\hat d_{j-1})$ for $j = 1, \dots, r$, where $\hat d_0$ is the estimate from step one,
and $H_n(\cdot)$ and $S_n(\cdot)$ are the Hessian and score of the ELW objective function, respectively.
$^{7}$ It would be of interest to consider alternative (parametric) techniques, e.g. Breitung & Hassler (2002) and Hassler &
Breitung (2006), but in this analysis we will restrict ourselves to semiparametric methods as discussed in the introduction.
The consistency results for $(\hat d, \hat G(\hat d))$ are then used to establish a test for pairwise equality of
the integration orders, $H_{12} : d_1 = d_2$, or, if we have $p$ series, a joint test of the hypothesis of equality of all integration
orders, $H_0 : d_1 = \dots = d_p = d$,
$$\hat T_{12} = \frac{m^{1/2} \left(\hat d_1 - \hat d_2\right)}{\left(\frac{1}{2} \left(1 - \frac{\hat G_{12}^2}{\hat G_{11} \hat G_{22}}\right)\right)^{1/2} + h(n)}, \tag{18}$$
$$\hat T_0 = m \left(S \hat d\right)' \left(S \frac{1}{4} K^{-1} \left(\hat G \odot \hat G\right) K^{-1} S' + h(n) I_{p-1}\right)^{-1} \left(S \hat d\right), \tag{19}$$
where $h(n)$ satisfies an assumption that restricts the bandwidth choice, see Robinson & Yajima (2002,
Assumption G) and Nielsen & Shimotsu (2007, Assumption 6), $K = \mathrm{diag}(\hat G)$, $S = [I_{p-1}, -\iota]$, and $\iota$
is the $(p - 1)$-vector of ones. Then, under the regularity conditions governing the LW estimator and $h(n)$,
and under $H_{12}$ and $H_0$, as $n \to \infty$,
(i) If $X_{1t}$ and $X_{2t}$ are not cointegrated, $\hat T_{12} \xrightarrow{d} N(0, 1)$; (20)
(ii) If $X_{1t}$ and $X_{2t}$ are cointegrated, $\hat T_{12} \xrightarrow{p} 0$;
(iii) If $X_t$ is not cointegrated, $\hat T_0 \xrightarrow{d} \chi^2_{p-1}$;
(iv) If $X_t$ is cointegrated, $\hat T_0 \xrightarrow{p} 0$.
Case (iii) corresponds to a situation where $r = 0$ and case (iv) to a situation where $r \geq 1$.
When we want to consider the cointegration rank of $X_t$, we need to estimate $G$ and its eigenvalues.
For a new bandwidth choice $m_1$, which is introduced because of the potential problems
arising from the estimation of $d$, $G$ is estimated by
$$\hat G(d) = \frac{1}{m_1} \sum_{j=1}^{m_1} \mathrm{Re}\left\{\Lambda(\lambda_j)^{-1} I(\lambda_j) \Lambda^{*}(\lambda_j)^{-1}\right\}, \tag{21}$$
where $\Lambda(\lambda_j) = \mathrm{diag}\left\{e^{i \pi d/2} \lambda_j^{-d}, \dots, e^{i \pi d/2} \lambda_j^{-d}\right\}$, $*$ denotes the complex conjugate, and where we for
simplicity have assumed $d_1 = \dots = d_p = d$, $I(\lambda_j)$ is the periodogram of $(X_{1t}, \dots, X_{pt})'$, and $G_i$ is the
$i$th column of $G$ for $i = 1, \dots, p$. As $d$ is unknown we need to substitute this with an estimate, and as
discussed in Robinson & Yajima (2002) and Nielsen & Shimotsu (2007) we cannot use the multivariate
version of the LW estimator, as $G$ does not have full rank when $X_t$ is cointegrated. Furthermore, $\hat d$
needs to converge at a faster rate than $m_1^{1/2}$. Therefore, estimate each $d_i$ for $i = 1, \dots, p$ by (16), where
$h(d, \theta, \lambda_j)$ is set to zero in (14) and (15), using $m$ periodogram ordinates, select $m_1$ such that
$m_1/m \to 0$ as $n \to \infty$, and then estimate $G$ by (21). Define $\bar d = \frac{1}{p} \sum_{i=1}^{p} \hat d_i$, and let $\delta_i$ and $\hat\delta_i$,
$i = 1, \dots, p$, denote the $i$th eigenvalues of $G$ and $\hat G(\bar d)$, respectively, where the eigenvalues are ordered descendingly.
From this, a model selection procedure can be set up to determine the cointegration rank, $r$. Following
Robinson & Yajima (2002), estimate $r$ by
$$\hat r = \arg\min_{u = 0, \dots, p-1} L(u), \tag{22}$$
where
$$L(u) = v(n) (p - u) - \sum_{i=1}^{p-u} \hat\delta_i, \tag{23}$$
for some $v(n) > 0$ which is assumed to satisfy some convergence assumption as $n \to \infty$, see Robinson
& Yajima (2002, Assumption J) and Nielsen & Shimotsu (2007, Assumption 8').
In this paper, $p = 2$, as we analyze potential cointegration between volatility and trading volume.
The outcome of the selection procedure is especially affected by $v(n)$. The Monte Carlo study in
Nielsen & Shimotsu (2007) reveals that for a large $v(n)$ we are likely to estimate a large $r$, whereas
the opposite is the case when $v(n)$ is small. Therefore, we let $v(n)$ take on different values in the
empirical section. Furthermore, the test of equality of integration orders is sensitive to the choice of
$h(n)$. Selecting too large an $h(n)$ leads to underrejection of $H_{12}$ ($H_0$) under noncointegration, whereas
selecting too small an $h(n)$ leads to overrejection of $H_{12}$ ($H_0$) under cointegration. Nielsen & Shimotsu
(2007) show that $h(n) = (\log n)^{-1}$ works well, but the test statistic overrejects when $h(n) = (\log n)^{-2}$.
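For the bivariate case $p = 2$ considered here, the pairwise test statistic in (18) and the rank selection rule in (22) and (23) reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the inputs are taken as given, and the eigenvalues are computed directly from the supplied symmetric $2 \times 2$ matrix, as in the description above.

```python
import math


def t12_statistic(d1, d2, g11, g22, g12, m, h):
    """Statistic (18) for H12: d1 = d2, given the elements of the estimated G and h(n)."""
    rho_sq = g12 * g12 / (g11 * g22)
    return math.sqrt(m) * (d1 - d2) / (math.sqrt(0.5 * (1.0 - rho_sq)) + h)


def select_rank_2x2(g11, g22, g12, v):
    """Rank estimate (22)-(23) for p = 2 from the eigenvalues of a symmetric 2x2 matrix."""
    mean = 0.5 * (g11 + g22)
    disc = math.sqrt((0.5 * (g11 - g22)) ** 2 + g12 * g12)
    eig = [mean + disc, mean - disc]  # eigenvalues in descending order
    p = 2
    loss = {u: v * (p - u) - sum(eig[: p - u]) for u in range(p)}
    return min(loss, key=loss.get), loss


# Hypothetical inputs: n = 11187 gives h(n) = 1 / log(n); the estimates are made up.
h_n = 1.0 / math.log(11187)
t12 = t12_statistic(0.50, 0.45, 1.0, 1.0, 0.6, m=268, h=h_n)
r_hat, losses = select_rank_2x2(1.0, 1.0, 0.99, v=0.1)
```

With a nearly singular matrix the smallest eigenvalue falls below $v(n)$ and the rule selects $r = 1$, whereas for well separated series it selects $r = 0$; this also shows directly why a larger $v(n)$ favors a larger estimated rank.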
5 Results
This section deals with univariate estimation$^{8}$ of the fractional integration parameter of volatility
and trading volume by using semiparametric methods, testing for true long memory versus spurious
long memory by sample splitting (and temporal aggregation), and testing whether or not volatility and
trading volume are fractionally cointegrated.
5.1 Results for return volatility and trading volume
The first part of this subsection looks at the long-run persistence of volatility and trading volume
using the semiparametric estimators discussed in the methodology section. We will look at three
shares in detail, 3M, IBM, and AA, and discuss the remaining shares later in the section. We consider
the following estimators: LP, LW, LPW, LWN, and LPWN implemented with $(R_y, R_w)$ equal to
$(1, 0)$, $(0, 1)$, and $(1, 1)$, denoted LPWN$(R_y, R_w)$.
For all estimators, we set the bandwidth as $m = \lfloor n^a \rfloor$, where $a \in \{0.5, 0.6, 0.7\}$. For the LP and
LW estimators, we conjecture$^{9}$ that a choice of $a = 0.5$ is the most appropriate of the three, whereas
for the LPWN (including the LWN) estimators, we apply the same bandwidths for ease of comparison.
However, for the estimators where we also model a polynomial term, it can be difficult to
pin down the estimates of the $\theta$'s. Therefore, a larger number of periodogram ordinates is needed, which
motivates the inclusion of $a \in \{0.6, 0.7\}$. Numerical optimization was carried out in Matlab v7.5 using
the BFGS optimization routine. The initial values were set as follows. For all estimators, the starting
value for $d$ was set equal to the Geweke & Porter-Hudak (1983) LP estimate. The admissible space of $d$ for
the Whittle estimators is restricted to be $[0.01, 0.99]$, cf. Assumption A2 of Frederiksen et al. (2008).
If $\hat d_{LP} \notin [0.01, 0.99]$, the initial value is set equal to $0.4$. As initial values for the polynomial
parameters we used 1 for all estimators. Furthermore, the polynomial terms for the LWN and LPWN
estimators are restricted to be non-negative.
$^{8}$ We do not consider multivariate semiparametric estimation of the fractional integration parameters, as the theory of
modeling the perturbation and short-run dynamics by polynomials is only applicable in the univariate setting (at least
using the theory put forward by Frederiksen et al. (2008)). Furthermore, Nielsen (2008) shows in a comparative study of
fractional cointegration that when data are contaminated by noise, the multivariate LW estimator of Shimotsu (2007) is
more biased than the univariate LW estimator.
$^{9}$ It is well known that the bias of the LP and the LW estimator is increasing in the bandwidth when the long memory
series is perturbed and that choosing $a = 0.5$ renders fairly unbiased results, see e.g. Sun & Phillips (2003) and Arteche
(2004a, 2004b).
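For reference, the bandwidths implied by this rule for the sample size used here, $n = 11{,}187$, can be computed directly. The snippet is only a small arithmetic check and the variable names are illustrative.

```python
import math

n = 11187  # number of daily observations per share
bandwidths = {a: math.floor(n ** a) for a in (0.5, 0.6, 0.7)}
print(bandwidths)  # {0.5: 105, 0.6: 268, 0.7: 682}
```

The smallest and largest values match the bandwidth range $m \in [105, 682]$ referred to in the discussion of Figure 2.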
Table 3 presents the memory estimates for the adjusted log-squared returns, adjusted log-absolute
returns, and detrended log turnover ratio. Firstly, looking at the volatility series for the three shares,
we see that, as expected from theory,$^{10}$ the LP, LW, and LPW estimates appear downward biased and
are decreasing as a function of the bandwidth. For the LWN and LPWN estimators, this is not the
case. The fractional integration estimate is reasonably constant regardless of bandwidth. Furthermore,
the memory estimate is in the nonstationary region for all three common stocks, except for the 3M
share when the bandwidth is set equal to $m = \lfloor n^{0.5} \rfloor$, where it is borderline nonstationary. In Table
3, we have marked the cases where the polynomial coefficient is significant, and we clearly see that the
noise coefficient (marked with a b) is significant in all cases for the LWN estimator, but interestingly
this is not the case for the LPWN estimators. Looking at the standard errors when including a noise
polynomial (the LPWN(0,1) and LPWN(1,1) cases), we observe that the standard error on the noise
coefficient in some cases increases considerably, which is evidence of collinearity. Therefore, including
a polynomial in the characterization of the short-run contamination in the perturbation is not needed
in most of the cases. Additionally, comparing the two volatility series, we see that $d$ is estimated
higher under the absolute measure.
Turning our attention to the detrended log turnover ratio, $\hat d$ again falls as a function of bandwidth
for 3M and IBM for the LP, LW, and LPW estimators, whereas this is not the case for the AA share.
The LWN and LPWN estimates are in the nonstationary region for the 3M and IBM shares. For the AA
common stock, all estimators estimate $d$ reasonably close to each other and in the stationary region.
That is, for the AA share there is no sign that it is governed by a perturbed fractional process. This
is also supported by the fact that it is only when $m = \lfloor n^{0.6} \rfloor$ that the noise coefficient is significant for the
LWN estimator.
Table 4 presents the mean and median for the 45 estimated fractional integration parameters.
The LWN and LPWN estimates are in general considerably higher than the LP, LW, and LPW
estimates, especially as the bandwidth increases, and in the non-stationary region for the two volatility
measures. Furthermore, the memory estimates for the adjusted log-absolute returns are higher than
for the adjusted log-squared returns for the LWN and LPWN estimators, whereas the conclusions are
reversed for the LP, LW and LPW estimators. For the detrended log turnover ratio, the LWN and
LPWN estimates are borderline nonstationary in most cases, and the volatility measures are clearly
more persistent in terms of memory than trading volume. As the memory estimate for the detrended
log turnover ratio across estimators and bandwidths is reasonably similar, it does not look as if trading
volume in general follows a perturbed fractional process. Bollerslev & Jubinski (1999) find a mean and
a median of $0.404$ and $0.407$, respectively, for the squared returns, and for their measure of trading
volume, they find a mean and median of $0.407$ and $0.410$, respectively. This is in line with the results
we obtain for the LP estimator for $m = \lfloor n^{0.5} \rfloor$, i.e. $0.440$ and $0.423$ for the adjusted log-squared
returns and $0.474$ and $0.413$ for the detrended log turnover ratio. That is, we clearly see the pattern
that the estimators that explicitly model the perturbation estimate a higher degree of memory than
the estimators that do not, i.e. LP, LW, and LPW. Furthermore, the fractional integration parameter
is in general estimated at a higher level for the two measures of the volatility process than for trading
volume for the LWN and LPWN estimators, whereas they are quite similar when looking at the LP,
LW, and LPW estimators. The findings of Bollerslev & Jubinski (1999) may therefore be misleading,
as they do not employ estimators that are robust to potential perturbation, even though they select a
reasonably low bandwidth, i.e. $m = \lfloor n^{0.5} \rfloor$. Nonetheless, we do not know the appropriate
bandwidth choice when the signal is perturbed by some noise term, and therefore using estimators
that have a reasonable rate of convergence is important. The range of the individual estimators is
tabulated in Table 5.
$^{10}$ From Hurvich & Ray (2003), we know that the LWN is superior to the LW estimator in terms of bias and RMSE
when we are in the context of a standard LMSV model. This is also shown (also for the LPW estimator) in Frederiksen
et al. (2008).
Looking at the t-statistics, the individual memory estimates for the LP, LW, and LPW estimators
are in general different from zero and one at a 5% level. This is confirmed when looking at Table 6.
For the LWN and LPWN estimators, the picture is not as clear-cut. For instance, for the adjusted
log-absolute returns for the LPWN(1,1) estimator we only reject the null that $d = 1$ in 6 cases when
$m = \lfloor n^{0.5} \rfloor$, whereas the rejection frequency rises to 33 and 42 when $m = \lfloor n^{0.6} \rfloor$ and $m = \lfloor n^{0.7} \rfloor$,
respectively. This is clearly due to some boundary issue, since we get more precision in estimating the
polynomial coefficients when we include more periodogram ordinates.
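To make the t-tests concrete for the simplest case, the sketch below uses the well-known asymptotic result for the standard LW estimator, $\sqrt{m}(\hat d - d_0) \to N(0, 1/4)$ (Robinson, 1995a), so that the asymptotic standard error is $1/(2\sqrt{m})$. This is an assumption restricted to the LW estimator; the LWN and LPWN estimators have larger asymptotic variances, as noted in the methodology section, and the function name and the numerical inputs are hypothetical.

```python
import math


def lw_t_statistics(d_hat, m):
    """t-statistics for d = 0 and d = 1 using the LW asymptotic standard error 1 / (2 sqrt(m))."""
    se = 1.0 / (2.0 * math.sqrt(m))
    return d_hat / se, (d_hat - 1.0) / se, se


# Illustrative values only: an estimate of 0.44 with m = 105 periodogram ordinates.
t_zero, t_one, se = lw_t_statistics(0.44, 105)
reject_zero = abs(t_zero) > 1.96  # two-sided test at the 5% level
reject_one = abs(t_one) > 1.96
```

With a small bandwidth the standard error is large, which is one reason why rejections of $d = 1$ become more frequent as more periodogram ordinates are included.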
To give an overall picture of whether or not the respective series potentially follow a perturbed
fractional process (e.g. in the context of the LMSV model, as discussed in the introduction), we have
in Figure 2 plotted the estimated values of $d$ for the two volatility series and the trading volume series
for the 3M common stock$^{11}$ using the LW and LWN estimators. In addition, we have also plotted
the approximate asymptotic confidence intervals given by plus/minus two asymptotic standard errors
of the respective estimates. The estimates are shown for a range of relevant values of the bandwidth
parameter, $m \in [50, 2000]$. Following the assumption on the bandwidth choice of Frederiksen et al.
(2008, Theorem 2) and the suggestion by Hurvich & Ray (2003), we emphasize the higher bandwidth
values, where the estimates (and confidence intervals) also appear more stable. The bandwidth values
corresponding to the range $n^{0.5}$ to $n^{0.7}$ investigated above are $m \in [105, 682]$. The results in Figure 2
for the volatility measures (Panel A and Panel B) show that the LW estimates are smaller than the LWN
estimates for essentially all bandwidth choices, and the LW estimate is decreasing in the bandwidth.
This is expected based on theoretical properties of the LW estimator in the case of perturbed fractional
processes (e.g. in the context of the LMSV model). The LWN estimate is higher and shows signs of
nonstationarity for higher bandwidth values, while the LW estimate is nonstationary for low values of
$m$ and stationary when $m > 100$. Looking at the results for the detrended log turnover ratio series,
we see that the estimates for the LW and LWN estimators are statistically indistinguishable from each
other when $m < 400$. Furthermore, the LWN estimate is borderline nonstationary for most values of
$m$, while the LW estimate is stationary for $m > 430$.
$^{11}$ We focus on the 3M common stock and omit the other shares, as the conclusions drawn are quite similar in a
qualitative sense. Furthermore, we only plot the LW and LWN estimates. The LP and LPW estimates are similar to
the LW estimates. The LPWN(1,0), LPWN(0,1), and LPWN(1,1) parameterizations are also omitted, as these are similar
to the LWN estimates. It should be noted, however, that the estimates from the LPWN$(R_y, R_w)$ parameterizations are
more volatile.
Regarding the issue of potential fractional cointegration between volatility and trading volume,
we see from Table 3 that there could be some long-run relationship for some of the shares in
the S&P 100 composite index. For example, comparing the individual estimates for the IBM common
stock, it cannot be ruled out that the estimated integration orders are statistically indistinguishable.
Of course, the standard errors for the LPWN estimators are so large that in most cases we cannot rule
out that the estimated memory is identical across volatility and trading volume. In Figure 3, we have
plotted the LWN estimate for the adjusted log-squared returns, adjusted log-absolute returns, and
the detrended log turnover ratio for the 3M common stock. For all choices of bandwidth it seems as
though they share a common integration order, as they are not statistically different from each other
(at a 5% significance level). Looking at the rest of the estimates across common stocks, we cannot
in the majority of the cases reject the null that they have the same degree of memory.
To sum up, using more appropriate semiparametric estimators, we find that volatility and
trading volume are more persistent than other studies have found; see e.g. Bollerslev & Jubinski
(1999) and Fleming & Kirby (2006) for the S&P100 index, Lobato & Velasco (2000) for the DJIA30,
Liesenfeld (2002) for futures contracts on the DAX index, Gurgul & Wojtowicz (2006) for the DAX
index, and the references therein. Additionally, there is evidence that volatility is more persistent in
terms of memory than trading volume. Furthermore, in light of our study and Bollerslev & Jubinski
(1999) there is evidence that in general volatility could be governed by a perturbed fractional process
(e.g. in the context of the LMSV model), whereas this is not the case for trading volume (or at least
not for the majority of the common stocks as seen from Table 4).
Before moving on to testing whether the two series share a common order of fractional integration
and if the two series move together over time, we test whether non-linearities induce the long memory
in the volatility and volume.
5.2 Long memory versus spurious long memory
Tables 7-9 display the estimates $\hat{d}$ (memory estimate of the entire sample) and $\bar{d}$ (mean of the estimates
from the $K$ sample splits) using the LW and LWN estimators,$^{12}$ together with the test statistics WS for $m \in \{400, 600, 800, 1000\}$ and $K \in \{2, 4\}$ for the shares 3M, IBM and AA. We have to restrict the sample
size $n$ such that $n/K \in \mathbb{N}^{+}$. We do this by removing the first 187 observations, so that $n = 11{,}000$.
Looking at Table 7, there is overall little support for spurious long memory. For the adjusted
log-squared and adjusted log-absolute returns, $\hat{d}$ and $\bar{d}$ decrease as $m$ increases for the LW estimator.
That the estimated fractional integration parameter decreases (for the LW estimator) is a sign that the
underlying process is in fact a perturbed fractional process, as the LW estimator underestimates the
true $d$ in this case; see the discussion in the previous subsection. Looking further at the estimator that
should take the perturbation into account, the LWN estimator, we see that the fractional integration
estimate is in the nonstationary region and that the memory estimate decreases with $m$ for the detrended
log turnover ratio. But overall, for both semiparametric estimators, we cannot reject the null of equal
12 The results for the LP and LPW are omitted as these are similar in a qualitative sense to the LW results. The
LPWN(1,0), LPWN(0,1), and LPWN(1,1) parameterizations are also omitted as these are similar to the LWN results.
We note, however, that when we split the sample into $K = 4$ subsamples we in some cases have problems with convergence,
especially for the LPWN(1,1). This is because we do not have enough periodogram ordinates to pin down the 4 parameters
(3 polynomial terms and the memory estimate).
memory across sample splits.
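The sample-splitting step described above can be sketched as follows. This is only an illustration in plain Python under stated assumptions: the helper names `split_sample` and `mean_split_estimate` are hypothetical, the `estimator` argument stands for any function returning a memory estimate (for instance an LW or LWN routine), and the test statistic itself is not reproduced here.

```python
def split_sample(x, n_keep, K):
    """Drop the earliest observations so that n_keep remain, then cut into K equal blocks."""
    if n_keep > len(x):
        raise ValueError("n_keep exceeds the sample size")
    if n_keep % K != 0:
        raise ValueError("n_keep must be divisible by K")
    trimmed = x[len(x) - n_keep:]      # e.g. removes the first 187 of 11,187 points
    block = n_keep // K
    return [trimmed[i * block:(i + 1) * block] for i in range(K)]


def mean_split_estimate(blocks, estimator):
    """Apply `estimator` to every block and return (mean of estimates, list of estimates)."""
    estimates = [estimator(b) for b in blocks]
    return sum(estimates) / len(estimates), estimates
```

With 11,187 observations, `split_sample(x, 11000, 4)` yields four blocks of 2,750 observations each, and `split_sample(x, 11000, 2)` yields two blocks of 5,500.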
In Table 8, we display the results for the IBM share. The same pattern as in Table 7 emerges, i.e. LW
estimates in the stationary region and LWN estimates in the nonstationary region. For the adjusted
log-absolute return series, we can reject the null that the series is long memory when $(m = 800, K = 2)$
and $(m = 1000, K = 4)$ for the LW estimator, whereas for the LWN estimator we only reject when $m = 800$ and $K = 4$.
Looking at the detrended log turnover ratio, we reject the null in all but one case for the LW estimator,
whereas for the LWN estimator we reject the null in one case. To sum up, there is at best weak evidence that the
adjusted log-absolute returns and detrended log turnover ratio for the IBM share are not governed by
a long memory process.
For the last common stock, i.e. AA (Table 9), the long memory hypothesis is supported in all cases
except for the adjusted log-absolute returns with the LW estimator, $m \in \{600, 800, 1000\}$, and $K = 2$. Furthermore, we again see a decrease in the memory estimate (volatility series only) for
the LW estimator, but not for the LWN estimator. For the detrended log turnover ratio, the fractional
integration estimate is similar and stable across bandwidth choice and estimator.
In Table 10, we have tabulated the number of cases in which we reject the null of true long memory against
spurious long memory for the 45 common stocks for which we observe a full sample. In general, we
reject the null in more cases when the test statistic is based on the LW estimator. This also holds
when comparing volatility and trading volume. Interestingly, we reject the null in more cases for the
absolute measure than for the squared measure, although only for $K = 4$ and $m \in \{600, 800, 1000\}$ when considering the test based on the LWN estimator.
As a final note, we also implemented the test for long memory against spurious long memory
using the notion of temporal aggregation, i.e. exploiting the long memory parameter's invariance to
temporal aggregation; see Ohanissian et al. (2008) and Frederiksen & Nielsen (2008). The results were
qualitatively similar and are therefore omitted (available from the author upon request).
5.3 Common long-run dependence
Here we will analyze the potential common long-run dependence between return volatility and trading
volume. Instead of estimating the cointegration vector and the memory parameter of the equilibrium
error, we focus on testing whether we can reject the hypothesis of fractional cointegration being present.
This is in line with Nielsen (2004), Hassler, Marmol & Velasco (2000), and Nielsen & Shimotsu (2007).
Tables 11 and 12 present the univariate fractional integration estimates using the LW estimator.
We set the number of periodogram ordinates equal to $m = \lfloor n^{0.5} \rfloor$ and $m = \lfloor n^{0.6} \rfloor$. Furthermore, Tables 11 and 12 contain the test statistic for the test of equal fractional integration order, the rejection
frequency of $H_{12}$ (for all 45 common stocks), i.e. $d_1 = d_2 = d$, where we have set $h(n) = h_1 = \log^{-1} n$
and $h(n) = h_2 = \log^{-2} n$, the estimated eigenvalues of $G(\bar{d})$, and the correlation matrix
$P(\bar{d}) = K(\bar{d})^{-1/2} G(\bar{d}) K(\bar{d})^{-1/2}$. The bandwidth choice $m_1$ is set equal to
$\lfloor n^{0.45} \rfloor$ when $m = \lfloor n^{0.50} \rfloor$ and to $\lfloor n^{0.55} \rfloor$ when $m = \lfloor n^{0.60} \rfloor$. The results for the selection procedure for rank determination are displayed
in Tables 13 and 14 where, as suggested by Nielsen & Shimotsu (2007), we use the eigenvalues from
the correlation matrix $P(\bar{d})$ in the selection procedure instead of the estimated eigenvalues of $G(\bar{d})$.
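To make the tuning choices above concrete, the short sketch below evaluates the two bandwidth pairs and the two values of $h(n)$ for a given sample size. The function names are hypothetical, and the value $n = 11{,}187$ used in the usage note is the full-sample size reported in the table notes; whether this section uses that $n$ or the trimmed sample of 11,000 observations is an assumption of the example.

```python
import math


def bandwidth_pairs(n):
    """Return (m, m1) pairs: m = floor(n^0.50) with m1 = floor(n^0.45),
    and m = floor(n^0.60) with m1 = floor(n^0.55)."""
    pairs = []
    for exp_m, exp_m1 in ((0.50, 0.45), (0.60, 0.55)):
        pairs.append((math.floor(n ** exp_m), math.floor(n ** exp_m1)))
    return pairs


def tuning_h(n):
    """Return (h1, h2) = (1 / log n, 1 / (log n)^2)."""
    h1 = 1 / math.log(n)
    return h1, h1 ** 2
```

For $n = 11{,}187$ this gives $(m, m_1) = (105, 66)$ and $(268, 168)$, with $h_1 \approx 0.107$ and $h_2 \approx 0.0115$.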
Firstly, looking at Table 11 (Panel A), we confirm the results from the section on univariate
estimation of the fractional integration parameter for the semiparametric estimators that do not
account for the potential perturbation, i.e. a decrease in the memory estimate when the bandwidth is increased,
especially for the volatility series. Furthermore, we cannot reject (at the 5% level) the null of identical
memory, i.e. $H_{12}: d_1 = d_2$, for the AA share for both bandwidths, and for the 3M share
when setting $m = \lfloor n^{0.50} \rfloor$. The same conclusion does not apply when looking at potential long-run
dependence between adjusted log-absolute returns and the detrended log turnover ratio, Table 12
(Panel A), as here we reject in all cases but one, i.e. for the AA share when considering the smallest
bandwidth. Overall, there could potentially be a cointegrating relation when considering the AA
share, but looking at the estimated eigenvalues of $G(\bar{d})$ and the eigenvalues from the correlation
matrix $P(\bar{d})$, Panels C and D, there is little evidence to suggest a cointegrating
relation, as none of the eigenvalues are (relatively) close to zero. So we expect the rank to be zero, i.e.
the series do not move together over time.
Tables 13 and 14 confirm that for all choices of $m_1$ and $v(n)$ there are no cointegrating relations.
Even for the case where we could not reject equality of integration orders (which is a necessary
condition for there to be cointegration), there is no sign of a cointegrating relation. We know from
Nielsen & Shimotsu (2007) that a large $v(n)$ leads to a non-conservative estimate of $r$, so the evidence
of there being a cointegrating relation is at best weak.
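The rank selection step can be illustrated for the bivariate case. The sketch below assumes that the criterion takes the form $L(u) = v(n)(p - u) - \sum_{i=1}^{p-u} \delta_i$ with eigenvalues $\delta_1 \geq \dots \geq \delta_p$, and that the estimated rank minimizes $L(u)$ over $u = 0, \dots, p-1$, as we understand the procedure of Robinson & Yajima (2002) and Nielsen & Shimotsu (2007). It further assumes that $K$ is the diagonal of $G$ and that $G$ is real and symmetric. The function names are hypothetical.

```python
import math


def correlation_eigenvalues(G):
    """Eigenvalues of P = K^{-1/2} G K^{-1/2} for a real symmetric 2x2 G, K = diag(G).

    P has unit diagonal and off-diagonal rho, so its eigenvalues are 1 +/- |rho|.
    """
    (g11, g12), (_, g22) = G
    rho = g12 / math.sqrt(g11 * g22)
    return [1 + abs(rho), 1 - abs(rho)]


def select_rank(eigenvalues, v):
    """Return the u in {0, ..., p-1} minimizing L(u) = v*(p-u) - sum of the p-u largest eigenvalues."""
    p = len(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)

    def criterion(u):
        return v * (p - u) - sum(ordered[: p - u])

    return min(range(p), key=criterion)
```

In the bivariate case with a correlation matrix, this rule selects rank one exactly when the smallest eigenvalue falls below $v(n)$, which is why a larger $v(n)$ finds cointegrating relations more readily.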
In Tables 11 (Panel B), 12 (Panel B), and 15, we summarize the results across common stocks.
Tables 11 (Panel B) and 12 (Panel B) display the rejection frequency of the null of equal fractional
integration orders, i.e. $H_{12}: d_{\tilde{r}^2} = d_v$ and $d_{|\tilde{r}|} = d_v$. Firstly, we note that (as simulations in Nielsen
& Shimotsu (2007) show) selecting a smaller $h(n)$ leads to more rejections of the null of identical
integration orders. In the previous subsection (univariate estimation of the fractional integration
parameter), there was evidence that in general the volatility process follows a perturbed fractional
process whereas trading volume does not. We would expect that if the volatility process follows a
perturbed fractional process, then we would reject $H_{12}$ more often for higher bandwidth because
the estimated memory of the volatility process decreases for higher bandwidth while the estimated
memory for the trading volume stays constant as a function of bandwidth choice. It is clearly seen
that when $m = \lfloor n^{0.60} \rfloor$, we reject for more than twice as many shares as when the bandwidth
is $m = \lfloor n^{0.50} \rfloor$ when looking at the relation between adjusted log-squared returns and the detrended
log turnover ratio. For the hypothesis of identical memory between adjusted log-squared returns and
detrended log turnover ratio, we reject in 15 and 18 cases for $h_1$ and $h_2$, respectively, for the smallest
bandwidth, while this rises to 36 and 38 for the largest bandwidth.
Interestingly, when we instead test the hypothesis of identical memory between adjusted log-absolute
returns and detrended log turnover ratio, we reject in more cases, i.e. in 30 and 31 cases for $h_1$ and
$h_2$, respectively, for the smallest bandwidth, while this rises to 41 and 41 when the bandwidth is equal
to the largest bandwidth.
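The role of $h(n)$ in these rejection counts can be seen from a bivariate sketch of the equality test. The code below assumes that the statistic has the Robinson & Yajima (2002) form as used by Nielsen & Shimotsu (2007), which in the bivariate case reduces to $m (\hat{d}_1 - \hat{d}_2)^2 / \big( (1 - \hat{\rho}^2)/2 + h(n)^2 \big)$ with $\hat{\rho}^2 = G_{12}^2 / (G_{11} G_{22})$, and that it is compared with a $\chi^2_1$ critical value. This form is stated from memory and should be checked against the cited papers before use; the function names are hypothetical.

```python
CHI2_1_CRITICAL_5PCT = 3.841  # 95% quantile of the chi-square distribution with 1 d.f.


def equality_statistic(d1, d2, G, m, h):
    """Bivariate statistic for H: d1 = d2 (assumed Robinson-Yajima form).

    d1, d2 : univariate memory estimates
    G      : real symmetric 2x2 matrix [[g11, g12], [g12, g22]]
    m      : bandwidth used for the estimates
    h      : tuning parameter h(n), e.g. 1/log(n) or 1/log(n)**2
    """
    (g11, g12), (_, g22) = G
    rho_sq = g12 ** 2 / (g11 * g22)
    return m * (d1 - d2) ** 2 / (0.5 * (1.0 - rho_sq) + h ** 2)


def reject_equal_memory(d1, d2, G, m, h, critical=CHI2_1_CRITICAL_5PCT):
    """True if the null of equal integration orders is rejected at the 5% level."""
    return equality_statistic(d1, d2, G, m, h) > critical
```

Because $h(n)^2$ enters the denominator, a smaller $h(n)$ inflates the statistic, which agrees with the larger number of rejections reported for $h_2$ than for $h_1$.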
Table 15 shows for how many common stocks there is a cointegrating relation between volatility
and trading volume. There is little evidence of a cointegrating relation between volatility and trading
volume, as it is only when we select the largest $v(n)$, i.e. $v(n) = m_1^{-0.05}$, that we estimate a relation for
approximately 7%-20% of the common stocks in the composite index. We note that the cases where we find a cointegrating relation actually correspond to cases where we cannot reject equality of the
specific integration orders. To sum up, the evidence of there being a cointegrating relation between
return volatility and trading volume is very weak for the majority of the common stocks analyzed.
As a final note, we also estimated the cointegrating coefficient $\beta$ in the cases where $d_x = d_y = d$
by means of the fully modified frequency domain least squares of Nielsen & Frederiksen (2008) and
the multiple local Whittle setting of Robinson (2008). The conditions for fractional cointegration to be
present, i.e. (i) $d_y = d_x = d$ and (ii) $\varepsilon_t = y_t - \beta x_t \sim I(d_\varepsilon)$ where $d_\varepsilon < d$, are only fulfilled in a few cases.
This is in line with other studies, e.g. Lobato & Velasco (2000) for the DJIA30 index, Liesenfeld (2002)
for futures contracts on the DAX index, and Gurgul & Wojtowicz (2006) for the DAX30 index, showing
that volatility and trading volume might share a common fractional integration order but do not
in general co-move.
6 Concluding remarks
Using semiparametric methods where we can model the contamination from the short-run dynamics
by a polynomial not only in the signal but also in the noise, we find that volatility and the trading
volume process of stock prices are mean reverting with a high degree of persistence, and this persistence
dissipates at a hyperbolic rate. We find evidence that the volatility process is a perturbed fractional
process, whereas this is not generally the case for trading volume. Part of this noise is an artifact
of the sampling frequency, i.e. if we used intradaily returns and aggregated these to daily returns, we
would mitigate some of the noise. This again emphasizes the value of using methods that can model this potential
perturbation. Additionally, the long memory estimate of the volatility process is on a mean and
median metric higher than for the trading volume process. The evidence that there exists a common
order of fractional integration between volatility and trading volume is mixed, and the evidence is
weak when using estimators that are not downward biased in the presence of perturbation in the
process. Furthermore, there is little evidence of a cointegrating relation between volatility and trading
volume. Therefore, our findings are not in general consistent with a modified version of the MDH,
in which volatility and trading volume are governed by a common latent information-arrival process
exhibiting long memory behavior.
Future research on the volatility-volume relationship will use intra daily observations on return
volatility and trading volume. In this direction, we can do several things. For instance, it is of
interest to analyze intra daily behavior of volatility and trading volume. Furthermore, an extension
of the LPWN estimator to a multivariate setting is relevant as we, directly from this, could develop
a powerful test for equality of fractional integration parameters and set up a testing procedure for
the rank of the spectral density matrix along the lines of Robinson & Yajima (2002) and Nielsen &
Shimotsu (2007).
Additionally, work on estimating a three-factor bivariate mixture model, where the common information
arrival process is assumed to possess long memory according to the Gaussian ARFIMA$(1, d, 0)$
process and the distinct dynamics of the volume and volatility follow Gaussian AR(1) processes, looks
very promising. To our knowledge, there has not been any empirical investigation modeling the properties
of the volume-volatility system adequately, as the asset return volatility and the asset volume
processes might both exhibit long memory and volatility clustering. These empirical findings are possible
to replicate in the model of Frederiksen & Nielsen (2009). Furthermore, this model also nests the
hypothesis that the volume and the volatility can be fractionally cointegrated. However, this is not
empirically justified even though the series generally exhibit the same degree of long memory.
References
Andersen, T. G. (1996), 'Return volatility and trading volume: An information flow interpretation of
stochastic volatility', Journal of Finance 51, 169-204.
Andersen, T. G. & Bollerslev, T. (1997a), 'Heterogeneous information arrivals and return volatility
dynamics: Uncovering the long-run in high frequency returns', Journal of Finance 52, 975-1005.
Andersen, T. G. & Bollerslev, T. (1997b), 'Intraday periodicity and volatility persistence in financial
markets', Journal of Empirical Finance 4, 115-158.
Andrews, D. W. K. & Guggenberger, P. (2003), 'A bias-reduced log-periodogram regression estimator
for the long memory parameter', Econometrica 71, 675-712.
Andrews, D. W. K. & Sun, Y. (2004), 'Adaptive local polynomial Whittle estimation of long-range
dependence', Econometrica 72, 569-614.
Arteche, J. (2004a), Augmented log-periodogram regression in long memory signal plus noise models,
in '2004 Hawaii International Conference on Statistics, Mathematics and Related Fields', pp. 108-119.
Arteche, J. (2004b), 'Gaussian semiparametric estimation in long memory in stochastic volatility and
signal plus noise models', Journal of Econometrics 119, 131-154.
Bollerslev, T. (1986), 'Generalized autoregressive conditional heteroscedasticity', Journal of
Econometrics 31, 307-327.
Bollerslev, T. & Jubinski, D. (1999), 'Equity trading volume and volatility: Latent information arrivals
and common long-run dependencies', Journal of Business and Economic Statistics 17, 9-21.
Bollerslev, T. & Mikkelsen, H. O. (1996), 'Modeling and pricing long memory in stock market
volatility', Journal of Econometrics 73, 151-184.
Breidt, F. J., Crato, N. & de Lima, P. (1998), 'The detection and estimation of long memory in
stochastic volatility', Journal of Econometrics 83, 325-348.
Breitung, J. & Hassler, U. (2002), 'Inference on the cointegration rank in fractionally integrated
processes', Journal of Econometrics 110, 167-185.
Clark, P. K. (1973), 'A subordinated stochastic process model with finite variance for speculative
prices', Econometrica 41, 135-155.
Deo, R. S. & Hurvich, C. M. (2001), 'On the log periodogram regression estimator of the memory
parameter in long memory stochastic volatility models', Econometric Theory 17, 686-710.
Deo, R. S. & Hurvich, C. M. (2002), Estimation of long memory in volatility, in P. Doukhan,
G. Oppenheim & M. Taqqu, eds, 'Theory and Applications of Long-Range Dependence', Boston:
Birkhäuser.
Diebold, F. X. & Inoue, A. (2001), 'Long memory and regime switching', Journal of Econometrics
105, 131-159.
Ding, Z. & Granger, C. W. J. (1996), 'Modeling volatility persistence of speculative returns: a new
approach', Journal of Econometrics 73, 185-215.
Ding, Z., Granger, C. W. J. & Engle, R. F. (1993), 'A long memory property of stock returns and a
new model', Journal of Empirical Finance 1, 83-106.
Engle, R. F. (1982), 'Autoregressive conditional heteroscedasticity with estimates of the variance of
United Kingdom inflation', Econometrica 50, 987-1007.
Engle, R. F. & Bollerslev, T. (1986), 'Modeling the persistence of conditional variances', Econometric
Reviews 5, 1-50.
Epps, T. & Epps, M. (1976), 'The stochastic dependence in security price changes and transaction
volumes: implications for the mixture-of-distribution hypothesis', Econometrica 44, 305-321.
Fleming, J. & Kirby, C. (2006), 'Long memory in volatility and trading volume', Working Paper.
Frederiksen, P. H. & Nielsen, F. S. (2008), 'Testing for spurious long memory in potentially
nonstationary perturbed fractional processes', CREATES RP 2008-59, University of Aarhus, and Working
Paper, Nordea Markets.
Frederiksen, P. H. & Nielsen, F. S. (2009), 'A dynamic long memory bivariate mixture model',
Unpublished Working paper.
Frederiksen, P. H., Nielsen, F. S. & Nielsen, M. Ø. (2008), 'Local polynomial Whittle estimation
of perturbed fractional processes', CREATES RP 2008-29, University of Aarhus, and Working
Paper, Nordea Markets and Cornell University.
Fuller, W. A. (1996), Introduction to statistical time series, Wiley, New York.
Geweke, J. & Porter-Hudak, S. (1983), 'The estimation and application of long-memory time series
models', Journal of Time Series Analysis 4, 221-238.
Glosten, L. & Milgrom, P. (1985), 'Bid, ask, and transaction prices in a specialist market with
heterogeneously informed traders', Journal of Financial Economics 14, 71-100.
Granger, C. W. J. & Hyung, N. (2004), 'Occasional structural breaks and long memory with an
application to the S&P 500 absolute stock returns', Journal of Empirical Finance 11, 399-421.
Granger, C. W. J. & Terasvirta, T. (1999), 'A simple nonlinear time series model with misleading
linear properties', Economics Letters 62, 161-165.
Grossman, S. J. & Stiglitz, J. E. (1980), 'On the impossibility of informationally efficient markets',
The American Economic Review 70, 393-408.
Gurgul, H. & Wojtowicz, T. (2006), 'Long memory on the German stock exchange', Czech Journal of
Economics and Finance 56, 447-468.
Haldrup, N. & Nielsen, M. Ø. (2007), 'Estimation of fractional integration in the presence of data
noise', Computational Statistics and Data Analysis 51, 3100-3114.
Harvey, A. (1998), Long memory in stochastic volatility, in J. Knight & S. Satchell, eds, 'Forecasting
Volatility in Financial Markets', Butterworth-Heinemann, London, pp. 307-320.
Hassler, U. & Breitung, J. (2006), 'A residual-based LM-type test against fractional cointegration',
Econometric Theory 22, 1091-1111.
Hassler, U., Marmol, F. & Velasco, C. (2000), 'Fractional cointegrating regression in the presence of
linear time trends', (138).
Hurvich, C. M., Moulines, E. & Soulier, P. (2005), 'Estimating long memory in volatility',
Econometrica 73, 1283-1328.
Hurvich, C. M. & Ray, B. K. (2003), 'The local Whittle estimator of long-memory stochastic volatility',
Journal of Financial Econometrics 1, 445-470.
Karpoff, J. (1987), 'The relation between price changes and trading volume: a survey', Journal of
Financial and Quantitative Analysis 22, 109-126.
Lamoureux, C. G. & Lastrapes, W. D. (1994), 'Endogenous trading volume and momentum in stock
return volatility', Journal of Business and Economic Statistics 12, 253-260.
Liesenfeld, R. (1998), 'Dynamic bivariate mixture models: Modeling the behavior of prices and trading
volume', Journal of Business and Economic Statistics 16, 101-109.
Liesenfeld, R. (2001), 'A generalized bivariate mixture model for stock price volatility and trading
volume', Journal of Econometrics 104, 141-178.
Liesenfeld, R. (2002), 'Identifying common long-range dependence in volume and volatility using
high-frequency data', Working Paper.
Lobato, I. N. & Savin, N. E. (1998), 'Real and spurious long-memory properties of stock-market data',
Journal of Business & Economic Statistics 16, 261-68.
Lobato, I. N. & Velasco, C. (2000), 'Long memory in stock-market trading volume', Journal of Business
and Economic Statistics 18, 570-576.
Mikosch, T. & Starica, C. (2000), 'Limit theory for the sample autocorrelations and extremes of a
GARCH(1,1) process', The Annals of Statistics 28, 1427-1451.
Nielsen, F. S. (2008), 'Fractional cointegration and the finite sample performance in the presence of
data noise', Unpublished, Working paper.
Nielsen, M. Ø. (2004), 'Local Whittle analysis of stationary fractional cointegration and the
implied-realized volatility relation', Working paper, Cornell University.
Nielsen, M. Ø. & Frederiksen, P. H. (2008), Fully modified narrow-band least squares estimation
of stationary fractional cointegration, Working Papers 1171, Queen's University, Department of
Economics.
Nielsen, M. Ø. & Shimotsu, K. (2007), 'Determining the cointegrating rank in nonstationary fractional
systems by the exact local Whittle approach', Journal of Econometrics 127, 574-596.
Ohanissian, A., Russell, J. R. & Tsay, R. S. (2008), 'True or spurious long memory? A new test',
Journal of Business & Economic Statistics 26(2), 161-175.
Parke, W. R. (1996), 'What is a fractional unit root?', unpublished manuscript (University of North
Carolina, Dept. of Economics).
Robinson, P. M. (1995a), 'Gaussian semiparametric estimation of long range dependence', The Annals
of Statistics 23, 1630-1661.
Robinson, P. M. (1995b), 'Log-periodogram regression of time series with long range dependence', The
Annals of Statistics 23, 1048-1072.
Robinson, P. M. (2008), 'Multiple local Whittle estimation in stationary systems', The Annals of
Statistics 36, 2508-2530.
Robinson, P. M. & Yajima, Y. (2002), 'Determination of cointegrating rank in fractional systems',
Journal of Econometrics 106, 217-241.
Shimotsu, K. (2006), 'Simple (but effective) tests of long memory versus structural breaks', Working
Paper, Department of Economics, Queen's University, Canada (1101).
Shimotsu, K. (2007), 'Gaussian semiparametric estimation of multivariate fractionally integrated
processes', Journal of Econometrics 137, 277-310.
Shimotsu, K. & Phillips, P. (2005), 'Exact local Whittle estimation of fractional integration', The
Annals of Statistics 33, 1890-1933.
Sun, Y. & Phillips, P. C. B. (2003), 'Nonlinear log-periodogram regression for perturbed fractional
processes', Journal of Econometrics 115, 355-389.
Tauchen, G. E. & Pitts, M. (1983), 'The price variability-volume relationship on speculative markets',
Econometrica 51, 485-505.
Taylor, S. J. (1994), 'Modelling stochastic volatility: a review and comparative study', Mathematical
Finance 4, 183-204.
Velasco, C. (1999), 'Gaussian semiparametric estimation of non-stationary time series', Journal of
Time Series Analysis 20, 87-127.
Watanabe, T. (2000), 'Bayesian analysis of dynamic bivariate mixture models: Can they explain the
behavior of returns and trading volume?', Journal of Business and Economic Statistics 18, 199-210.
Watanabe, T. (2003), 'The estimation of dynamic bivariate mixture models: Reply to Liesenfeld and
Richard comments', Journal of Business and Economic Statistics 21, 577-580.
[Figure 1 plot not reproduced. Four panels, each with a horizontal axis running from 0 to 50. Plotted series: Returns, AbsReturns, and SqrReturns for 3M, IBM, and AA; detrended turnover ratio for 3M, AA, and IBM.]
Figure 1: Autocorrelation function for continuously compounded percentage returns, adjusted log-
squared returns, and adjusted log-absolute returns for the common stocks 3M (Panel A), IBM (Panel
B), and AA (Panel C). In Panel D the detrended log transformed turnover ratio for the three common
stocks are depicted.
[Figure 2 plot not reproduced. Panels A, B, and C. Horizontal axis: bandwidth (Band), 50 to 1950. Vertical axis: d. Plotted series: LW, LW +/- 2 s.e., LWN, LWN +/- 2 s.e.]
Figure 2: Local Whittle (LW) and local Whittle with noise (LWN) estimates of the adjusted log-
squared returns (Panel A), adjusted log-absolute returns (Panel B), and detrended log trading volume
(Panel C) for the 3M common stock.
[Figure 3 plot not reproduced. Horizontal axis: bandwidth (Band), up to 1949. Vertical axis: d. Plotted series: LWN(trad), LWN(abs), and LWN(sqr), each with +/- 2 s.e. bands.]
Figure 3: Local Whittle with noise (LWN) estimates of the adjusted log-squared returns (LWN(sqr)),
adjusted log-absolute returns (LWN(abs)), and detrended log trading volume (LWN(trad)) for the 3M
common stock.
Table 1: Summary statistics for returns measured in continuously compounded rates for the common
stocks 3M, IBM, and AA.

Sample              Full       1       2       3       4       5       6       7       8
Panel A: 3M
Mean                0.03    0.04    0.04   -0.03    0.02    0.05    0.03    0.05    0.02
Std. Dev.           1.43    1.41    1.22    1.55    1.36    1.46    1.15    1.82    1.34
Skewness           -0.14    0.19    0.12    0.26    0.45   -2.03   -0.33    0.19   -0.22
Excess Kurtosis     6.45    4.76    2.13    1.51    1.54   26.08    4.65    2.41    5.12
Maximum            10.50    8.72    5.78    7.31    6.98    7.06    4.96   10.50    6.88
Minimum           -19.59   -9.58   -6.81   -5.85   -4.79  -19.59   -9.04  -10.08   -9.38
LB(10)             51.00   17.10   38.63   51.48   23.57   12.04   25.32   15.95    9.98
                  (0.00)  (0.07)  (0.00)  (0.00)  (0.01)  (0.28)  (0.01)  (0.10)  (0.44)
LB(20)             69.92   29.13   42.39   71.76   37.46   20.74   38.69   31.91   28.26
                  (0.00)  (0.09)  (0.00)  (0.00)  (0.01)  (0.41)  (0.01)  (0.04)  (0.10)
Panel B: IBM
Mean                0.03    0.09    0.01    0.01    0.03   -0.01   -0.01    0.11   -0.01
Std. Dev.           1.60    1.17    1.28    1.46    1.45    1.31    1.68    2.47    1.58
Skewness            0.05    0.16    0.21    0.36    0.46   -1.03    0.02   -0.04    0.05
Excess Kurtosis     7.56    1.30    1.79    4.12    1.53    8.99    6.09    5.15    7.07
Maximum            12.37    1.29    6.79    9.86    8.15    4.56   11.09   12.37   10.66
Minimum           -16.89   -4.22   -5.49   -9.13   -4.74  -13.35  -11.36  -16.89  -10.67
LB(10)              7.78   10.93   34.39    7.19   23.85   10.17    8.33   13.31   22.06
                  (0.65)  (0.36)  (0.00)  (0.71)  (0.01)  (0.43)  (0.59)  (0.21)  (0.02)
LB(20)             37.69   23.26   52.90   25.54   32.05   18.51   17.51   28.51   42.59
                  (0.01)  (0.28)  (0.00)  (0.18)  (0.04)  (0.55)  (0.62)  (0.10)  (0.00)
Panel C: AA
Mean                0.02    0.02    0.00    0.01    0.01    0.05    0.04    0.09   -0.03
Std. Dev.           1.85    1.35    1.64    1.92    1.87    1.83    1.64    2.34    2.04
Skewness           -0.18    0.09    0.22   -0.55    0.37   -2.36    0.26    0.43   -0.16
Excess Kurtosis     7.08    0.74    0.95    5.49    1.11   36.39    1.19    2.85    2.29
Maximum            13.15    4.95    6.64    7.82    7.76    7.47    8.12   13.15    8.67
Minimum           -27.29   -5.12   -6.21  -12.59   -5.49  -27.29   -5.72  -10.55  -11.66
LB(10)             78.33   31.79   30.80   32.81   27.42   33.98    9.04   20.39   14.52
                  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.53)  (0.03)  (0.15)
LB(20)             95.08   38.27   45.06   53.93   34.97   50.37   24.89   44.00   25.44
                  (0.00)  (0.01)  (0.00)  (0.00)  (0.00)  (0.00)  (0.21)  (0.00)  (0.19)

Note: Summary statistics are based on continuously compounded percentage returns, corrected for
dividends and stock splits, multiplied by 100, i.e. $R_{jt} = \log(P_{jt}/P_{j,t-1})$ for $j$ = 3M, IBM, AA.
Full sample covers observations from July 2 1962 through December 31 2006 and consists of 11,187
observations. The subsamples are: 1: 1962.07-1968.01, 2: 1968.02-1973.04, 3: 1973.10-1979.03, 4:
1979.04-1984.09, 5: 1984.10-1990.04, 6: 1990.05-1995.11, 7: 1995.12-2001.05, and 8: 2001.06-2006.12.
All subsamples have 1398 observations. The Ljung-Box Portmanteau statistic (LB) tests for 10th and
20th order autocorrelation in returns. p-values are provided in parentheses.
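As an illustration of how the entries in a table of this kind can be computed, the sketch below gives percentage log returns, excess kurtosis, and the Ljung-Box statistic $Q(h) = n(n+2)\sum_{k=1}^{h} \hat{\rho}_k^2/(n-k)$ in plain Python. The function names are hypothetical, and the use of population (divide-by-$n$) moments for the kurtosis is an assumption, since the estimator behind the reported figures is not stated. The p-values would follow from the $\chi^2_h$ distribution and are not computed here.

```python
import math


def pct_log_returns(prices):
    """Continuously compounded percentage returns, 100 * log(P_t / P_{t-1})."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def autocorrelation(x, k):
    """Sample autocorrelation at lag k (mean-adjusted, normalized by the lag-0 sum)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return ck / c0


def ljung_box(x, lags):
    """Ljung-Box portmanteau statistic for autocorrelation up to order `lags`."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, lags + 1)
    )


def excess_kurtosis(x):
    """Excess kurtosis m4 / m2^2 - 3 based on population (divide-by-n) moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2 - 3.0
```

For a return series `r`, `ljung_box(r, 10)` and `ljung_box(r, 20)` correspond to the LB(10) and LB(20) rows.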
Table 2: Summary statistics for detrended log turnover ratio for the common stocks 3M, IBM, and
AA.

Sample              Full       1       2       3       4       5       6       7       8
Panel A: 3M
Mean                0.00    0.09   -0.22   -0.10    0.18    0.39   -0.37   -0.12    0.14
Std. Dev.           0.55    0.46    0.57    0.49    0.53    0.49    0.51    0.46    0.44
Skewness            0.26    0.44    0.83    0.21    0.16   -0.04    0.54    0.48    0.51
Excess Kurtosis     0.54    1.18    2.64    0.30    0.44    0.37    0.61    1.27    1.31
Maximum             3.46    2.13    3.46    1.65    2.20    2.19    1.86    2.42    2.14
Minimum            -2.40   -1.62   -1.97   -1.89   -2.40   -1.80   -1.96   -1.85   -1.52
LB(10)             21106    1080     804    1166     912    1609    1332    2486    2783
                  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)
LB(20)             35544    1229    1342    1580    1260    2563    2201    3902    4121
                  (0.00)  (0.09)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)
Panel B: IBM
Mean                0.00   -0.03   -0.05    0.04    0.07    0.11   -0.12   -0.07    0.05
Std. Dev.           0.45    0.59    0.44    0.40    0.38    0.40    0.44    0.47    0.37
Skewness            0.26    0.58    0.56    0.38    0.27    0.06    0.45   -0.16    0.24
Excess Kurtosis     1.46    0.32    0.82    0.53   -0.01    0.00    1.07    5.29    1.49
Maximum             2.05    1.87    1.98    1.73    1.30    1.33    1.81    2.05    1.96
Minimum            -4.15   -2.01   -1.20   -1.26   -1.06   -1.17   -1.66   -4.15   -1.54
LB(10)             18366    4513    1516    2066    1234    1769    1311    1467    2301
                  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)
LB(20)             25875    7263    1876    2675    1471    2426    1608    2054    2848
                  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)
Panel C: AA
Mean                0.00    0.01   -0.03   -0.01   -0.06    0.23   -0.07   -0.08    0.00
Std. Dev.           0.59    0.58    0.69    0.66    0.67    0.58    0.52    0.46    0.41
Skewness            0.05    0.20    0.28   -0.20   -0.02   -0.18    0.03    0.10    0.44
Excess Kurtosis     0.68    1.27    0.43    0.06    0.25    0.43   -0.01    0.67    0.97
Maximum             2.98    2.65    2.59    1.93    2.98    2.77    1.66    1.74    1.98
Minimum            -3.47   -3.47   -2.18   -2.61   -2.37   -2.01   -1.81   -2.00   -1.46
LB(10)              8548    1357     596     489    1554    1151     849    1093    2249
                  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)
LB(20)             11485    1886     807     522    2025    1480    1051    1240    3521
                  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)  (0.00)

Note: Summary statistics are based on the detrended turnover ratio for the shares 3M, IBM, and AA.
Full sample covers observations from July 2 1962 through December 31 2006 and consists of 11,187
observations. The subsamples are: 1: 1962.07-1968.01, 2: 1968.02-1973.04, 3: 1973.10-1979.03, 4:
1979.04-1984.09, 5: 1984.10-1990.04, 6: 1990.05-1995.11, 7: 1995.12-2001.05, and 8: 2001.06-2006.12.
All subsamples have 1398 observations. The Ljung-Box Portmanteau statistic (LB) tests for 10th and
20th order autocorrelation in returns. p-values are provided in parentheses.
Table 3: Estimated fractional integration parameter for the daily adjusted log-squared returns,
adjusted log-absolute returns, and detrended turnover ratio for the common stocks 3M, IBM, and AA.
Standard errors are given in parentheses below each estimate.

Stock                 3M         3M         3M        IBM        IBM        IBM         AA         AA         AA
Series        \tilde r^2 |\tilde r|          v \tilde r^2 |\tilde r|          v \tilde r^2 |\tilde r|          v
Panel A: $m = \lfloor n^{0.5} \rfloor$
d_LP               0.418      0.404      0.477      0.536      0.492      0.671      0.410      0.401      0.339
                 (0.062)    (0.062)    (0.062)    (0.062)    (0.062)    (0.062)    (0.062)    (0.062)    (0.062)
d_LW               0.447      0.406      0.527      0.485      0.452      0.641      0.427      0.378      0.349
                 (0.048)    (0.048)    (0.048)    (0.048)    (0.048)    (0.048)    (0.048)    (0.048)    (0.048)
d_LPW            0.486^a    0.481^a    0.626^a    0.570^a    0.541^a    0.583^a    0.475^a    0.486^a    0.383^a
                 (0.073)    (0.073)    (0.073)    (0.073)    (0.073)    (0.073)    (0.073)    (0.073)    (0.073)
d_LWN            0.479^b    0.511^b    0.811^b    0.616^b    0.608^b    0.562^b    0.577^b    0.659^b      0.352
                 (0.099)    (0.096)    (0.078)    (0.088)    (0.088)    (0.092)    (0.091)    (0.085)    (0.117)
d_LPWN(1,0)        0.479      0.510    0.808^b    0.616^b    0.607^b    0.519^a   0.693^ab    0.658^b      0.351
                 (0.149)    (0.144)    (0.118)    (0.132)    (0.133)    (0.143)    (0.125)    (0.128)    (0.177)
d_LPWN(0,1)        0.479      0.511    0.811^b    0.617^b    0.609^b      0.562    0.577^b    0.659^b    0.266^c
                 (0.196)    (0.192)    (0.180)    (0.185)    (0.186)    (0.188)    (0.187)    (0.183)    (0.247)
d_LPWN(1,1)        0.479      0.510    0.808^b      0.616      0.608    0.413^a      0.576    0.658^b      0.267
                 (0.294)    (0.289)    (0.270)    (0.278)    (0.279)    (0.307)    (0.281)    (0.275)    (0.370)
Panel B: $m = \lfloor n^{0.6} \rfloor$
d_LP               0.344      0.302      0.481      0.386      0.289      0.570      0.303      0.240      0.378
                 (0.039)    (0.039)    (0.039)    (0.039)    (0.039)    (0.039)    (0.039)    (0.039)    (0.039)
d_LW               0.335      0.283      0.475      0.391      0.336      0.523      0.314      0.254      0.367
                 (0.030)    (0.030)    (0.030)    (0.030)    (0.030)    (0.030)    (0.030)    (0.030)    (0.030)
d_LPW            0.411^a    0.362^a    0.528^a    0.477^a    0.438^a    0.574^a    0.412^a    0.368^a    0.331^a
                 (0.045)    (0.045)    (0.045)    (0.045)    (0.045)    (0.045)    (0.045)    (0.045)    (0.045)
d_LWN            0.564^b    0.588^b    0.611^b    0.610^b    0.635^b    0.655^b    0.624^b    0.709^b    0.306^b
                 (0.057)    (0.056)    (0.055)    (0.055)    (0.054)    (0.053)    (0.055)    (0.052)    (0.080)
d_LPWN(1,0)        0.561    0.583^b   0.829^ab      0.605    0.629^b    0.688^a    0.618^b   0.700^ab    0.391^a
                 (0.086)    (0.085)    (0.073)    (0.083)    (0.082)    (0.079)    (0.082)    (0.078)    (0.104)
d_LPWN(0,1)      0.564^b    0.588^b    0.611^b    0.610^b    0.635^b      0.655    0.624^b   0.709^bc      0.371
                 (0.118)    (0.117)    (0.116)    (0.116)    (0.116)    (0.115)    (0.116)    (0.114)    (0.133)
d_LPWN(1,1)        0.561      0.583  0.837^abc      0.605      0.629      0.690    0.618^b    0.701^b      0.321
                 (0.178)    (0.176)    (0.170)    (0.175)    (0.174)    (0.172)    (0.174)    (0.172)    (0.212)
Panel C: $m = \lfloor n^{0.7} \rfloor$
d_LP               0.204      0.156      0.394      0.261      0.183      0.477      0.186      0.133      0.364
                 (0.024)    (0.024)    (0.024)    (0.024)    (0.024)    (0.024)    (0.024)    (0.024)    (0.024)
dLW 0:228(0:019)
0:184(0:019)
0:418(0:019)
0:273(0:019)
0:232(0:019)
0:460(0:019)
0:230(0:019)
0:177(0:019)
0:355(0:019)
dLPW 0:322a(0:028)
0:261a(0:028)
0:468a(0:028)
0:389a(0:028)
0:329a(0:028)
0:512a(0:028)
0:308a(0:028)
0:259a(0:028)
0:342a(0:028)
dLWN 0:609b(0:034)
0:628b(0:034)
0:547b(0:036)
0:650b(0:033)
0:652b(0:033)
0:579b(0:035)
0:589b(0:035)
0:671b(0:033)
0:343(0:047)
dLPWN(1;0) 0:548(0:054)
0:544(0:055)
0:633(0:051)
0:572(0:053)
0:592(0:052)
0:660(0:050)
0:615b(0:052)
0:564ab(0:054)
0:324(0:073)
dLPWN(0;1) 0:593bc(0:073)
0:606bc(0:073)
0:602(0:073)
0:622bc(0:073)
0:638bc(0:072)
0:628b(0:073)
0:575b(0:074)
0:646bc(0:072)
0:406(0:081)
dLPWN(1;1) 0:562(0:111)
0:608(0:110)
0:624(0:109)
0:592(0:110)
0:614(0:110)
0:651(0:109)
0:612(0:110)
0:661b(0:108)
0:316(0:134)
Notes: Asymptotic standard errors are in parentheses. a, b, and c denotes signi�cance at a 5% level
for the polynomial coe¢ cients �y; �w;0; and �w;1, respectively.
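The bandwidths and the first three rows of standard errors in each panel can be checked with a short calculation. The sketch below rests on assumptions that are not stated in the table itself: that the return series have the same length as the turnover series (n = 11,187), and that the LP, LW, and LPW rows use the standard asymptotic variances of sqrt(m)(d_hat - d), namely pi^2/24 for log-periodogram regression, 1/4 for local Whittle, and 9/16 for local polynomial Whittle with a first-degree polynomial. All names are illustrative.

```python
import math


def bandwidth(n, exponent):
    """Number of Fourier frequencies, taken as the integer part of n**exponent."""
    return math.floor(n ** exponent)


# Assumed asymptotic variances of sqrt(m) * (d_hat - d); textbook values, not quoted from the thesis.
ASYMPTOTIC_VARIANCE = {
    "LP": math.pi ** 2 / 24,  # log-periodogram regression
    "LW": 1 / 4,              # local Whittle
    "LPW": 9 / 16,            # local polynomial Whittle, first-degree polynomial
}


def asymptotic_se(estimator, m):
    """Asymptotic standard error for the given estimator and bandwidth m."""
    return math.sqrt(ASYMPTOTIC_VARIANCE[estimator] / m)


n = 11187  # assumed sample size, taken from the note to the summary statistics
for exponent in (0.5, 0.6, 0.7):
    m = bandwidth(n, exponent)
    errors = {name: round(asymptotic_se(name, m), 4) for name in ASYMPTOTIC_VARIANCE}
    print(exponent, m, errors)
```

Under these assumptions the computed values agree with the standard errors in parentheses up to the third decimal, which suggests, but does not prove, that these are the formulas behind the table.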
Table 4: Mean and median of the 45 estimated fractional integration parameters for adjusted log-squared returns, adjusted log-absolute returns, and detrended turnover ratio for the SP100 composite index.

              Mean (d̄)                      Median (d)
              ~r^2      |~r|      v         ~r^2      |~r|      v
Panel A: m = ⌊n^0.5⌋
dLP           0.440     0.391     0.474     0.423     0.394     0.483
dLW           0.442     0.395     0.493     0.445     0.395     0.500
dLPW          0.502     0.492     0.511     0.507     0.508     0.528
dLWN          0.585     0.665     0.544     0.589     0.664     0.555
dLPWN(1,0)    0.603     0.663     0.562     0.614     0.674     0.573
dLPWN(0,1)    0.592     0.666     0.552     0.609     0.664     0.572
dLPWN(1,1)    0.667     0.674     0.544     0.609     0.672     0.529
Panel B: m = ⌊n^0.6⌋
dLP           0.333     0.273     0.430     0.336     0.277     0.434
dLW           0.343     0.293     0.440     0.346     0.294     0.439
dLPW          0.429     0.383     0.473     0.429     0.381     0.485
dLWN          0.597     0.637     0.526     0.606     0.635     0.545
dLPWN(1,0)    0.610     0.646     0.604     0.617     0.644     0.597
dLPWN(0,1)    0.605     0.646     0.568     0.606     0.636     0.598
dLPWN(1,1)    0.604     0.656     0.612     0.611     0.656     0.608
Panel C: m = ⌊n^0.7⌋
dLP           0.225     0.175     0.385     0.222     0.172     0.385
dLW           0.249     0.205     0.397     0.246     0.201     0.401
dLPW          0.330     0.279     0.429     0.327     0.275     0.434
dLWN          0.591     0.618     0.484     0.579     0.619     0.496
dLPWN(1,0)    0.607     0.631     0.552     0.610     0.618     0.556
dLPWN(0,1)    0.591     0.617     0.533     0.602     0.627     0.541
dLPWN(1,1)    0.609     0.636     0.539     0.604     0.622     0.553

Notes: d̄ and d denote the mean and median, respectively.
Table 5: Range of memory estimates for the adjusted log-squared returns, adjusted log-absolute returns, and the detrended log turnover ratio.

            ~r^2 min        ~r^2 max        |~r| min        |~r| max        v min           v max
m = ⌊n^0.5⌋
LP          .285 (.062)     .592 (.062)     .214 (.062)     .539 (.062)     .287 (.062)     .671 (.062)
LW          .314 (.048)     .553 (.048)     .271 (.048)     .521 (.048)     .317 (.048)     .641 (.048)
LPW         .326 (.073)     .687 (.073)     .299 (.073)     .667 (.073)     .275 (.073)     .706 (.073)
LWN         .334 (.122)     .828 (.078)     .401 (.110)     .938 (.075)     .188 (.179)     .849 (.078)
LPWN(1,0)   .366 (.173)     .827 (.117)     .325 (.185)     .937 (.112)     .028 (1.380)    .848 (.116)
LPWN(0,1)   .341 (.220)     .828 (.180)     .401 (.207)     .938 (.180)     .016 (2.385)    .849 (.180)
LPWN(1,1)   .313 (.343)     .827 (.270)     .361 (.323)     .938 (.270)     .018 (3.196)    .848 (.270)
m = ⌊n^0.6⌋
LP          .213 (.039)     .423 (.039)     .189 (.039)     .369 (.039)     .292 (.039)     .570 (.039)
LW          .283 (.030)     .422 (.030)     .199 (.030)     .373 (.030)     .292 (.030)     .537 (.030)
LPW         .335 (.045)     .519 (.045)     .299 (.045)     .483 (.045)     .270 (.045)     .582 (.045)
LWN         .409 (.068)     .799 (.049)     .437 (.065)     .900 (.047)     .221 (.099)     .693 (.052)
LPWN(1,0)   .324 (.116)     .790 (.074)     .368 (.108)     .895 (.071)     .296 (.123)     .862 (.072)
LPWN(0,1)   .409 (.129)     .799 (.113)     .437 (.126)     .916 (.113)     .320 (.142)     .722 (.114)
LPWN(1,1)   .377 (.199)     .790 (.171)     .446 (.188)     .895 (.170)     .298 (.220)     .880 (.170)
m = ⌊n^0.7⌋
LP          .176 (.024)     .293 (.024)     .114 (.024)     .243 (.024)     .300 (.024)     .477 (.024)
LW          .189 (.019)     .299 (.019)     .138 (.019)     .257 (.019)     .301 (.019)     .467 (.019)
LPW         .273 (.028)     .397 (.028)     .198 (.028)     .354 (.028)     .292 (.028)     .535 (.028)
LWN         .472 (.039)     .812 (.030)     .446 (.040)     .906 (.029)     .296 (.051)     .613 (.034)
LPWN(1,0)   .414 (.063)     .914 (.044)     .456 (.060)     .989 (.043)     .316 (.074)     .838 (.045)
LPWN(0,1)   .432 (.079)     .795 (.071)     .450 (.078)     .887 (.071)     .258 (.098)     .701 (.072)
LPWN(1,1)   .428 (.120)     .926 (.107)     .453 (.117)     .987 (.107)     .193 (.174)     .826 (.107)

Notes: Table is based on the common stocks where we observe the full sample of daily observations, i.e. 45 shares. The standard error of each minimum and maximum estimate is given in parentheses.
Table 6: Rejection frequency of the null that d = 0 and d = 1 for the adjusted log-squared returns, adjusted log-absolute returns, and the detrended log turnover ratio.

            ~r^2          |~r|          v
            d=0    d=1    d=0    d=1    d=0    d=1
m = ⌊n^0.5⌋
LP          45     45     45     45     45     45
LW          45     45     45     45     45     45
LPW         45     45     45     45     45     45
LWN         45     45     45     44     43     43
LPWN(1,0)   45     44     45     38     41     39
LPWN(0,1)   44     38     45     25     40     36
LPWN(1,1)   39     11     41     6      29     21
m = ⌊n^0.6⌋
LP          45     45     45     45     45     45
LW          45     45     45     45     45     45
LPW         45     45     45     45     45     45
LWN         45     45     45     45     45     45
LPWN(1,0)   45     45     45     44     45     45
LPWN(0,1)   45     45     45     41     45     45
LPWN(1,1)   45     40     45     33     43     34
m = ⌊n^0.7⌋
LP          45     45     45     45     45     45
LW          45     45     45     45     45     45
LPW         45     45     45     45     45     45
LWN         45     45     45     45     45     45
LPWN(1,0)   45     45     45     44     45     45
LPWN(0,1)   45     45     45     44     45     45
LPWN(1,1)   45     44     45     42     44     44

Notes: Table is based on the common stocks where we observe the full sample of daily observations, i.e. 45 shares. Rejection of the null is based on the one-sided alternative at a 5% significance level.
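The rejection counts above come from applying two one-sided tests to each estimate. A minimal sketch of one such pair of tests follows. It assumes the alternatives are d > 0 for the null d = 0 and d < 1 for the null d = 1, and that the tests use the asymptotic normal approximation with the reported standard error; the note states only that the alternative is one-sided. The function name is illustrative.

```python
from statistics import NormalDist


def one_sided_tests(d_hat, se, level=0.05):
    """Return (reject d = 0, reject d = 1) for one estimate and its standard error."""
    z = NormalDist().inv_cdf(1 - level)    # upper-tail normal critical value
    reject_d0 = d_hat / se > z             # H0: d = 0 against d > 0 (assumed direction)
    reject_d1 = (d_hat - 1.0) / se < -z    # H0: d = 1 against d < 1 (assumed direction)
    return reject_d0, reject_d1


# LPWN(1,1) estimate for the 3M log-squared returns at m = n^0.5 (Table 3)
print(one_sided_tests(0.479, 0.294))
```

Counting how often each flag is true across the 45 shares would give one cell of the table, provided the assumed test directions match those used in the thesis.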
Table 7: Results for testing long memory against spurious long memory using the sample splitting methodology for the 3M common stock.

m     dLW      d̄LW               W_S^(LW)          dLWN     d̄LWN              W_S^(LWN)
               K = 2    K = 4    K = 2    K = 4             K = 2    K = 4    K = 2    K = 4
Panel A: ~r^2
400   0.276    0.273    0.270    0.699    1.978    0.614    0.658    0.757    1.554    1.179
600   0.243    0.242    0.239    1.026    3.215    0.595    0.631    0.735    1.416    1.339
800   0.223    0.221    0.215    1.479    1.421    0.574    0.609    0.659    1.139    3.012
1000  0.204    0.201    0.197    0.890    0.459    0.569    0.605    0.648    1.499    3.908
Panel B: |~r|
400   0.225    0.220    0.216    2.520    3.395    0.655    0.693    0.744    1.299    0.615
600   0.197    0.193    0.189    1.977    3.125    0.631    0.663    0.730    1.934    0.950
800   0.183    0.181    0.177    2.791    1.972    0.589    0.621    0.631    2.109    4.081
1000  0.168    0.165    0.160    3.115    1.699    0.574    0.608    0.651    2.028    2.884
Panel C: v
400   0.459    0.459    0.490    1.416    3.706    0.575    0.588    0.637    2.269    1.712
600   0.425    0.423    0.450    0.928    1.313    0.564    0.574    0.610    2.690    6.220
800   0.401    0.401    0.423    0.693    1.460    0.559    0.565    0.611    2.531    6.575
1000  0.387    0.386    0.405    1.528    1.166    0.543    0.548    0.599    1.580    9.071*

Notes: * denotes rejection of the null at the 5% level. W_S is χ² distributed with critical values
χ²_0.95(1) = 3.84 and χ²_0.95(3) = 7.82.
Table 8: Results for testing long memory against spurious long memory using the sample splitting methodology for the IBM common stock.

m     dLW      d̄LW               W_S^(LW)          dLWN     d̄LWN              W_S^(LWN)
               K = 2    K = 4    K = 2    K = 4             K = 2    K = 4    K = 2    K = 4
Panel A: ~r^2
400   0.344    0.342    0.329    0.404    2.221    0.606    0.622    0.647    0.203    3.827
600   0.292    0.287    0.274    0.848    5.306    0.629    0.649    0.673    0.057    2.094
800   0.264    0.257    0.248    1.055    2.876    0.627    0.649    0.660    0.033    3.760
1000  0.253    0.246    0.235    0.845    7.289    0.595    0.613    0.626    0.105    4.295
Panel B: |~r|
400   0.289    0.281    0.263    1.765    5.073    0.636    0.655    0.648    0.030    7.350
600   0.245    0.237    0.219    3.045    7.820    0.644    0.662    0.675    0.006    6.283
800   0.227    0.218    0.204    4.619*   7.029    0.617    0.634    0.637    0.174    8.619*
1000  0.215    0.207    0.193    2.884    11.617*  0.586    0.600    0.602    0.003    7.534
Panel C: v
400   0.481    0.460    0.460    2.481    12.192*  0.634    0.654    0.581    0.572    6.744
600   0.471    0.454    0.451    6.819*   18.927*  0.569    0.560    0.543    0.092    5.353
800   0.463    0.445    0.449    6.185*   21.157*  0.535    0.523    0.495    1.001    7.695
1000  0.449    0.433    0.437    6.374*   22.885*  0.532    0.516    0.494    1.623    9.602*

Notes: * denotes rejection of the null at the 5% level. W_S is χ² distributed with critical values
χ²_0.95(1) = 3.84 and χ²_0.95(3) = 7.82.
Table 9: Results for testing long memory against spurious long memory using the sample splitting methodology for the AA common stock.

m     dLW      d̄LW               W_S^(LW)          dLWN     d̄LWN              W_S^(LWN)
               K = 2    K = 4    K = 2    K = 4             K = 2    K = 4    K = 2    K = 4
Panel A: ~r^2
400   0.275    0.273    0.253    1.782    3.556    0.626    0.660    0.754    0.182    1.516
600   0.241    0.238    0.218    1.443    5.091    0.605    0.629    0.660    0.008    0.698
800   0.210    0.207    0.190    2.096    4.378    0.619    0.642    0.649    0.051    1.789
1000  0.195    0.190    0.174    1.231    2.120    0.604    0.628    0.619    0.028    2.173
Panel B: |~r|
400   0.227    0.218    0.197    2.901    4.880    0.676    0.706    0.768    0.042    2.082
600   0.188    0.180    0.161    3.983*   5.867    0.679    0.706    0.676    0.055    4.625
800   0.163    0.156    0.139    5.999*   5.649    0.689    0.713    0.657    0.265    5.714
1000  0.157    0.149    0.133    3.850*   2.664    0.640    0.661    0.595    0.001    9.015*
Panel C: v
400   0.342    0.349    0.352    0.356    2.410    0.386    0.398    0.413    0.109    1.910
600   0.341    0.343    0.341    0.053    3.382    0.377    0.392    0.404    0.703    0.643
800   0.348    0.350    0.348    0.041    1.789    0.348    0.361    0.381    0.140    1.571
1000  0.344    0.347    0.345    0.030    1.720    0.356    0.364    0.386    0.001    0.617

Notes: * denotes rejection of the null at the 5% level. W_S is χ² distributed with critical values
χ²_0.95(1) = 3.84 and χ²_0.95(3) = 7.82.
Table 10: Rejection frequency of the null that the adjusted log-squared returns, adjusted log-absolute returns, and detrended log turnover ratios are long memory processes using the sample splitting methodology.

m     W_S^(LW)          W_S^(LWN)
      K = 2    K = 4    K = 2    K = 4
Panel A: ~r^2
400   7        10       5        4
600   7        13       4        3
800   8        17       2        2
1000  7        18       3        2
Panel B: |~r|
400   5        14       5        4
600   10       20       4        6
800   11       22       2        7
1000  10       22       2        8
Panel C: v
400   8        6        0        4
600   7        6        0        7
800   7        10       1        10
1000  6        10       3        10

Note: Table is based on the common stocks where we observe the full sample of daily observations,
i.e. 45 shares.
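The decision rule behind the starred entries in Tables 7 to 9, and the counts in Table 10, can be sketched as follows. The critical values are those quoted in the table notes. Pairing K = 2 with one degree of freedom and K = 4 with three (that is, K - 1) is an assumption inferred from those notes, and the function names are illustrative.

```python
# 5% chi-square critical values from the table notes, keyed by the number of subsamples K
# (assumed to correspond to K - 1 degrees of freedom).
CRITICAL_VALUES = {2: 3.84, 4: 7.82}


def rejects_long_memory(wald_stat, k):
    """True if the sample-splitting Wald statistic exceeds the 5% critical value."""
    return wald_stat > CRITICAL_VALUES[k]


def rejection_frequency(wald_stats, k):
    """Number of series for which the null of long memory is rejected."""
    return sum(rejects_long_memory(w, k) for w in wald_stats)


# W_S^(LW) for the turnover ratio at m = 1000 with K = 2: 3M, IBM, AA (Tables 7 to 9)
print(rejection_frequency([1.528, 6.374, 0.030], k=2))
```

Applied to all 45 shares rather than the three stocks used here, the same count would correspond to one cell of Table 10.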
Table 11: LW estimates of the fractional integration orders, joint test of pairwise equality, rejection frequency of the joint test of pairwise equality, and estimated eigenvalues for the 3M, IBM, and AA common stocks for the adjusted log-squared return and detrended log turnover ratio.

                 3M                 IBM                AA
Panel A: LW estimates of d
m = ⌊n^0.5⌋
  ~r^2           0.447              0.485              0.427
                 (0.048)            (0.048)            (0.048)
  v              0.527              0.641              0.349
                 (0.048)            (0.048)            (0.048)
  T12(h1)        1.55               5.37               0.63
  T12(h2)        1.84               6.39               0.75
m = ⌊n^0.6⌋
  ~r^2           0.335              0.391              0.314
                 (0.030)            (0.030)            (0.030)
  v              0.475              0.523              0.367
                 (0.030)            (0.030)            (0.030)
  T12(h1)        8.61*              22.75*             1.42
  T12(h2)        10.26*             27.04*             1.69

Panel B: Rejection frequency of H12
                 T12(h1)   T12(h2)
m = ⌊n^0.50⌋     15        18
m = ⌊n^0.60⌋     36        38

Panel C: Estimated eigenvalues for 10,000 × G(d̄*)
                 3M                 IBM                AA
                 δ1       δ2        δ1       δ2        δ1       δ2
m1 = ⌊n^0.45⌋    2583     50.67     1953     24.96     3868     313
m1 = ⌊n^0.55⌋    4209     97.88     2787     40.91     5264     274

Panel D: Estimated eigenvalues for P(d̄*)
                 3M                 IBM                AA
                 δ1       δ2        δ1       δ2        δ1       δ2
m1 = ⌊n^0.45⌋    1.06     0.94      1.11     0.89      1.04     0.96
m1 = ⌊n^0.55⌋    1.14     0.85      1.08     0.91      1.11     0.88

Notes: * denotes rejection of the null at the 5% level. T12(hi) is the joint test of pairwise equality of the
integration level, where hi for i = 1, 2 is a bandwidth choice equal to h1 = log^-1 n and h2 = log^-2 n,
respectively. δi for i = 1, 2 is the ith eigenvalue of 10,000 × G(d̄*) and P(d̄*).
Table 12: LW estimates of the fractional integration orders, joint test of pairwise equality, rejection frequency of the joint test of pairwise equality, and estimated eigenvalues for the 3M, IBM, and AA common stocks for the adjusted log-absolute return and detrended log turnover ratio.

                 3M                 IBM                AA
Panel A: LW estimates of d
m = ⌊n^0.5⌋
  |~r|           0.406              0.462              0.361
                 (0.048)            (0.048)            (0.048)
  v              0.527              0.672              0.359
                 (0.048)            (0.048)            (0.048)
  T12(h1)        3.31*              7.64*              0.00
  T12(h2)        3.94*              9.07*              0.01
m = ⌊n^0.6⌋
  |~r|           0.283              0.336              0.254
                 (0.030)            (0.030)            (0.030)
  v              0.475              0.523              0.367
                 (0.030)            (0.030)            (0.030)
  T12(h1)        16.45*             35.57*             6.27*
  T12(h2)        19.62*             42.30*             7.46*

Panel B: Rejection frequency of H12
                 T12(h1)   T12(h2)
m = ⌊n^0.50⌋     30        31
m = ⌊n^0.60⌋     41        41

Panel C: Estimated eigenvalues for 10,000 × G(d̄*)
                 3M                 IBM                AA
                 δ1       δ2        δ1       δ2        δ1       δ2
m1 = ⌊n^0.45⌋    1147     50.37     772      25.16     2012     312
m1 = ⌊n^0.55⌋    2064     97.07     1351     40.77     2793     273

Panel D: Estimated eigenvalues for P(d̄*)
                 3M                 IBM                AA
                 δ1       δ2        δ1       δ2        δ1       δ2
m1 = ⌊n^0.45⌋    1.09     0.90      1.06     0.94      1.05     0.94
m1 = ⌊n^0.55⌋    1.16     0.83      1.10     0.89      1.12     0.87

Notes: * denotes rejection of the null at the 5% level. T12(hi) is the joint test of pairwise equality of the
integration level, where hi for i = 1, 2 is a bandwidth choice equal to h1 = log^-1 n and h2 = log^-2 n,
respectively. δi for i = 1, 2 is the ith eigenvalue of 10,000 × G(d̄*) and P(d̄*).
Table 13: Rank estimates for the 3M, IBM, and AA common stocks for the adjusted log-squared returns and detrended log turnover ratio.

v(n) =           m1^-0.45  m1^-0.35  m1^-0.25  m1^-0.15  m1^-0.05
Panel A: 3M
m1 = ⌊n^0.45⌋
  L(0)           -1.69     -1.54     -1.29     -0.93     -0.38
  L(1)           -0.91     -0.83     -0.71     -0.53     -0.25
  r              0         0         0         0         0
m1 = ⌊n^0.55⌋
  L(0)           -1.80     -1.66     -1.44     -1.07     -0.45
  L(1)           -1.04     -0.97     -0.86     -0.67     -0.36
  r              0         0         0         0         0
Panel B: IBM
m1 = ⌊n^0.45⌋
  L(0)           -1.69     -1.54     -1.29     -0.93     -0.37
  L(1)           -0.95     -0.87     -0.75     -0.57     -0.29
  r              0         0         0         0         0
m1 = ⌊n^0.55⌋
  L(0)           -1.80     -1.66     -1.44     -1.07     -0.45
  L(1)           -0.98     -0.92     -0.81     -0.62     -0.31
  r              0         0         0         0         0
Panel C: AA
m1 = ⌊n^0.45⌋
  L(0)           -1.69     -1.54     -1.29     -0.93     -0.38
  L(1)           -0.89     -0.81     -0.69     -0.50     -0.23
  r              0         0         0         0         0
m1 = ⌊n^0.55⌋
  L(0)           -1.80     -1.66     -1.44     -1.07     -0.45
  L(1)           -1.01     -0.94     -0.83     -0.64     -0.33
  r              0         0         0         0         0

Notes: L(u) denotes the value of the criteria function for u = 0, 1. r is the estimated rank of P(d̄*).
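The link between the eigenvalues in Table 11 and the criterion values above can be illustrated with a short calculation. The sketch assumes a model-selection criterion of the form L(u) = v(n)(p - u) - (sum of the p - u largest eigenvalues), minimised over u = 0, ..., p - 1, and a sample size of n = 11,187. Neither is stated in the table notes, so both should be read as assumptions, and the function name is illustrative.

```python
import math


def rank_estimate(eigenvalues, m1, exponent):
    """Return (r_hat, [L(0), ..., L(p-1)]) under the assumed criterion."""
    delta = sorted(eigenvalues, reverse=True)   # eigenvalues in descending order
    p = len(delta)
    v = m1 ** (-exponent)                       # penalty v(n) = m1^(-exponent)
    criteria = [v * (p - u) - sum(delta[: p - u]) for u in range(p)]
    r_hat = min(range(p), key=lambda u: criteria[u])
    return r_hat, criteria


n = 11187                        # assumed sample size
m1 = math.floor(n ** 0.45)       # bandwidth used for the eigenvalue estimates
# Eigenvalues of P for 3M with m1 = n^0.45 (Table 11, Panel D)
r_hat, L = rank_estimate([1.06, 0.94], m1, 0.45)
print(m1, r_hat, [round(value, 3) for value in L])
```

With these inputs the sketch gives values close to the first column of Panel A (L(0) near -1.69, L(1) near -0.91) and an estimated rank of zero. Small differences are to be expected because the eigenvalues are rounded and the sample size is assumed.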
Table 14: Rank estimates for the 3M, IBM, and AA common stocks for the adjusted log-absolute returns and detrended log turnover ratio.

v(n) =           m1^-0.45  m1^-0.35  m1^-0.25  m1^-0.15  m1^-0.05
Panel A: 3M
m1 = ⌊n^0.45⌋
  L(0)           -1.69     -1.53     -1.29     -0.93     -0.37
  L(1)           -0.94     -0.86     -0.74     -0.56     -0.28
  r              0         0         0         0         0
m1 = ⌊n^0.55⌋
  L(0)           -1.80     -1.66     -1.44     -1.07     -0.45
  L(1)           -1.06     -1.00     -0.88     -0.70     -0.39
  r              0         0         0         0         0
Panel B: IBM
m1 = ⌊n^0.45⌋
  L(0)           -1.69     -1.53     -1.29     -0.93     -0.37
  L(1)           -0.90     -0.82     -0.71     -0.52     -0.24
  r              0         0         0         0         0
m1 = ⌊n^0.55⌋
  L(0)           -1.80     -1.66     -1.44     -1.07     -0.45
  L(1)           -1.00     -0.93     -0.82     -0.64     -0.33
  r              0         0         0         0         0
Panel C: AA
m = ⌊n^0.45⌋
  L(0)           -1.69     -1.53     -1.29     -0.93     -0.37
  L(1)           -0.90     -0.82     -0.70     -0.52     -0.24
  r              0         0         0         0         0
m = ⌊n^0.55⌋
  L(0)           -1.80     -1.66     -1.44     -1.07     -0.45
  L(1)           -1.02     -0.95     -0.84     -0.66     -0.35
  r              0         0         0         0         0

Notes: L(u) denotes the value of the criteria function for u = 0, 1. r is the estimated rank of P(d̄*).
Table 15: Frequency of how many times we estimate 1 cointegrating relation.

v(n) =           m1^-0.45  m1^-0.35  m1^-0.25  m1^-0.15  m1^-0.05
Panel A: (~r^2, v)
m1 = ⌊n^0.45⌋    0         0         0         0         5
m1 = ⌊n^0.55⌋    0         0         0         0         3
Panel B: (|~r|, v)
m1 = ⌊n^0.45⌋    0         0         0         0         9
m1 = ⌊n^0.55⌋    0         0         0         0         6

Note: r is the estimated rank of P(d̄*).
SCHOOL OF ECONOMICS AND MANAGEMENT UNIVERSITY OF AARHUS - UNIVERSITETSPARKEN - BUILDING 1322
DK-8000 AARHUS C – TEL. +45 8942 1111 - www.econ.au.dk
PhD Theses:
1999-4 Philipp J.H. Schröder, Aspects of Transition in Central and Eastern Europe.
1999-5 Robert Rene Dogonowski, Aspects of Classical and Contemporary European Fiscal Policy Issues.
1999-6 Peter Raahauge, Dynamic Programming in Computational Economics.
1999-7 Torben Dall Schmidt, Social Insurance, Incentives and Economic Integration.
1999 Jørgen Vig Pedersen, An Asset-Based Explanation of Strategic Advantage.
1999 Bjarke Jensen, Five Essays on Contingent Claim Valuation.
1999 Ken Lamdahl Bechmann, Five Essays on Convertible Bonds and Capital Structure Theory.
1999 Birgitte Holt Andersen, Structural Analysis of the Earth Observation Industry.
2000-1 Jakob Roland Munch, Economic Integration and Industrial Location in Unionized Countries.
2000-2 Christian Møller Dahl, Essays on Nonlinear Econometric Time Series Modelling.
2000-3 Mette C. Deding, Aspects of Income Distributions in a Labour Market Perspective.
2000-4 Michael Jansson, Testing the Null Hypothesis of Cointegration.
2000-5 Svend Jespersen, Aspects of Economic Growth and the Distribution of Wealth.
2001-1 Michael Svarer, Application of Search Models.
2001-2 Morten Berg Jensen, Financial Models for Stocks, Interest Rates, and Options: Theory and Estimation.
2001-3 Niels C. Beier, Propagation of Nominal Shocks in Open Economies.
2001-4 Mette Verner, Causes and Consequences of Interruptions in the Labour Market.
2001-5 Tobias Nybo Rasmussen, Dynamic Computable General Equilibrium Models: Essays on Environmental Regulation and Economic Growth.
2001-6 Søren Vester Sørensen, Three Essays on the Propagation of Monetary Shocks in Open Economies.
2001-7 Rasmus Højbjerg Jacobsen, Essays on Endogenous Policies under Labor Union Influence and their Implications.
2001-8 Peter Ejler Storgaard, Price Rigidity in Closed and Open Economies: Causes and Effects.
2001 Charlotte Strunk-Hansen, Studies in Financial Econometrics.
2002-1 Mette Rose Skaksen, Multinational Enterprises: Interactions with the Labor Market.
2002-2 Nikolaj Malchow-Møller, Dynamic Behaviour and Agricultural Households in Nicaragua.
2002-3 Boriss Siliverstovs, Multicointegration, Nonlinearity, and Forecasting.
2002-4 Søren Tang Sørensen, Aspects of Sequential Auctions and Industrial Agglomeration.
2002-5 Peter Myhre Lildholdt, Essays on Seasonality, Long Memory, and Volatility.
2002-6 Sean Hove, Three Essays on Mobility and Income Distribution Dynamics.
2002 Hanne Kargaard Thomsen, The Learning organization from a management point of view - Theoretical perspectives and empirical findings in four Danish service organizations.
2002 Johannes Liebach Lüneborg, Technology Acquisition, Structure, and Performance in The Nordic Banking Industry.
2003-1 Carter Bloch, Aspects of Economic Policy in Emerging Markets.
2003-2 Morten Ørregaard Nielsen, Multivariate Fractional Integration and Cointegration.
2003 Michael Knie-Andersen, Customer Relationship Management in the Financial Sector.
2004-1 Lars Stentoft, Least Squares Monte-Carlo and GARCH Methods for American Options.
2004-2 Brian Krogh Graversen, Employment Effects of Active Labour Market Programmes: Do the Programmes Help Welfare Benefit Recipients to Find Jobs?
2004-3 Dmitri Koulikov, Long Memory Models for Volatility and High Frequency Financial Data Econometrics.
2004-4 René Kirkegaard, Essays on Auction Theory.
2004-5 Christian Kjær, Essays on Bargaining and the Formation of Coalitions.
2005-1 Julia Chiriaeva, Credibility of Fixed Exchange Rate Arrangements.
2005-2 Morten Spange, Fiscal Stabilization Policies and Labour Market Rigidities.
2005-3 Bjarne Brendstrup, Essays on the Empirical Analysis of Auctions.
2005-4 Lars Skipper, Essays on Estimation of Causal Relationships in the Danish Labour Market.
2005-5 Ott Toomet, Marginalisation and Discouragement: Regional Aspects and the Impact of Benefits.
2005-6 Marianne Simonsen, Essays on Motherhood and Female Labour Supply.
2005 Hesham Morten Gabr, Strategic Groups: The Ghosts of Yesterday when it comes to Understanding Firm Performance within Industries?
2005 Malene Shin-Jensen, Essays on Term Structure Models, Interest Rate Derivatives and Credit Risk.
2006-1 Peter Sandholt Jensen, Essays on Growth Empirics and Economic Development.
2006-2 Allan Sørensen, Economic Integration, Ageing and Labour Market Outcomes.
2006-3 Philipp Festerling, Essays on Competition Policy.
2006-4 Carina Sponholtz, Essays on Empirical Corporate Finance.
2006-5 Claus Thrane-Jensen, Capital Forms and the Entrepreneur – A contingency approach on new venture creation.
2006-6 Thomas Busch, Econometric Modeling of Volatility and Price Behavior in Asset and Derivative Markets.
2007-1 Jesper Bagger, Essays on Earnings Dynamics and Job Mobility.
2007-2 Niels Stender, Essays on Marketing Engineering.
2007-3 Mads Peter Pilkjær Harmsen, Three Essays in Behavioral and Experimental Economics.
2007-4 Juanna Schrøter Joensen, Determinants and Consequences of Human Capital Investments.
2007-5 Peter Tind Larsen, Essays on Capital Structure and Credit Risk.
2008-1 Toke Lilhauge Hjortshøj, Essays on Empirical Corporate Finance – Managerial Incentives, Information Disclosure, and Bond Covenants.
2008-2 Jie Zhu, Essays on Econometric Analysis of Price and Volatility Behavior in Asset Markets.
2008-3 David Glavind Skovmand, Libor Market Models - Theory and Applications.
2008-4 Martin Seneca, Aspects of Household Heterogeneity in New Keynesian Economics.
2008-5 Agne Lauzadyte, Active Labour Market Policies and Labour Market Transitions in Denmark: an Analysis of Event History Data.
2009-1 Christian Dahl Winther, Strategic timing of product introduction under heterogeneous demand.
2009-2 Martin Møller Andreasen, DSGE Models and Term Structure Models with Macroeconomic Variables.
2009-3 Frank Steen Nielsen, On the estimation of fractionally integrated processes.