35
Editor’s Note: The following article was the JBES Invited Address presented at the Joint Statistical Meetings, Minneapolis, Minnesota, August 7–11, 2005. Realized Variance and Market Microstructure Noise Peter R. HANSEN Department of Economics, Stanford University, 579 Serra Mall, Stanford, CA 94305-6072 ([email protected]) Asger LUNDE Department of Marketing and Statistics, Aarhus School of Business, Fuglesangs Alle 4, 8210 Aarhus V, Denmark We study market microstructure noise in high-frequency data and analyze its implications for the real- ized variance (RV) under a general specification for the noise. We show that kernel-based estimators can unearth important characteristics of market microstructure noise and that a simple kernel-based estimator dominates the RV for the estimation of integrated variance (IV). An empirical analysis of the Dow Jones Industrial Average stocks reveals that market microstructure noise is time-dependent and correlated with increments in the efficient price. This has important implications for volatility estimation based on high- frequency data. Finally, we apply cointegration techniques to decompose transaction prices and bid–ask quotes into an estimate of the efficient price and noise. This framework enables us to study the dynamic effects on transaction prices and quotes caused by changes in the efficient price. KEY WORDS: Bias correction; High-frequency data; Integrated variance; Market microstructure noise; Realized variance; Realized volatility; Sampling schemes. The great tragedy of Science—the slaying of a beautiful hypothesis by an ugly fact (Thomas H. Huxley, 1825–1895). 1. INTRODUCTION The presence of market microstructure noise in high-frequen- cy financial data complicates the estimation of financial volatil- ity and makes standard estimators, such as the realized variance (RV), unreliable. Thus, from the perspective of volatil- ity estimation, market microstructure noise is an “ugly fact” that challenges the validity of theoretical results that rely on the ab- sence of noise. Volatility estimation in the presence of market microstructure noise is currently a very active area of research. Interestingly, this literature was initiated by an article by Zhou (1996) that was published in this journal a decade ago and was in many ways 10 years ahead of its time. The best remedy for market microstructure noise depends on the properties of the noise, and the main purpose of this arti- cle is to unearth the empirical properties of market microstruc- ture noise. We use a number of kernel-based estimators that are well suited for this problem, and our empirical analysis of high- frequency stock returns reveals the following ugly facts about market microstructure noise: 1. The noise is correlated with the efficient price. 2. The noise is time-dependent. 3. The noise is quite small in the Dow Jones Industrial Av- erage (DJIA) stocks. 4. The properties of the noise have changed substantially over time. These four empirical “facts” are related to one another and have important implications for volatility estimation. The time dependence in the noise and the correlation between noise and efficient price arise naturally in some models on market mi- crostructure effects, including (a generalized version of ) the bid–ask model by Roll (1984) (see Hasbrouck 2004 for a dis- cussion) and models where agents have asymmetric informa- tion, such as those by Glosten and Milgrom (1985) and Easley and O’Hara (1987, 1992). Market microstructure noise has many sources, including the discreteness of the data (see Harris 1990, 1991) and properties of the trading mechanism (see, e.g., Black 1976; Amihud and Mendelson 1987). (For additional ref- erences to this literature, see, e.g., O’Hara 1995; Hasbrouck 2004.) The main contributions of this article are as follows: First, we characterize how the RV is affected by market microstruc- ture noise under a general specification for the noise that allows for various forms of stochastic dependencies. Second, we show that market microstructure noise is time-dependent and corre- lated with efficient returns. Third, we consider some existing theoretical results based on assumptions about the noise that are too simplistic, and discuss when such results provide reason- able approximations. For example, our empirical analysis of the 30 DJIA stocks shows that the noise may be ignored when in- traday returns are sampled at relatively low frequencies, such as 20-minute sampling. Assuming the noise is of an independent type seems to be reasonable when intraday returns are sampled every 15 ticks or so. Fourth, we apply cointegration methods to decompose transaction prices and bid/ask quotations into es- timates of the efficient price and market microstructure noise. The correlations between these estimated series are consistent with the volatility signature plots. The cointegration analysis enables us to study how a change in the efficient price dynami- cally affects bid, ask, and transaction prices. © 2006 American Statistical Association Journal of Business & Economic Statistics April 2006, Vol. 24, No. 2 DOI 10.1198/073500106000000071 127

Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

  • Upload
    ledan

  • View
    221

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Editorrsquos Note The following article was the JBES Invited Address presented at the Joint StatisticalMeetings Minneapolis Minnesota August 7ndash11 2005

Realized Variance and MarketMicrostructure NoisePeter R HANSEN

Department of Economics Stanford University 579 Serra Mall Stanford CA 94305-6072(peterhansenstanfordedu)

Asger LUNDEDepartment of Marketing and Statistics Aarhus School of Business Fuglesangs Alle 4 8210 Aarhus V Denmark

We study market microstructure noise in high-frequency data and analyze its implications for the real-ized variance (RV) under a general specification for the noise We show that kernel-based estimators canunearth important characteristics of market microstructure noise and that a simple kernel-based estimatordominates the RV for the estimation of integrated variance (IV) An empirical analysis of the Dow JonesIndustrial Average stocks reveals that market microstructure noise is time-dependent and correlated withincrements in the efficient price This has important implications for volatility estimation based on high-frequency data Finally we apply cointegration techniques to decompose transaction prices and bidndashaskquotes into an estimate of the efficient price and noise This framework enables us to study the dynamiceffects on transaction prices and quotes caused by changes in the efficient price

KEY WORDS Bias correction High-frequency data Integrated variance Market microstructure noiseRealized variance Realized volatility Sampling schemes

The great tragedy of Sciencemdashthe slaying of a beautiful hypothesis by an uglyfact (Thomas H Huxley 1825ndash1895)

1 INTRODUCTION

The presence of market microstructure noise in high-frequen-cy financial data complicates the estimation of financial volatil-ity and makes standard estimators such as the realizedvariance (RV) unreliable Thus from the perspective of volatil-ity estimation market microstructure noise is an ldquougly factrdquo thatchallenges the validity of theoretical results that rely on the ab-sence of noise Volatility estimation in the presence of marketmicrostructure noise is currently a very active area of researchInterestingly this literature was initiated by an article by Zhou(1996) that was published in this journal a decade ago and wasin many ways 10 years ahead of its time

The best remedy for market microstructure noise depends onthe properties of the noise and the main purpose of this arti-cle is to unearth the empirical properties of market microstruc-ture noise We use a number of kernel-based estimators that arewell suited for this problem and our empirical analysis of high-frequency stock returns reveals the following ugly facts aboutmarket microstructure noise

1 The noise is correlated with the efficient price2 The noise is time-dependent3 The noise is quite small in the Dow Jones Industrial Av-

erage (DJIA) stocks4 The properties of the noise have changed substantially

over time

These four empirical ldquofactsrdquo are related to one another andhave important implications for volatility estimation The timedependence in the noise and the correlation between noise andefficient price arise naturally in some models on market mi-

crostructure effects including (a generalized version of ) thebidndashask model by Roll (1984) (see Hasbrouck 2004 for a dis-cussion) and models where agents have asymmetric informa-tion such as those by Glosten and Milgrom (1985) and Easleyand OrsquoHara (1987 1992) Market microstructure noise hasmany sources including the discreteness of the data (see Harris1990 1991) and properties of the trading mechanism (see egBlack 1976 Amihud and Mendelson 1987) (For additional ref-erences to this literature see eg OrsquoHara 1995 Hasbrouck2004)

The main contributions of this article are as follows Firstwe characterize how the RV is affected by market microstruc-ture noise under a general specification for the noise that allowsfor various forms of stochastic dependencies Second we showthat market microstructure noise is time-dependent and corre-lated with efficient returns Third we consider some existingtheoretical results based on assumptions about the noise that aretoo simplistic and discuss when such results provide reason-able approximations For example our empirical analysis of the30 DJIA stocks shows that the noise may be ignored when in-traday returns are sampled at relatively low frequencies such as20-minute sampling Assuming the noise is of an independenttype seems to be reasonable when intraday returns are sampledevery 15 ticks or so Fourth we apply cointegration methodsto decompose transaction prices and bidask quotations into es-timates of the efficient price and market microstructure noiseThe correlations between these estimated series are consistentwith the volatility signature plots The cointegration analysisenables us to study how a change in the efficient price dynami-cally affects bid ask and transaction prices

copy 2006 American Statistical AssociationJournal of Business amp Economic Statistics

April 2006 Vol 24 No 2DOI 101198073500106000000071

127

128 Journal of Business amp Economic Statistics April 2006

The interest for empirical quantities based on high-frequencydata has surged in recent years (see Barndorff-Nielsen andShephard 2007 for a recent survey) The RV is a well-knownquantity that goes back to Merton (1980) Other empiricalquantities include bipower variation and multipower varia-tion which are particularly useful for detecting jumps (seeBarndorff-Nielsen and Shephard 2003 2004 2006ab Ander-sen Bollerslev and Diebold 2003 Bollerslev Kretschmer Pig-orsch and Tauchen 2005 Huang and Tauchen 2005 Tauchenand Zhou 2004) and intraday range-based estimators (seeChristensen and Podolskij 2005) High-frequencyndashbased quan-tities have proven useful for a number of problems For ex-ample several authors have applied filtering and smoothingtechniques to time series of the RV to obtain time seriesfor daily volatility (see eg Maheu and McCurdy 2002Barndorff-Nielsen Nielsen Shephard and Ysusi 1996 Engleand Sun 2005 Frijns and Lehnert 2004 Koopman Jungbackerand Hol 2005 Hansen and Lunde 2005b Owens and Steiger-wald 2005) High-frequencyndashbased quantities are also use-ful in the context of forecasting (see Andersen Bollerslevand Meddahi 2004 Ghysels Santa-Clara and Valkanov 2006)and the evaluation and comparison of volatility models (seeAndersen and Bollerslev 1998 Hansen Lunde and Nason2003 Hansen and Lunde 2005a 2006 Patton 2005)

The RV which is a sum of squared intraday returns yields aperfect estimate of volatility in the ideal situation where pricesare observed continuously and without measurement error (seeeg Merton 1980) This result suggests that the RV should bebased on intraday returns sampled at the highest possible fre-quency (tick-by-tick data) Unfortunately the RV suffers froma well-known bias problem that tends to get worse as the sam-pling frequency of intraday returns increases (see eg Fang1996 Andreou and Ghysels 2002 Oomen 2002 Bai Russelland Tiao 2004) The source of this bias problem is known asmarket microstructure noise and the bias is particularly ev-ident in volatility signature plots (see Andersen BollerslevDiebold and Labys 2000b) Thus there is a trade-off betweenbias and variance when choosing the sampling frequency asdiscussed by Bandi and Russell (2005) and Zhang Myklandand Aiumlt-Sahalia (2005) This trade-off is the reason that theRV is often computed from intraday returns sampled at a mod-erate frequency such as 5-minute or 20-minute sampling

A key insight into the problem of estimating the volatil-ity from high-frequency data comes from its similarity to theproblem of estimating the long-run variance of a stationarytime series In this literature it is well known that autocorre-lation necessitates modifications of the usual sum-of-squaredestimator Those modifications of Newey and West (1987) andAndrews (1991) provided such estimators that are robust to au-tocorrelation Market microstructure noise induces autocorre-lation in the intraday returns and this autocorrelation is thesource of the RVrsquos bias problem Given this connection to long-run variance estimation it is not surprising that ldquoprewhiten-ingrdquo of intraday returns and kernel-based estimators (includingthe closely related subsample-based estimators) are found tobe useful in the present context Zhou (1996) introduced theuse of kernel-based estimators and the subsampling idea todeal with market microstructure noise in high-frequency dataFiltering techniques have been used by Ebens (1999) Ander-sen Bollerslev Diebold and Ebens (2001) and Maheu and

McCurdy (2002) (moving average filter) and Bollen and In-der (2002) (autoregressive filter) Kernel-based estimators wereexplored by Zhou (1996) Hansen and Lunde (2003) andBarndorff-Nielsen Hansen Lunde and Shephard (2004) andthe closely related subsample-based estimators were used in anunpublished paper by Muumlller (1993) and also by Zhou (1996)Zhang et al (2005) and Zhang (2004)

The rest of the article is organized as follows In Section 2we describe our theoretical framework and discuss samplingschemes in calendar time and tick time We also characterizethe bias of the RV under a general specification for the noiseIn Section 3 we consider the case with independent market mi-crostructure noise which has been used by various authors in-cluding Corsi Zumbach Muumlller and Dacorogna (2001) Curciand Corsi (2004) Bandi and Russell (2005) and Zhang et al(2005) We consider a simple kernel-based estimator of Zhou(1996) that we denote by RVAC1 because it uses the first-orderautocorrelation to bias-correct the RV We benchmark RVAC1

to the standard measure of RV and find that the former is su-perior to the latter in terms of the mean squared error (MSE)We also evaluate the implications for some theoretical resultsbased on assumptions in which market microstructure noise isabsent Interestingly we find that the root mean squared er-ror (RMSE) of the RV in the presence of noise is quite simi-lar to those that ignore the noise at low sampling frequenciessuch as 20-minute sampling This finding is important becausemany existing empirical studies have drawn conclusions from20-minute and 30-minute intraday returns using the results ofBarndorff-Nielsen and Shephard (2002) However at 5-minutesampling we find that the ldquotruerdquo confidence interval about theRV can be as much as 100 larger than those based on an ldquoab-sence of noise assumptionrdquo In Section 4 we present a robustestimator that is unbiased for a general type of noise and dis-cuss noise that is time-dependent in both calendar time and ticktime We also discuss the subsampling version of Zhoursquos es-timator which is robust to some forms of time-dependence intick time In Section 5 we describe our data and present mostof our empirical results The key result is the overwhelming ev-idence against the independent noise assumption This findingis quite robust to the choice of sampling method (calendar timeor tick time) and the type of price data (transaction prices orquotation prices) This dependence structure has important im-plications for many quantities based on ultra-highndashfrequencydata These features of the noise have important implicationsfor some of the bias corrections that have been used in the liter-ature Although the independent noise assumption may be fairlyreasonable when the tick size is 116 it is clearly not consistentwith the recent data In fact much of the noise has ldquoevaporatedrdquoafter the tick size is reduced to 1 cent In Section 6 we presenta cointegration analysis of the vector of bid ask and transactionprices The Granger representation makes it possible to decom-pose each of the price series into noise and a common efficientprice Further based on this decomposition we estimate impulseresponse functions that reveal the dynamic effects on bid askand transaction prices as a response to a change in the efficientprice In Section 7 we provide a summary and we concludethe article with three appendixes that provide proofs and detailsabout our estimation methods

Hansen and Lunde Realized Variance and Market Microstructure Noise 129

2 THE THEORETICAL FRAMEWORK

We let plowast(t) denote a latent log-price process in continuoustime and use p(t) to denote the observable log-price processThus the noise process is given by

u(t)equiv p(t)minus plowast(t)

The noise process u may be due to market microstructure ef-fects such as bidndashask bounces but the discrepancy betweenp and plowast can also be induced by the technique used to con-struct p(t) For example p is often constructed artificially fromobserved transactions or quotes using the previous tick methodor the linear interpolation method which we define and discusslater in this section

We work under the following specification for the efficientprice process plowast

Assumption 1 The efficient price process satisfies dplowast(t) =σ(t)dw(t) where w(t) is a standard Brownian motion σ isa random function that is independent of w and σ 2(t) isLipschitz (almost surely)

In our analysis we condition on the volatility path σ 2(t)because our analysis focuses on estimators of the integratedvariance (IV)

IV equivint b

aσ 2(t)dt

Thus we can treat σ 2(t) as deterministic even though weview the volatility path as random The Lipschitz condition isa smoothness condition that requires |σ 2(t) minus σ 2(t + δ)| lt εδ

for some ε and all t and δ (with probability 1) The assump-tion that w and σ are independent is not essential The connec-tion between kernel-based and subsample-based estimators (seeBarndorff-Nielsen et al 2004) shows that weaker assumptionsused by Zhang et al (2005) and Zhang (2004) are sufficient inthis framework

We partition the interval [ab] into m subintervals andm plays a central role in our analysis For example we deriveasymptotic distributions of quantities as m rarrinfin This type ofinfill asymptotics is commonly used in spatial data analysis andgoes back to Stein (1987) Related to the present context is theuse of infill asymptotics for estimation of diffusions (see Bandiand Phillips 2004) For a fixed m the ith subinterval is givenby [timinus1m tim] where a = t0m lt t1m lt middot middot middot lt tmm = b Thelength of the ith subinterval is given by δim equiv tim minus timinus1m andwe assume that supi=1m δim = O( 1

m ) such that the lengthof each subinterval shrinks to 0 as m increases The intradayreturns are now defined by

ylowastim equiv plowast(tim)minus plowast(timinus1m) i = 1 m

and the increments in p and u are defined similarly and denotedby

yim equiv p(tim)minus p(timinus1m) i = 1 m

and

eim equiv u(tim)minus u(timinus1m) i = 1 m

Note that the observed intraday returns decompose into yim =ylowastim + eim The IV over each of the subintervals is definedby

σ 2im equiv

int tim

timinus1m

σ 2(s)ds i = 1 m

and we note that var(ylowastim) = E(ylowast2im) = σ 2

im under Assump-tion 1

The RV of plowast is defined by

RV(m)lowast equivmsum

i=1

ylowast2im

and RV(m)lowast is consistent for the IV as m rarrinfin (see eg Protter2005) A feasible asymptotic distribution theory of RV (in rela-tion to IV) was established by Barndorff-Nielsen and Shephard(2002) (see also Meddahi 2002 Mykland and Zhang 2006Gonccedilalves and Meddahi 2005) Whereas RV(m)lowast is an ideal es-timator it is not a feasible estimator because plowast is latent Therealized variance of p given by

RV(m) equivmsum

i=1

y2im

is observable but suffers from a well-known bias problem andis generally inconsistent for the IV (see eg Bandi and Russell2005 Zhang et al 2005)

21 Sampling Schemes

Intraday returns can be constructed using different types ofsampling schemes The special case where tim i = 1 mare equidistant in calendar time [ie δim = (bminus a)m for all i]is referred to as calendar time sampling (CTS) The widely usedexchange rates data from Olsen and associates (see Muumlller et al1990) are equidistant in time and 5-minute sampling (δim =5 min) is often used in practice

CTS requires the construction of artificial prices from theraw (irregularly spaced) price data (transaction prices or quo-tations) Given observed prices at the times t0 lt middot middot middot lt tN onecan construct a price at time τ isin [tj tj+1) using

p(τ )equiv ptj

or

p(τ )equiv ptj +τ minus tj

tj+1 minus tj

(ptj+1 minus ptj

)

The former is known as the previous tick method (Wasserfallenand Zimmermann 1985) and the latter is the linear interpola-tion method (see Andersen and Bollerslev 1997) Both methodshave been discussed by Dacorogna Gencay Muumlller Olsen andPictet (2001 sec 321) When sampling at ultra-high frequen-cies the linear interpolation method has the following unfortu-

nate property where ldquoprarrrdquo denotes convergence in probability

Lemma 1 Let N be fixed and consider the RV based onthe linear interpolation method It holds that RV(m) prarr 0 asm rarrinfin

130 Journal of Business amp Economic Statistics April 2006

The result of Lemma 1 essentially boils down to the fact thatthe quadratic variation of a straight line is zero Although thisis a limit result (as m rarrinfin) the lemma does suggest that thelinear interpolation method is not suitable for the constructionof intraday returns at high frequencies where sampling may oc-cur multiple times between two neighboring price observationsThat the result of Lemma 1 is more than a theoretical artifact isevident from the volatility signature plots of Hansen and Lunde(2003) Given the result of Lemma 1 we avoid the use of thelinear interpolation and use the previous tick method to con-struct CTS intraday returns

The case where tim denotes the time of a transactionquota-tion is referred to as tick time sampling (TTS) An example ofTTS is when tim i = 1 m are chosen to be the time ofevery fifth transaction say

The case where the sampling times t0m tmm are suchthat σ 2

im = IVm for all i = 1 m is known as businesstime sampling (BTS) (see Oomen 2006) Zhou (1998) referredto BTS intraday returns as de-volatized returns and discusseddistributional advantages of BTS returns Whereas tim i =0 m are observable under CTS and TTS they are latentunder BTS because the sampling times are defined from theunobserved volatility path Empirical results of Andersen andBollerslev (1997) and Curci and Corsi (2004) suggest that BTScan be approximated by TTS This feature is nicely capturedin the framework of Oomen (2006) where the (random) ticktimes are generated with an intensity directly related to a quan-tity corresponding to σ 2(t) in the present context Under CTSwe sometimes write RV(x sec) where x seconds is the periodin time spanned by each of the intraday returns (ie δim = xseconds) Similarly we write RV(y tick) under TTS when eachintraday return spans y ticks (transactions or quotations)

22 Characterizing the Bias of the Realized VarianceUnder General Noise

Initially we make the following assumptions about the noiseprocess u

Assumption 2 The noise process u is covariance stationarywith mean 0 such that its autocovariance function is defined byπ(s)equiv E[u(t)u(t + s)]

The covariance function π plays a key role because thebias of RV(m) is tied to the properties of π(s) in the neighbor-hood of 0 Simple examples of noise processes that satisfy As-sumption 2 include the independent noise process which hasπ(s)= 0 for all s = 0 and the OrnsteinndashUhlenbeck processThe latter was used by Aiumlt-Sahalia Mykland and Zhang(2005a) to study estimation in a parametric diffusion modelthat is robust to market microstructure noise

An important aspect of our analysis is that our assumptionsallow for a dependence between u and plowast This is a generaliza-tion of the assumptions made in the existing literature and ourempirical analysis shows that this generalization is needed inparticularly when prices are sampled from quotations

Next we characterize the RV bias under these general as-sumptions for the market microstructure noise u

Theorem 1 Given Assumptions 1 and 2 the bias of the real-ized variance under CTS is given by

E[RV(m) minus IV

]= 2ρm + 2m

[π(0)minus π

(bminus a

m

)] (1)

where ρm equiv E(summ

i=1 ylowastimeim)

The result of Theorem 1 is based on the following decompo-sition of the observed RV

RV(m) =msum

i=1

ylowast2im + 2

msumi=1

eimylowastim +msum

i=1

e2im

wheresumm

i=1 e2im is the ldquorealized variancerdquo of the noise process

u responsible for the last bias term in (1) The dependence be-tween u and plowast that is relevant for our analysis is given in theform of the correlation between the efficient intraday returnsylowastim and the return noise eim By the CauchyndashSchwarz in-equality π(0) ge π(s) for all s such that the bias is alwayspositive when the return noise process eim is uncorrelatedwith the efficient intraday returns ylowastim (because this implies thatρm = 0) Interestingly the total bias can be negative This oc-curs when ρm lt minusm[π(0)minus π(m)] which is the case wherethe downward bias (caused by a negative correlation betweeneim and ylowastim) exceeds the upward bias caused by the ldquorealizedvariancerdquo of u This appears to be the case for the RVs that arebased on quoted prices as shown in Figure 1

The last term of the bias expression in Theorem 1 shows thatthe bias is tied to the properties of π(s) in the neighborhoodof 0 and as m rarrinfin (hence δm rarr 0) we obtain the followingresult

Corollary 1 Suppose that the assumptions of Theorem 1hold and that π(s) is differentiable at 0 Then the asymptoticbias is given by

limmrarrinfinE

[RV(m) minus IV

]= 2ρ minus 2(bminus a)π prime(0)

provided that ρ equiv limmrarrinfin E(summ

i=1 ylowastimeim) is well defined

Under the independent noise assumption we can defineπ prime(0) = minusinfin which is the situation that we analyze in detailin Section 3 A related asymptotic result is obtained wheneverthe quadratic variation of the bivariate process (plowastu)prime is welldefined such that [pp] = [plowastplowast] + 2[plowastu] + [uu] where[XY] denotes the quadratic covariation In this setting we haveIV = [plowastplowast] such that

RV(m) minus IVprarr 2[plowastu] + [uu] (as m rarrinfin)

where ρ = [plowastu] and minus2(b minus a)π prime(0) = [uu] (almost surelyunder additional assumptions)

A volatility signature plot provides an easy way to visuallyinspect the potential bias problems of RV-type estimators Suchplots first appeared in an unpublished thesis by Fang (1996)and were named and made popular by Andersen et al (2000b)Let RV(m)

t denote the RV based on m intraday returns on day tA volatility signature plot displays the sample average

RV(m) equiv nminus1nsum

t=1

RV(m)t

Hansen and Lunde Realized Variance and Market Microstructure Noise 131

Figure 1 Volatility Signature Plots for RV t Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and TransactionPrices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling in contrastto the bottom rows that are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4

The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents

an approximate 95 confidence interval for the average volatility

132 Journal of Business amp Economic Statistics April 2006

as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)

Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)

t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)

ACNW30 defined in Section 42 The shaded area about

σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B

From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)

A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise

Fact I The noise is negatively correlated with the efficientreturns

Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa

m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm

is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From

cov(ylowastim emidim )= 1

2cov(ylowastim eask

im)+ 1

2cov(ylowastim ebid

im)

we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data

Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened

A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1

2 )2 + ( 12 )2 versus 12] Such dis-

crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias

Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM

3 THE CASE WITH INDEPENDENT NOISE

In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5

We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption

Assumption 3 The noise process satisfies the following

(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t

Hansen and Lunde Realized Variance and Market Microstructure Noise 133

Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA

(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t

The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes

referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise

134 Journal of Business amp Economic Statistics April 2006

Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1

The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)

Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen

var(RV(m)

)= κ12ω4m+ 8ω2msum

i=1

σ 2im

minus (6κ minus 2)ω4 + 2msum

i=1

σ 4im (2)

and

RV(m) minus 2mω2

radicκ12ω4m

=radic

m

(RV(m)

2mω2minus 1

)drarr N(01)

as m rarrinfin

Here we ldquodrarrrdquo to denote convergence in distribution Thus

unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)

is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim

In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that

var(RV(m)

)= 2msum

i=1

σ 4im = 2

bminus a

m

int b

aσ 4(s)ds+ o

(1

m

)

whereint b

a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)

Next we consider the estimator of Zhou (1996) given by

RV(m)AC1

equivmsum

i=1

y2im +

msumi=1

yimyiminus1m +msum

i=1

yimyi+1m (3)

This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in

much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1

i=2 y2im +

summi=2 yimyiminus1m +summminus1

i=1 yimyi+1m that estimatesint bminusδmma+δ1m

σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)

and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5

Next we formulate results for RV(m)AC1

that are similar to thosefor RV(m) in Lemma 2

Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)

AC1)= IV If Assumption 3(b) also holds then

var(RV(m)

AC1

)= 8ω4m+ 8ω2msum

i=1

σ 2im

minus 6ω4 + 6msum

i=1

σ 4im +O(mminus2)

under CTS and BTS and

RV(m)AC1

minus IVradic8ω4m

drarr N(01) as m rarrinfin

An important result of Lemma 3 is that RV(m)AC1

is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)

AC1can be rewrit-

ten in a way that does not involve squared noise terms u2im

i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)

AC1 has a smaller asymptotic vari-

ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4

vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)

AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges

in distribution to N(01) and simply write RV(m)AC1

radic

8ω4mdrarr

N(01) In other words whereas RV(m)AC1

is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin

Hansen and Lunde Realized Variance and Market Microstructure Noise 135

In the absence of market microstructure noise (ω2 = 0) wenote that

var[RV(m)

AC1

]asymp 6msum

i=1

σ 4im

which shows that the variance of RV(m)AC1

is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)

It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem

minτ1τm

msumi=1

τ 2i subject to

msumi=1

τi = c

Thus if we set τi = σ 2im and c = IV then we see that

summi=1 σ 4

im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS

summi=1 σ 4

im = IV2mand specifically it holds that

IV2

mle

int b

aσ 4(s)ds

bminus a

m

The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by

var[RV(m)

AC1

]asymp 8ω4m+ 8ω2int b

aσ 2(s)ds

minus 6ω4 + 6bminus a

m

int b

aσ 4(s)ds

Next we compare RV(m)AC1

and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators

Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2

im = IVm (BTS) The MSEs aregiven by

MSE(RV(m)

)= IV2[

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21

m

](4)

and

MSE(RV(m)

AC1

)= IV2[

8λ2m+ 8λminus 6λ2 + 61

mminus2

1

m2

] (5)

The optimal sampling frequencies for RV(m) and RV(m)AC1

are

given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0

We denote the optimal sampling frequencies for RV(m) andRV(m)

AC1by mlowast

0 and mlowast1 These are approximately given by

mlowast0 asymp (2λ)minus23 and mlowast

1 asympradic

3(2λ)minus1

The expression for mlowast0 was derived in Bandi and Russell (2005)

and Zhang et al (2005) under more general conditions than

those used in Corollary 2 whereas the expression for mlowast1 was

derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such

that

mlowast1mlowast

0 asymp 3122minus13(λminus1)13 ge 10

which shows that mlowast1 is several times larger than mlowast

0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)

AC1permits more frequent sampling than does

the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1

canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast

1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies

Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)

AC1in terms

of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)

AC1and for analyzing their (lack of )

robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and

RV(m)AC1

are proportional to the IV and given by r0(λm)IV andr1(λm)IV where

r0(λm)equivradic

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2

m

and

r1(λm)equivradic

8λ2m+ 8λminus 6λ2 + 6

mminus 2

m2

Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)

AC1dominates the RV(m) except at

the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast

0 and mlowast1 For the AA returns we find that the opti-

mal sampling frequencies are mlowast0AA = 44 and mlowast

1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast

0 and mlowast1 show that RV(m)

AC1is less sensitive than

RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of

RV(m)AC1

to that of (the optimal) RV(mlowast0) and the relative RMSE of

RV(m) to that of (the optimal) RV(mlowast

1)

AC1 These panels show that

the RV(m)AC1

continues to dominate the ldquooptimalrdquo RV(mlowast0) for a

wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast

1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast

1 are not known withcertainty The result shows that a reasonably precise estimate

136 Journal of Business amp Economic Statistics April 2006

Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-

per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise

r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1

to RV (mlowast0 ) and RV

(mlowast1)

AC1 as defined by r0(λ m)r1(λ mlowast

1) and

r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure

noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 2: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

128 Journal of Business amp Economic Statistics April 2006

The interest for empirical quantities based on high-frequencydata has surged in recent years (see Barndorff-Nielsen andShephard 2007 for a recent survey) The RV is a well-knownquantity that goes back to Merton (1980) Other empiricalquantities include bipower variation and multipower varia-tion which are particularly useful for detecting jumps (seeBarndorff-Nielsen and Shephard 2003 2004 2006ab Ander-sen Bollerslev and Diebold 2003 Bollerslev Kretschmer Pig-orsch and Tauchen 2005 Huang and Tauchen 2005 Tauchenand Zhou 2004) and intraday range-based estimators (seeChristensen and Podolskij 2005) High-frequencyndashbased quan-tities have proven useful for a number of problems For ex-ample several authors have applied filtering and smoothingtechniques to time series of the RV to obtain time seriesfor daily volatility (see eg Maheu and McCurdy 2002Barndorff-Nielsen Nielsen Shephard and Ysusi 1996 Engleand Sun 2005 Frijns and Lehnert 2004 Koopman Jungbackerand Hol 2005 Hansen and Lunde 2005b Owens and Steiger-wald 2005) High-frequencyndashbased quantities are also use-ful in the context of forecasting (see Andersen Bollerslevand Meddahi 2004 Ghysels Santa-Clara and Valkanov 2006)and the evaluation and comparison of volatility models (seeAndersen and Bollerslev 1998 Hansen Lunde and Nason2003 Hansen and Lunde 2005a 2006 Patton 2005)

The RV which is a sum of squared intraday returns yields aperfect estimate of volatility in the ideal situation where pricesare observed continuously and without measurement error (seeeg Merton 1980) This result suggests that the RV should bebased on intraday returns sampled at the highest possible fre-quency (tick-by-tick data) Unfortunately the RV suffers froma well-known bias problem that tends to get worse as the sam-pling frequency of intraday returns increases (see eg Fang1996 Andreou and Ghysels 2002 Oomen 2002 Bai Russelland Tiao 2004) The source of this bias problem is known asmarket microstructure noise and the bias is particularly ev-ident in volatility signature plots (see Andersen BollerslevDiebold and Labys 2000b) Thus there is a trade-off betweenbias and variance when choosing the sampling frequency asdiscussed by Bandi and Russell (2005) and Zhang Myklandand Aiumlt-Sahalia (2005) This trade-off is the reason that theRV is often computed from intraday returns sampled at a mod-erate frequency such as 5-minute or 20-minute sampling

A key insight into the problem of estimating the volatil-ity from high-frequency data comes from its similarity to theproblem of estimating the long-run variance of a stationarytime series In this literature it is well known that autocorre-lation necessitates modifications of the usual sum-of-squaredestimator Those modifications of Newey and West (1987) andAndrews (1991) provided such estimators that are robust to au-tocorrelation Market microstructure noise induces autocorre-lation in the intraday returns and this autocorrelation is thesource of the RVrsquos bias problem Given this connection to long-run variance estimation it is not surprising that ldquoprewhiten-ingrdquo of intraday returns and kernel-based estimators (includingthe closely related subsample-based estimators) are found tobe useful in the present context Zhou (1996) introduced theuse of kernel-based estimators and the subsampling idea todeal with market microstructure noise in high-frequency dataFiltering techniques have been used by Ebens (1999) Ander-sen Bollerslev Diebold and Ebens (2001) and Maheu and

McCurdy (2002) (moving average filter) and Bollen and In-der (2002) (autoregressive filter) Kernel-based estimators wereexplored by Zhou (1996) Hansen and Lunde (2003) andBarndorff-Nielsen Hansen Lunde and Shephard (2004) andthe closely related subsample-based estimators were used in anunpublished paper by Muumlller (1993) and also by Zhou (1996)Zhang et al (2005) and Zhang (2004)

The rest of the article is organized as follows In Section 2we describe our theoretical framework and discuss samplingschemes in calendar time and tick time We also characterizethe bias of the RV under a general specification for the noiseIn Section 3 we consider the case with independent market mi-crostructure noise which has been used by various authors in-cluding Corsi Zumbach Muumlller and Dacorogna (2001) Curciand Corsi (2004) Bandi and Russell (2005) and Zhang et al(2005) We consider a simple kernel-based estimator of Zhou(1996) that we denote by RVAC1 because it uses the first-orderautocorrelation to bias-correct the RV We benchmark RVAC1

to the standard measure of RV and find that the former is su-perior to the latter in terms of the mean squared error (MSE)We also evaluate the implications for some theoretical resultsbased on assumptions in which market microstructure noise isabsent Interestingly we find that the root mean squared er-ror (RMSE) of the RV in the presence of noise is quite simi-lar to those that ignore the noise at low sampling frequenciessuch as 20-minute sampling This finding is important becausemany existing empirical studies have drawn conclusions from20-minute and 30-minute intraday returns using the results ofBarndorff-Nielsen and Shephard (2002) However at 5-minutesampling we find that the ldquotruerdquo confidence interval about theRV can be as much as 100 larger than those based on an ldquoab-sence of noise assumptionrdquo In Section 4 we present a robustestimator that is unbiased for a general type of noise and dis-cuss noise that is time-dependent in both calendar time and ticktime We also discuss the subsampling version of Zhoursquos es-timator which is robust to some forms of time-dependence intick time In Section 5 we describe our data and present mostof our empirical results The key result is the overwhelming ev-idence against the independent noise assumption This findingis quite robust to the choice of sampling method (calendar timeor tick time) and the type of price data (transaction prices orquotation prices) This dependence structure has important im-plications for many quantities based on ultra-highndashfrequencydata These features of the noise have important implicationsfor some of the bias corrections that have been used in the liter-ature Although the independent noise assumption may be fairlyreasonable when the tick size is 116 it is clearly not consistentwith the recent data In fact much of the noise has ldquoevaporatedrdquoafter the tick size is reduced to 1 cent In Section 6 we presenta cointegration analysis of the vector of bid ask and transactionprices The Granger representation makes it possible to decom-pose each of the price series into noise and a common efficientprice Further based on this decomposition we estimate impulseresponse functions that reveal the dynamic effects on bid askand transaction prices as a response to a change in the efficientprice In Section 7 we provide a summary and we concludethe article with three appendixes that provide proofs and detailsabout our estimation methods

Hansen and Lunde Realized Variance and Market Microstructure Noise 129

2 THE THEORETICAL FRAMEWORK

We let plowast(t) denote a latent log-price process in continuoustime and use p(t) to denote the observable log-price processThus the noise process is given by

u(t)equiv p(t)minus plowast(t)

The noise process u may be due to market microstructure ef-fects such as bidndashask bounces but the discrepancy betweenp and plowast can also be induced by the technique used to con-struct p(t) For example p is often constructed artificially fromobserved transactions or quotes using the previous tick methodor the linear interpolation method which we define and discusslater in this section

We work under the following specification for the efficientprice process plowast

Assumption 1 The efficient price process satisfies dplowast(t) =σ(t)dw(t) where w(t) is a standard Brownian motion σ isa random function that is independent of w and σ 2(t) isLipschitz (almost surely)

In our analysis we condition on the volatility path σ 2(t)because our analysis focuses on estimators of the integratedvariance (IV)

IV equivint b

aσ 2(t)dt

Thus we can treat σ 2(t) as deterministic even though weview the volatility path as random The Lipschitz condition isa smoothness condition that requires |σ 2(t) minus σ 2(t + δ)| lt εδ

for some ε and all t and δ (with probability 1) The assump-tion that w and σ are independent is not essential The connec-tion between kernel-based and subsample-based estimators (seeBarndorff-Nielsen et al 2004) shows that weaker assumptionsused by Zhang et al (2005) and Zhang (2004) are sufficient inthis framework

We partition the interval [ab] into m subintervals andm plays a central role in our analysis For example we deriveasymptotic distributions of quantities as m rarrinfin This type ofinfill asymptotics is commonly used in spatial data analysis andgoes back to Stein (1987) Related to the present context is theuse of infill asymptotics for estimation of diffusions (see Bandiand Phillips 2004) For a fixed m the ith subinterval is givenby [timinus1m tim] where a = t0m lt t1m lt middot middot middot lt tmm = b Thelength of the ith subinterval is given by δim equiv tim minus timinus1m andwe assume that supi=1m δim = O( 1

m ) such that the lengthof each subinterval shrinks to 0 as m increases The intradayreturns are now defined by

ylowastim equiv plowast(tim)minus plowast(timinus1m) i = 1 m

and the increments in p and u are defined similarly and denotedby

yim equiv p(tim)minus p(timinus1m) i = 1 m

and

eim equiv u(tim)minus u(timinus1m) i = 1 m

Note that the observed intraday returns decompose into yim =ylowastim + eim The IV over each of the subintervals is definedby

σ 2im equiv

int tim

timinus1m

σ 2(s)ds i = 1 m

and we note that var(ylowastim) = E(ylowast2im) = σ 2

im under Assump-tion 1

The RV of plowast is defined by

RV(m)lowast equivmsum

i=1

ylowast2im

and RV(m)lowast is consistent for the IV as m rarrinfin (see eg Protter2005) A feasible asymptotic distribution theory of RV (in rela-tion to IV) was established by Barndorff-Nielsen and Shephard(2002) (see also Meddahi 2002 Mykland and Zhang 2006Gonccedilalves and Meddahi 2005) Whereas RV(m)lowast is an ideal es-timator it is not a feasible estimator because plowast is latent Therealized variance of p given by

RV(m) equivmsum

i=1

y2im

is observable but suffers from a well-known bias problem andis generally inconsistent for the IV (see eg Bandi and Russell2005 Zhang et al 2005)

21 Sampling Schemes

Intraday returns can be constructed using different types ofsampling schemes The special case where tim i = 1 mare equidistant in calendar time [ie δim = (bminus a)m for all i]is referred to as calendar time sampling (CTS) The widely usedexchange rates data from Olsen and associates (see Muumlller et al1990) are equidistant in time and 5-minute sampling (δim =5 min) is often used in practice

CTS requires the construction of artificial prices from theraw (irregularly spaced) price data (transaction prices or quo-tations) Given observed prices at the times t0 lt middot middot middot lt tN onecan construct a price at time τ isin [tj tj+1) using

p(τ )equiv ptj

or

p(τ )equiv ptj +τ minus tj

tj+1 minus tj

(ptj+1 minus ptj

)

The former is known as the previous tick method (Wasserfallenand Zimmermann 1985) and the latter is the linear interpola-tion method (see Andersen and Bollerslev 1997) Both methodshave been discussed by Dacorogna Gencay Muumlller Olsen andPictet (2001 sec 321) When sampling at ultra-high frequen-cies the linear interpolation method has the following unfortu-

nate property where ldquoprarrrdquo denotes convergence in probability

Lemma 1 Let N be fixed and consider the RV based onthe linear interpolation method It holds that RV(m) prarr 0 asm rarrinfin

130 Journal of Business amp Economic Statistics April 2006

The result of Lemma 1 essentially boils down to the fact thatthe quadratic variation of a straight line is zero Although thisis a limit result (as m rarrinfin) the lemma does suggest that thelinear interpolation method is not suitable for the constructionof intraday returns at high frequencies where sampling may oc-cur multiple times between two neighboring price observationsThat the result of Lemma 1 is more than a theoretical artifact isevident from the volatility signature plots of Hansen and Lunde(2003) Given the result of Lemma 1 we avoid the use of thelinear interpolation and use the previous tick method to con-struct CTS intraday returns

The case where tim denotes the time of a transactionquota-tion is referred to as tick time sampling (TTS) An example ofTTS is when tim i = 1 m are chosen to be the time ofevery fifth transaction say

The case where the sampling times t0m tmm are suchthat σ 2

im = IVm for all i = 1 m is known as businesstime sampling (BTS) (see Oomen 2006) Zhou (1998) referredto BTS intraday returns as de-volatized returns and discusseddistributional advantages of BTS returns Whereas tim i =0 m are observable under CTS and TTS they are latentunder BTS because the sampling times are defined from theunobserved volatility path Empirical results of Andersen andBollerslev (1997) and Curci and Corsi (2004) suggest that BTScan be approximated by TTS This feature is nicely capturedin the framework of Oomen (2006) where the (random) ticktimes are generated with an intensity directly related to a quan-tity corresponding to σ 2(t) in the present context Under CTSwe sometimes write RV(x sec) where x seconds is the periodin time spanned by each of the intraday returns (ie δim = xseconds) Similarly we write RV(y tick) under TTS when eachintraday return spans y ticks (transactions or quotations)

22 Characterizing the Bias of the Realized VarianceUnder General Noise

Initially we make the following assumptions about the noiseprocess u

Assumption 2 The noise process u is covariance stationarywith mean 0 such that its autocovariance function is defined byπ(s)equiv E[u(t)u(t + s)]

The covariance function π plays a key role because thebias of RV(m) is tied to the properties of π(s) in the neighbor-hood of 0 Simple examples of noise processes that satisfy As-sumption 2 include the independent noise process which hasπ(s)= 0 for all s = 0 and the OrnsteinndashUhlenbeck processThe latter was used by Aiumlt-Sahalia Mykland and Zhang(2005a) to study estimation in a parametric diffusion modelthat is robust to market microstructure noise

An important aspect of our analysis is that our assumptionsallow for a dependence between u and plowast This is a generaliza-tion of the assumptions made in the existing literature and ourempirical analysis shows that this generalization is needed inparticularly when prices are sampled from quotations

Next we characterize the RV bias under these general as-sumptions for the market microstructure noise u

Theorem 1 Given Assumptions 1 and 2 the bias of the real-ized variance under CTS is given by

E[RV(m) minus IV

]= 2ρm + 2m

[π(0)minus π

(bminus a

m

)] (1)

where ρm equiv E(summ

i=1 ylowastimeim)

The result of Theorem 1 is based on the following decompo-sition of the observed RV

RV(m) =msum

i=1

ylowast2im + 2

msumi=1

eimylowastim +msum

i=1

e2im

wheresumm

i=1 e2im is the ldquorealized variancerdquo of the noise process

u responsible for the last bias term in (1) The dependence be-tween u and plowast that is relevant for our analysis is given in theform of the correlation between the efficient intraday returnsylowastim and the return noise eim By the CauchyndashSchwarz in-equality π(0) ge π(s) for all s such that the bias is alwayspositive when the return noise process eim is uncorrelatedwith the efficient intraday returns ylowastim (because this implies thatρm = 0) Interestingly the total bias can be negative This oc-curs when ρm lt minusm[π(0)minus π(m)] which is the case wherethe downward bias (caused by a negative correlation betweeneim and ylowastim) exceeds the upward bias caused by the ldquorealizedvariancerdquo of u This appears to be the case for the RVs that arebased on quoted prices as shown in Figure 1

The last term of the bias expression in Theorem 1 shows thatthe bias is tied to the properties of π(s) in the neighborhoodof 0 and as m rarrinfin (hence δm rarr 0) we obtain the followingresult

Corollary 1 Suppose that the assumptions of Theorem 1hold and that π(s) is differentiable at 0 Then the asymptoticbias is given by

limmrarrinfinE

[RV(m) minus IV

]= 2ρ minus 2(bminus a)π prime(0)

provided that ρ equiv limmrarrinfin E(summ

i=1 ylowastimeim) is well defined

Under the independent noise assumption we can defineπ prime(0) = minusinfin which is the situation that we analyze in detailin Section 3 A related asymptotic result is obtained wheneverthe quadratic variation of the bivariate process (plowastu)prime is welldefined such that [pp] = [plowastplowast] + 2[plowastu] + [uu] where[XY] denotes the quadratic covariation In this setting we haveIV = [plowastplowast] such that

RV(m) minus IVprarr 2[plowastu] + [uu] (as m rarrinfin)

where ρ = [plowastu] and minus2(b minus a)π prime(0) = [uu] (almost surelyunder additional assumptions)

A volatility signature plot provides an easy way to visuallyinspect the potential bias problems of RV-type estimators Suchplots first appeared in an unpublished thesis by Fang (1996)and were named and made popular by Andersen et al (2000b)Let RV(m)

t denote the RV based on m intraday returns on day tA volatility signature plot displays the sample average

RV(m) equiv nminus1nsum

t=1

RV(m)t

Hansen and Lunde Realized Variance and Market Microstructure Noise 131

Figure 1 Volatility Signature Plots for RV t Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and TransactionPrices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling in contrastto the bottom rows that are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4

The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents

an approximate 95 confidence interval for the average volatility

132 Journal of Business amp Economic Statistics April 2006

as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)

Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)

t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)

ACNW30 defined in Section 42 The shaded area about

σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B

From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)

A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise

Fact I The noise is negatively correlated with the efficientreturns

Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa

m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm

is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From

cov(ylowastim emidim )= 1

2cov(ylowastim eask

im)+ 1

2cov(ylowastim ebid

im)

we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data

Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened

A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1

2 )2 + ( 12 )2 versus 12] Such dis-

crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias

Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM

3 THE CASE WITH INDEPENDENT NOISE

In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5

We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption

Assumption 3 The noise process satisfies the following

(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t

Hansen and Lunde Realized Variance and Market Microstructure Noise 133

Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA

(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t

The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes

referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise

134 Journal of Business amp Economic Statistics April 2006

Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1

The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)

Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen

var(RV(m)

)= κ12ω4m+ 8ω2msum

i=1

σ 2im

minus (6κ minus 2)ω4 + 2msum

i=1

σ 4im (2)

and

RV(m) minus 2mω2

radicκ12ω4m

=radic

m

(RV(m)

2mω2minus 1

)drarr N(01)

as m rarrinfin

Here we ldquodrarrrdquo to denote convergence in distribution Thus

unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)

is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim

In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that

var(RV(m)

)= 2msum

i=1

σ 4im = 2

bminus a

m

int b

aσ 4(s)ds+ o

(1

m

)

whereint b

a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)

Next we consider the estimator of Zhou (1996) given by

RV(m)AC1

equivmsum

i=1

y2im +

msumi=1

yimyiminus1m +msum

i=1

yimyi+1m (3)

This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in

much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1

i=2 y2im +

summi=2 yimyiminus1m +summminus1

i=1 yimyi+1m that estimatesint bminusδmma+δ1m

σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)

and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5

Next we formulate results for RV(m)AC1

that are similar to thosefor RV(m) in Lemma 2

Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)

AC1)= IV If Assumption 3(b) also holds then

var(RV(m)

AC1

)= 8ω4m+ 8ω2msum

i=1

σ 2im

minus 6ω4 + 6msum

i=1

σ 4im +O(mminus2)

under CTS and BTS and

RV(m)AC1

minus IVradic8ω4m

drarr N(01) as m rarrinfin

An important result of Lemma 3 is that RV(m)AC1

is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)

AC1can be rewrit-

ten in a way that does not involve squared noise terms u2im

i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)

AC1 has a smaller asymptotic vari-

ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4

vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)

AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges

in distribution to N(01) and simply write RV(m)AC1

radic

8ω4mdrarr

N(01) In other words whereas RV(m)AC1

is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin

Hansen and Lunde Realized Variance and Market Microstructure Noise 135

In the absence of market microstructure noise (ω2 = 0) wenote that

var[RV(m)

AC1

]asymp 6msum

i=1

σ 4im

which shows that the variance of RV(m)AC1

is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)

It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem

minτ1τm

msumi=1

τ 2i subject to

msumi=1

τi = c

Thus if we set τi = σ 2im and c = IV then we see that

summi=1 σ 4

im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS

summi=1 σ 4

im = IV2mand specifically it holds that

IV2

mle

int b

aσ 4(s)ds

bminus a

m

The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by

var[RV(m)

AC1

]asymp 8ω4m+ 8ω2int b

aσ 2(s)ds

minus 6ω4 + 6bminus a

m

int b

aσ 4(s)ds

Next we compare RV(m)AC1

and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators

Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2

im = IVm (BTS) The MSEs aregiven by

MSE(RV(m)

)= IV2[

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21

m

](4)

and

MSE(RV(m)

AC1

)= IV2[

8λ2m+ 8λminus 6λ2 + 61

mminus2

1

m2

] (5)

The optimal sampling frequencies for RV(m) and RV(m)AC1

are

given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0

We denote the optimal sampling frequencies for RV(m) andRV(m)

AC1by mlowast

0 and mlowast1 These are approximately given by

mlowast0 asymp (2λ)minus23 and mlowast

1 asympradic

3(2λ)minus1

The expression for mlowast0 was derived in Bandi and Russell (2005)

and Zhang et al (2005) under more general conditions than

those used in Corollary 2 whereas the expression for mlowast1 was

derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such

that

mlowast1mlowast

0 asymp 3122minus13(λminus1)13 ge 10

which shows that mlowast1 is several times larger than mlowast

0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)

AC1permits more frequent sampling than does

the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1

canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast

1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies

Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)

AC1in terms

of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)

AC1and for analyzing their (lack of )

robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and

RV(m)AC1

are proportional to the IV and given by r0(λm)IV andr1(λm)IV where

r0(λm)equivradic

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2

m

and

r1(λm)equivradic

8λ2m+ 8λminus 6λ2 + 6

mminus 2

m2

Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)

AC1dominates the RV(m) except at

the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast

0 and mlowast1 For the AA returns we find that the opti-

mal sampling frequencies are mlowast0AA = 44 and mlowast

1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast

0 and mlowast1 show that RV(m)

AC1is less sensitive than

RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of

RV(m)AC1

to that of (the optimal) RV(mlowast0) and the relative RMSE of

RV(m) to that of (the optimal) RV(mlowast

1)

AC1 These panels show that

the RV(m)AC1

continues to dominate the ldquooptimalrdquo RV(mlowast0) for a

wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast

1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast

1 are not known withcertainty The result shows that a reasonably precise estimate

136 Journal of Business amp Economic Statistics April 2006

Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-

per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise

r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1

to RV (mlowast0 ) and RV

(mlowast1)

AC1 as defined by r0(λ m)r1(λ mlowast

1) and

r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure

noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 3: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 129

2 THE THEORETICAL FRAMEWORK

We let plowast(t) denote a latent log-price process in continuoustime and use p(t) to denote the observable log-price processThus the noise process is given by

u(t)equiv p(t)minus plowast(t)

The noise process u may be due to market microstructure ef-fects such as bidndashask bounces but the discrepancy betweenp and plowast can also be induced by the technique used to con-struct p(t) For example p is often constructed artificially fromobserved transactions or quotes using the previous tick methodor the linear interpolation method which we define and discusslater in this section

We work under the following specification for the efficientprice process plowast

Assumption 1 The efficient price process satisfies dplowast(t) =σ(t)dw(t) where w(t) is a standard Brownian motion σ isa random function that is independent of w and σ 2(t) isLipschitz (almost surely)

In our analysis we condition on the volatility path σ 2(t)because our analysis focuses on estimators of the integratedvariance (IV)

IV equivint b

aσ 2(t)dt

Thus we can treat σ 2(t) as deterministic even though weview the volatility path as random The Lipschitz condition isa smoothness condition that requires |σ 2(t) minus σ 2(t + δ)| lt εδ

for some ε and all t and δ (with probability 1) The assump-tion that w and σ are independent is not essential The connec-tion between kernel-based and subsample-based estimators (seeBarndorff-Nielsen et al 2004) shows that weaker assumptionsused by Zhang et al (2005) and Zhang (2004) are sufficient inthis framework

We partition the interval [ab] into m subintervals andm plays a central role in our analysis For example we deriveasymptotic distributions of quantities as m rarrinfin This type ofinfill asymptotics is commonly used in spatial data analysis andgoes back to Stein (1987) Related to the present context is theuse of infill asymptotics for estimation of diffusions (see Bandiand Phillips 2004) For a fixed m the ith subinterval is givenby [timinus1m tim] where a = t0m lt t1m lt middot middot middot lt tmm = b Thelength of the ith subinterval is given by δim equiv tim minus timinus1m andwe assume that supi=1m δim = O( 1

m ) such that the lengthof each subinterval shrinks to 0 as m increases The intradayreturns are now defined by

ylowastim equiv plowast(tim)minus plowast(timinus1m) i = 1 m

and the increments in p and u are defined similarly and denotedby

yim equiv p(tim)minus p(timinus1m) i = 1 m

and

eim equiv u(tim)minus u(timinus1m) i = 1 m

Note that the observed intraday returns decompose into yim =ylowastim + eim The IV over each of the subintervals is definedby

σ 2im equiv

int tim

timinus1m

σ 2(s)ds i = 1 m

and we note that var(ylowastim) = E(ylowast2im) = σ 2

im under Assump-tion 1

The RV of plowast is defined by

RV(m)lowast equivmsum

i=1

ylowast2im

and RV(m)lowast is consistent for the IV as m rarrinfin (see eg Protter2005) A feasible asymptotic distribution theory of RV (in rela-tion to IV) was established by Barndorff-Nielsen and Shephard(2002) (see also Meddahi 2002 Mykland and Zhang 2006Gonccedilalves and Meddahi 2005) Whereas RV(m)lowast is an ideal es-timator it is not a feasible estimator because plowast is latent Therealized variance of p given by

RV(m) equivmsum

i=1

y2im

is observable but suffers from a well-known bias problem andis generally inconsistent for the IV (see eg Bandi and Russell2005 Zhang et al 2005)

21 Sampling Schemes

Intraday returns can be constructed using different types ofsampling schemes The special case where tim i = 1 mare equidistant in calendar time [ie δim = (bminus a)m for all i]is referred to as calendar time sampling (CTS) The widely usedexchange rates data from Olsen and associates (see Muumlller et al1990) are equidistant in time and 5-minute sampling (δim =5 min) is often used in practice

CTS requires the construction of artificial prices from theraw (irregularly spaced) price data (transaction prices or quo-tations) Given observed prices at the times t0 lt middot middot middot lt tN onecan construct a price at time τ isin [tj tj+1) using

p(τ )equiv ptj

or

p(τ )equiv ptj +τ minus tj

tj+1 minus tj

(ptj+1 minus ptj

)

The former is known as the previous tick method (Wasserfallenand Zimmermann 1985) and the latter is the linear interpola-tion method (see Andersen and Bollerslev 1997) Both methodshave been discussed by Dacorogna Gencay Muumlller Olsen andPictet (2001 sec 321) When sampling at ultra-high frequen-cies the linear interpolation method has the following unfortu-

nate property where ldquoprarrrdquo denotes convergence in probability

Lemma 1 Let N be fixed and consider the RV based onthe linear interpolation method It holds that RV(m) prarr 0 asm rarrinfin

130 Journal of Business amp Economic Statistics April 2006

The result of Lemma 1 essentially boils down to the fact thatthe quadratic variation of a straight line is zero Although thisis a limit result (as m rarrinfin) the lemma does suggest that thelinear interpolation method is not suitable for the constructionof intraday returns at high frequencies where sampling may oc-cur multiple times between two neighboring price observationsThat the result of Lemma 1 is more than a theoretical artifact isevident from the volatility signature plots of Hansen and Lunde(2003) Given the result of Lemma 1 we avoid the use of thelinear interpolation and use the previous tick method to con-struct CTS intraday returns

The case where tim denotes the time of a transactionquota-tion is referred to as tick time sampling (TTS) An example ofTTS is when tim i = 1 m are chosen to be the time ofevery fifth transaction say

The case where the sampling times t0m tmm are suchthat σ 2

im = IVm for all i = 1 m is known as businesstime sampling (BTS) (see Oomen 2006) Zhou (1998) referredto BTS intraday returns as de-volatized returns and discusseddistributional advantages of BTS returns Whereas tim i =0 m are observable under CTS and TTS they are latentunder BTS because the sampling times are defined from theunobserved volatility path Empirical results of Andersen andBollerslev (1997) and Curci and Corsi (2004) suggest that BTScan be approximated by TTS This feature is nicely capturedin the framework of Oomen (2006) where the (random) ticktimes are generated with an intensity directly related to a quan-tity corresponding to σ 2(t) in the present context Under CTSwe sometimes write RV(x sec) where x seconds is the periodin time spanned by each of the intraday returns (ie δim = xseconds) Similarly we write RV(y tick) under TTS when eachintraday return spans y ticks (transactions or quotations)

22 Characterizing the Bias of the Realized VarianceUnder General Noise

Initially we make the following assumptions about the noiseprocess u

Assumption 2 The noise process u is covariance stationarywith mean 0 such that its autocovariance function is defined byπ(s)equiv E[u(t)u(t + s)]

The covariance function π plays a key role because thebias of RV(m) is tied to the properties of π(s) in the neighbor-hood of 0 Simple examples of noise processes that satisfy As-sumption 2 include the independent noise process which hasπ(s)= 0 for all s = 0 and the OrnsteinndashUhlenbeck processThe latter was used by Aiumlt-Sahalia Mykland and Zhang(2005a) to study estimation in a parametric diffusion modelthat is robust to market microstructure noise

An important aspect of our analysis is that our assumptionsallow for a dependence between u and plowast This is a generaliza-tion of the assumptions made in the existing literature and ourempirical analysis shows that this generalization is needed inparticularly when prices are sampled from quotations

Next we characterize the RV bias under these general as-sumptions for the market microstructure noise u

Theorem 1 Given Assumptions 1 and 2 the bias of the real-ized variance under CTS is given by

E[RV(m) minus IV

]= 2ρm + 2m

[π(0)minus π

(bminus a

m

)] (1)

where ρm equiv E(summ

i=1 ylowastimeim)

The result of Theorem 1 is based on the following decompo-sition of the observed RV

RV(m) =msum

i=1

ylowast2im + 2

msumi=1

eimylowastim +msum

i=1

e2im

wheresumm

i=1 e2im is the ldquorealized variancerdquo of the noise process

u responsible for the last bias term in (1) The dependence be-tween u and plowast that is relevant for our analysis is given in theform of the correlation between the efficient intraday returnsylowastim and the return noise eim By the CauchyndashSchwarz in-equality π(0) ge π(s) for all s such that the bias is alwayspositive when the return noise process eim is uncorrelatedwith the efficient intraday returns ylowastim (because this implies thatρm = 0) Interestingly the total bias can be negative This oc-curs when ρm lt minusm[π(0)minus π(m)] which is the case wherethe downward bias (caused by a negative correlation betweeneim and ylowastim) exceeds the upward bias caused by the ldquorealizedvariancerdquo of u This appears to be the case for the RVs that arebased on quoted prices as shown in Figure 1

The last term of the bias expression in Theorem 1 shows thatthe bias is tied to the properties of π(s) in the neighborhoodof 0 and as m rarrinfin (hence δm rarr 0) we obtain the followingresult

Corollary 1 Suppose that the assumptions of Theorem 1hold and that π(s) is differentiable at 0 Then the asymptoticbias is given by

limmrarrinfinE

[RV(m) minus IV

]= 2ρ minus 2(bminus a)π prime(0)

provided that ρ equiv limmrarrinfin E(summ

i=1 ylowastimeim) is well defined

Under the independent noise assumption we can defineπ prime(0) = minusinfin which is the situation that we analyze in detailin Section 3 A related asymptotic result is obtained wheneverthe quadratic variation of the bivariate process (plowastu)prime is welldefined such that [pp] = [plowastplowast] + 2[plowastu] + [uu] where[XY] denotes the quadratic covariation In this setting we haveIV = [plowastplowast] such that

RV(m) minus IVprarr 2[plowastu] + [uu] (as m rarrinfin)

where ρ = [plowastu] and minus2(b minus a)π prime(0) = [uu] (almost surelyunder additional assumptions)

A volatility signature plot provides an easy way to visuallyinspect the potential bias problems of RV-type estimators Suchplots first appeared in an unpublished thesis by Fang (1996)and were named and made popular by Andersen et al (2000b)Let RV(m)

t denote the RV based on m intraday returns on day tA volatility signature plot displays the sample average

RV(m) equiv nminus1nsum

t=1

RV(m)t

Hansen and Lunde Realized Variance and Market Microstructure Noise 131

Figure 1 Volatility Signature Plots for RV t Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and TransactionPrices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling in contrastto the bottom rows that are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4

The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents

an approximate 95 confidence interval for the average volatility

132 Journal of Business amp Economic Statistics April 2006

as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)

Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)

t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)

ACNW30 defined in Section 42 The shaded area about

σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B

From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)

A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise

Fact I The noise is negatively correlated with the efficientreturns

Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa

m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm

is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From

cov(ylowastim emidim )= 1

2cov(ylowastim eask

im)+ 1

2cov(ylowastim ebid

im)

we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data

Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened

A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1

2 )2 + ( 12 )2 versus 12] Such dis-

crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias

Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM

3 THE CASE WITH INDEPENDENT NOISE

In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5

We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption

Assumption 3 The noise process satisfies the following

(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t

Hansen and Lunde Realized Variance and Market Microstructure Noise 133

Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA

(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t

The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes

referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise

134 Journal of Business amp Economic Statistics April 2006

Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1

The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)

Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen

var(RV(m)

)= κ12ω4m+ 8ω2msum

i=1

σ 2im

minus (6κ minus 2)ω4 + 2msum

i=1

σ 4im (2)

and

RV(m) minus 2mω2

radicκ12ω4m

=radic

m

(RV(m)

2mω2minus 1

)drarr N(01)

as m rarrinfin

Here we ldquodrarrrdquo to denote convergence in distribution Thus

unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)

is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim

In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that

var(RV(m)

)= 2msum

i=1

σ 4im = 2

bminus a

m

int b

aσ 4(s)ds+ o

(1

m

)

whereint b

a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)

Next we consider the estimator of Zhou (1996) given by

RV(m)AC1

equivmsum

i=1

y2im +

msumi=1

yimyiminus1m +msum

i=1

yimyi+1m (3)

This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in

much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1

i=2 y2im +

summi=2 yimyiminus1m +summminus1

i=1 yimyi+1m that estimatesint bminusδmma+δ1m

σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)

and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5

Next we formulate results for RV(m)AC1

that are similar to thosefor RV(m) in Lemma 2

Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)

AC1)= IV If Assumption 3(b) also holds then

var(RV(m)

AC1

)= 8ω4m+ 8ω2msum

i=1

σ 2im

minus 6ω4 + 6msum

i=1

σ 4im +O(mminus2)

under CTS and BTS and

RV(m)AC1

minus IVradic8ω4m

drarr N(01) as m rarrinfin

An important result of Lemma 3 is that RV(m)AC1

is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)

AC1can be rewrit-

ten in a way that does not involve squared noise terms u2im

i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)

AC1 has a smaller asymptotic vari-

ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4

vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)

AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges

in distribution to N(01) and simply write RV(m)AC1

radic

8ω4mdrarr

N(01) In other words whereas RV(m)AC1

is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin

Hansen and Lunde Realized Variance and Market Microstructure Noise 135

In the absence of market microstructure noise (ω2 = 0) wenote that

var[RV(m)

AC1

]asymp 6msum

i=1

σ 4im

which shows that the variance of RV(m)AC1

is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)

It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem

minτ1τm

msumi=1

τ 2i subject to

msumi=1

τi = c

Thus if we set τi = σ 2im and c = IV then we see that

summi=1 σ 4

im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS

summi=1 σ 4

im = IV2mand specifically it holds that

IV2

mle

int b

aσ 4(s)ds

bminus a

m

The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by

var[RV(m)

AC1

]asymp 8ω4m+ 8ω2int b

aσ 2(s)ds

minus 6ω4 + 6bminus a

m

int b

aσ 4(s)ds

Next we compare RV(m)AC1

and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators

Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2

im = IVm (BTS) The MSEs aregiven by

MSE(RV(m)

)= IV2[

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21

m

](4)

and

MSE(RV(m)

AC1

)= IV2[

8λ2m+ 8λminus 6λ2 + 61

mminus2

1

m2

] (5)

The optimal sampling frequencies for RV(m) and RV(m)AC1

are

given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0

We denote the optimal sampling frequencies for RV(m) andRV(m)

AC1by mlowast

0 and mlowast1 These are approximately given by

mlowast0 asymp (2λ)minus23 and mlowast

1 asympradic

3(2λ)minus1

The expression for mlowast0 was derived in Bandi and Russell (2005)

and Zhang et al (2005) under more general conditions than

those used in Corollary 2 whereas the expression for mlowast1 was

derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such

that

mlowast1mlowast

0 asymp 3122minus13(λminus1)13 ge 10

which shows that mlowast1 is several times larger than mlowast

0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)

AC1permits more frequent sampling than does

the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1

canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast

1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies

Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)

AC1in terms

of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)

AC1and for analyzing their (lack of )

robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and

RV(m)AC1

are proportional to the IV and given by r0(λm)IV andr1(λm)IV where

r0(λm)equivradic

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2

m

and

r1(λm)equivradic

8λ2m+ 8λminus 6λ2 + 6

mminus 2

m2

Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)

AC1dominates the RV(m) except at

the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast

0 and mlowast1 For the AA returns we find that the opti-

mal sampling frequencies are mlowast0AA = 44 and mlowast

1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast

0 and mlowast1 show that RV(m)

AC1is less sensitive than

RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of

RV(m)AC1

to that of (the optimal) RV(mlowast0) and the relative RMSE of

RV(m) to that of (the optimal) RV(mlowast

1)

AC1 These panels show that

the RV(m)AC1

continues to dominate the ldquooptimalrdquo RV(mlowast0) for a

wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast

1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast

1 are not known withcertainty The result shows that a reasonably precise estimate

136 Journal of Business amp Economic Statistics April 2006

Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-

per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise

r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1

to RV (mlowast0 ) and RV

(mlowast1)

AC1 as defined by r0(λ m)r1(λ mlowast

1) and

r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure

noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 4: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

130 Journal of Business amp Economic Statistics April 2006

The result of Lemma 1 essentially boils down to the fact thatthe quadratic variation of a straight line is zero Although thisis a limit result (as m rarrinfin) the lemma does suggest that thelinear interpolation method is not suitable for the constructionof intraday returns at high frequencies where sampling may oc-cur multiple times between two neighboring price observationsThat the result of Lemma 1 is more than a theoretical artifact isevident from the volatility signature plots of Hansen and Lunde(2003) Given the result of Lemma 1 we avoid the use of thelinear interpolation and use the previous tick method to con-struct CTS intraday returns

The case where tim denotes the time of a transactionquota-tion is referred to as tick time sampling (TTS) An example ofTTS is when tim i = 1 m are chosen to be the time ofevery fifth transaction say

The case where the sampling times t0m tmm are suchthat σ 2

im = IVm for all i = 1 m is known as businesstime sampling (BTS) (see Oomen 2006) Zhou (1998) referredto BTS intraday returns as de-volatized returns and discusseddistributional advantages of BTS returns Whereas tim i =0 m are observable under CTS and TTS they are latentunder BTS because the sampling times are defined from theunobserved volatility path Empirical results of Andersen andBollerslev (1997) and Curci and Corsi (2004) suggest that BTScan be approximated by TTS This feature is nicely capturedin the framework of Oomen (2006) where the (random) ticktimes are generated with an intensity directly related to a quan-tity corresponding to σ 2(t) in the present context Under CTSwe sometimes write RV(x sec) where x seconds is the periodin time spanned by each of the intraday returns (ie δim = xseconds) Similarly we write RV(y tick) under TTS when eachintraday return spans y ticks (transactions or quotations)

22 Characterizing the Bias of the Realized VarianceUnder General Noise

Initially we make the following assumptions about the noiseprocess u

Assumption 2 The noise process u is covariance stationarywith mean 0 such that its autocovariance function is defined byπ(s)equiv E[u(t)u(t + s)]

The covariance function π plays a key role because thebias of RV(m) is tied to the properties of π(s) in the neighbor-hood of 0 Simple examples of noise processes that satisfy As-sumption 2 include the independent noise process which hasπ(s)= 0 for all s = 0 and the OrnsteinndashUhlenbeck processThe latter was used by Aiumlt-Sahalia Mykland and Zhang(2005a) to study estimation in a parametric diffusion modelthat is robust to market microstructure noise

An important aspect of our analysis is that our assumptionsallow for a dependence between u and plowast This is a generaliza-tion of the assumptions made in the existing literature and ourempirical analysis shows that this generalization is needed inparticularly when prices are sampled from quotations

Next we characterize the RV bias under these general as-sumptions for the market microstructure noise u

Theorem 1 Given Assumptions 1 and 2 the bias of the real-ized variance under CTS is given by

E[RV(m) minus IV

]= 2ρm + 2m

[π(0)minus π

(bminus a

m

)] (1)

where ρm equiv E(summ

i=1 ylowastimeim)

The result of Theorem 1 is based on the following decompo-sition of the observed RV

RV(m) =msum

i=1

ylowast2im + 2

msumi=1

eimylowastim +msum

i=1

e2im

wheresumm

i=1 e2im is the ldquorealized variancerdquo of the noise process

u responsible for the last bias term in (1) The dependence be-tween u and plowast that is relevant for our analysis is given in theform of the correlation between the efficient intraday returnsylowastim and the return noise eim By the CauchyndashSchwarz in-equality π(0) ge π(s) for all s such that the bias is alwayspositive when the return noise process eim is uncorrelatedwith the efficient intraday returns ylowastim (because this implies thatρm = 0) Interestingly the total bias can be negative This oc-curs when ρm lt minusm[π(0)minus π(m)] which is the case wherethe downward bias (caused by a negative correlation betweeneim and ylowastim) exceeds the upward bias caused by the ldquorealizedvariancerdquo of u This appears to be the case for the RVs that arebased on quoted prices as shown in Figure 1

The last term of the bias expression in Theorem 1 shows thatthe bias is tied to the properties of π(s) in the neighborhoodof 0 and as m rarrinfin (hence δm rarr 0) we obtain the followingresult

Corollary 1 Suppose that the assumptions of Theorem 1hold and that π(s) is differentiable at 0 Then the asymptoticbias is given by

limmrarrinfinE

[RV(m) minus IV

]= 2ρ minus 2(bminus a)π prime(0)

provided that ρ equiv limmrarrinfin E(summ

i=1 ylowastimeim) is well defined

Under the independent noise assumption we can defineπ prime(0) = minusinfin which is the situation that we analyze in detailin Section 3 A related asymptotic result is obtained wheneverthe quadratic variation of the bivariate process (plowastu)prime is welldefined such that [pp] = [plowastplowast] + 2[plowastu] + [uu] where[XY] denotes the quadratic covariation In this setting we haveIV = [plowastplowast] such that

RV(m) minus IVprarr 2[plowastu] + [uu] (as m rarrinfin)

where ρ = [plowastu] and minus2(b minus a)π prime(0) = [uu] (almost surelyunder additional assumptions)

A volatility signature plot provides an easy way to visuallyinspect the potential bias problems of RV-type estimators Suchplots first appeared in an unpublished thesis by Fang (1996)and were named and made popular by Andersen et al (2000b)Let RV(m)

t denote the RV based on m intraday returns on day tA volatility signature plot displays the sample average

RV(m) equiv nminus1nsum

t=1

RV(m)t

Hansen and Lunde Realized Variance and Market Microstructure Noise 131

Figure 1 Volatility Signature Plots for RV t Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and TransactionPrices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling in contrastto the bottom rows that are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4

The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents

an approximate 95 confidence interval for the average volatility

132 Journal of Business amp Economic Statistics April 2006

as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)

Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)

t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)

ACNW30 defined in Section 42 The shaded area about

σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B

From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)

A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise

Fact I The noise is negatively correlated with the efficientreturns

Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa

m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm

is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From

cov(ylowastim emidim )= 1

2cov(ylowastim eask

im)+ 1

2cov(ylowastim ebid

im)

we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data

Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened

A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1

2 )2 + ( 12 )2 versus 12] Such dis-

crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias

Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM

3 THE CASE WITH INDEPENDENT NOISE

In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5

We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption

Assumption 3 The noise process satisfies the following

(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t

Hansen and Lunde Realized Variance and Market Microstructure Noise 133

Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA

(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t

The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes

referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise

134 Journal of Business amp Economic Statistics April 2006

Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1

The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)

Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen

var(RV(m)

)= κ12ω4m+ 8ω2msum

i=1

σ 2im

minus (6κ minus 2)ω4 + 2msum

i=1

σ 4im (2)

and

RV(m) minus 2mω2

radicκ12ω4m

=radic

m

(RV(m)

2mω2minus 1

)drarr N(01)

as m rarrinfin

Here we ldquodrarrrdquo to denote convergence in distribution Thus

unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)

is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim

In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that

var(RV(m)

)= 2msum

i=1

σ 4im = 2

bminus a

m

int b

aσ 4(s)ds+ o

(1

m

)

whereint b

a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)

Next we consider the estimator of Zhou (1996) given by

RV(m)AC1

equivmsum

i=1

y2im +

msumi=1

yimyiminus1m +msum

i=1

yimyi+1m (3)

This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in

much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1

i=2 y2im +

summi=2 yimyiminus1m +summminus1

i=1 yimyi+1m that estimatesint bminusδmma+δ1m

σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)

and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5

Next we formulate results for RV(m)AC1

that are similar to thosefor RV(m) in Lemma 2

Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)

AC1)= IV If Assumption 3(b) also holds then

var(RV(m)

AC1

)= 8ω4m+ 8ω2msum

i=1

σ 2im

minus 6ω4 + 6msum

i=1

σ 4im +O(mminus2)

under CTS and BTS and

RV(m)AC1

minus IVradic8ω4m

drarr N(01) as m rarrinfin

An important result of Lemma 3 is that RV(m)AC1

is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)

AC1can be rewrit-

ten in a way that does not involve squared noise terms u2im

i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)

AC1 has a smaller asymptotic vari-

ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4

vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)

AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges

in distribution to N(01) and simply write RV(m)AC1

radic

8ω4mdrarr

N(01) In other words whereas RV(m)AC1

is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin

Hansen and Lunde Realized Variance and Market Microstructure Noise 135

In the absence of market microstructure noise (ω2 = 0) wenote that

var[RV(m)

AC1

]asymp 6msum

i=1

σ 4im

which shows that the variance of RV(m)AC1

is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)

It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem

minτ1τm

msumi=1

τ 2i subject to

msumi=1

τi = c

Thus if we set τi = σ 2im and c = IV then we see that

summi=1 σ 4

im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS

summi=1 σ 4

im = IV2mand specifically it holds that

IV2

mle

int b

aσ 4(s)ds

bminus a

m

The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by

var[RV(m)

AC1

]asymp 8ω4m+ 8ω2int b

aσ 2(s)ds

minus 6ω4 + 6bminus a

m

int b

aσ 4(s)ds

Next we compare RV(m)AC1

and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators

Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2

im = IVm (BTS) The MSEs aregiven by

MSE(RV(m)

)= IV2[

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21

m

](4)

and

MSE(RV(m)

AC1

)= IV2[

8λ2m+ 8λminus 6λ2 + 61

mminus2

1

m2

] (5)

The optimal sampling frequencies for RV(m) and RV(m)AC1

are

given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0

We denote the optimal sampling frequencies for RV(m) andRV(m)

AC1by mlowast

0 and mlowast1 These are approximately given by

mlowast0 asymp (2λ)minus23 and mlowast

1 asympradic

3(2λ)minus1

The expression for mlowast0 was derived in Bandi and Russell (2005)

and Zhang et al (2005) under more general conditions than

those used in Corollary 2 whereas the expression for mlowast1 was

derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such

that

mlowast1mlowast

0 asymp 3122minus13(λminus1)13 ge 10

which shows that mlowast1 is several times larger than mlowast

0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)

AC1permits more frequent sampling than does

the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1

canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast

1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies

Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)

AC1in terms

of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)

AC1and for analyzing their (lack of )

robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and

RV(m)AC1

are proportional to the IV and given by r0(λm)IV andr1(λm)IV where

r0(λm)equivradic

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2

m

and

r1(λm)equivradic

8λ2m+ 8λminus 6λ2 + 6

mminus 2

m2

Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)

AC1dominates the RV(m) except at

the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast

0 and mlowast1 For the AA returns we find that the opti-

mal sampling frequencies are mlowast0AA = 44 and mlowast

1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast

0 and mlowast1 show that RV(m)

AC1is less sensitive than

RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of

RV(m)AC1

to that of (the optimal) RV(mlowast0) and the relative RMSE of

RV(m) to that of (the optimal) RV(mlowast

1)

AC1 These panels show that

the RV(m)AC1

continues to dominate the ldquooptimalrdquo RV(mlowast0) for a

wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast

1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast

1 are not known withcertainty The result shows that a reasonably precise estimate

136 Journal of Business amp Economic Statistics April 2006

Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-

per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise

r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1

to RV (mlowast0 ) and RV

(mlowast1)

AC1 as defined by r0(λ m)r1(λ mlowast

1) and

r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure

noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 5: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 131

Figure 1 Volatility Signature Plots for RV t Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and TransactionPrices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling in contrastto the bottom rows that are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4

The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents

an approximate 95 confidence interval for the average volatility

132 Journal of Business amp Economic Statistics April 2006

as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)

Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)

t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)

ACNW30 defined in Section 42 The shaded area about

σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B

From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)

A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise

Fact I The noise is negatively correlated with the efficientreturns

Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa

m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm

is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From

cov(ylowastim emidim )= 1

2cov(ylowastim eask

im)+ 1

2cov(ylowastim ebid

im)

we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data

Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened

A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1

2 )2 + ( 12 )2 versus 12] Such dis-

crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias

Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM

3 THE CASE WITH INDEPENDENT NOISE

In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5

We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption

Assumption 3 The noise process satisfies the following

(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t

Hansen and Lunde Realized Variance and Market Microstructure Noise 133

Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA

(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t

The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes

referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise

134 Journal of Business amp Economic Statistics April 2006

Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1

The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)

Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen

var(RV(m)

)= κ12ω4m+ 8ω2msum

i=1

σ 2im

minus (6κ minus 2)ω4 + 2msum

i=1

σ 4im (2)

and

RV(m) minus 2mω2

radicκ12ω4m

=radic

m

(RV(m)

2mω2minus 1

)drarr N(01)

as m rarrinfin

Here we ldquodrarrrdquo to denote convergence in distribution Thus

unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)

is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim

In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that

var(RV(m)

)= 2msum

i=1

σ 4im = 2

bminus a

m

int b

aσ 4(s)ds+ o

(1

m

)

whereint b

a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)

Next we consider the estimator of Zhou (1996) given by

RV(m)AC1

equivmsum

i=1

y2im +

msumi=1

yimyiminus1m +msum

i=1

yimyi+1m (3)

This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in

much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1

i=2 y2im +

summi=2 yimyiminus1m +summminus1

i=1 yimyi+1m that estimatesint bminusδmma+δ1m

σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)

and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5

Next we formulate results for RV(m)AC1

that are similar to thosefor RV(m) in Lemma 2

Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)

AC1)= IV If Assumption 3(b) also holds then

var(RV(m)

AC1

)= 8ω4m+ 8ω2msum

i=1

σ 2im

minus 6ω4 + 6msum

i=1

σ 4im +O(mminus2)

under CTS and BTS and

RV(m)AC1

minus IVradic8ω4m

drarr N(01) as m rarrinfin

An important result of Lemma 3 is that RV(m)AC1

is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)

AC1can be rewrit-

ten in a way that does not involve squared noise terms u2im

i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)

AC1 has a smaller asymptotic vari-

ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4

vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)

AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges

in distribution to N(01) and simply write RV(m)AC1

radic

8ω4mdrarr

N(01) In other words whereas RV(m)AC1

is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin

Hansen and Lunde Realized Variance and Market Microstructure Noise 135

In the absence of market microstructure noise (ω2 = 0) wenote that

var[RV(m)

AC1

]asymp 6msum

i=1

σ 4im

which shows that the variance of RV(m)AC1

is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)

It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem

minτ1τm

msumi=1

τ 2i subject to

msumi=1

τi = c

Thus if we set τi = σ 2im and c = IV then we see that

summi=1 σ 4

im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS

summi=1 σ 4

im = IV2mand specifically it holds that

IV2

mle

int b

aσ 4(s)ds

bminus a

m

The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by

var[RV(m)

AC1

]asymp 8ω4m+ 8ω2int b

aσ 2(s)ds

minus 6ω4 + 6bminus a

m

int b

aσ 4(s)ds

Next we compare RV(m)AC1

and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators

Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2

im = IVm (BTS) The MSEs aregiven by

MSE(RV(m)

)= IV2[

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21

m

](4)

and

MSE(RV(m)

AC1

)= IV2[

8λ2m+ 8λminus 6λ2 + 61

mminus2

1

m2

] (5)

The optimal sampling frequencies for RV(m) and RV(m)AC1

are

given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0

We denote the optimal sampling frequencies for RV(m) andRV(m)

AC1by mlowast

0 and mlowast1 These are approximately given by

mlowast0 asymp (2λ)minus23 and mlowast

1 asympradic

3(2λ)minus1

The expression for mlowast0 was derived in Bandi and Russell (2005)

and Zhang et al (2005) under more general conditions than

those used in Corollary 2 whereas the expression for mlowast1 was

derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such

that

mlowast1mlowast

0 asymp 3122minus13(λminus1)13 ge 10

which shows that mlowast1 is several times larger than mlowast

0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)

AC1permits more frequent sampling than does

the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1

canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast

1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies

Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)

AC1in terms

of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)

AC1and for analyzing their (lack of )

robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and

RV(m)AC1

are proportional to the IV and given by r0(λm)IV andr1(λm)IV where

r0(λm)equivradic

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2

m

and

r1(λm)equivradic

8λ2m+ 8λminus 6λ2 + 6

mminus 2

m2

Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)

AC1dominates the RV(m) except at

the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast

0 and mlowast1 For the AA returns we find that the opti-

mal sampling frequencies are mlowast0AA = 44 and mlowast

1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast

0 and mlowast1 show that RV(m)

AC1is less sensitive than

RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of

RV(m)AC1

to that of (the optimal) RV(mlowast0) and the relative RMSE of

RV(m) to that of (the optimal) RV(mlowast

1)

AC1 These panels show that

the RV(m)AC1

continues to dominate the ldquooptimalrdquo RV(mlowast0) for a

wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast

1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast

1 are not known withcertainty The result shows that a reasonably precise estimate

136 Journal of Business amp Economic Statistics April 2006

Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-

per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise

r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1

to RV (mlowast0 ) and RV

(mlowast1)

AC1 as defined by r0(λ m)r1(λ mlowast

1) and

r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure

noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 6: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

132 Journal of Business amp Economic Statistics April 2006

as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)

Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)

t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)

ACNW30 defined in Section 42 The shaded area about

σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B

From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)

A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise

Fact I The noise is negatively correlated with the efficientreturns

Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa

m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm

is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From

cov(ylowastim emidim )= 1

2cov(ylowastim eask

im)+ 1

2cov(ylowastim ebid

im)

we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data

Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened

A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1

2 )2 + ( 12 )2 versus 12] Such dis-

crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias

Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM

3 THE CASE WITH INDEPENDENT NOISE

In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5

We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption

Assumption 3 The noise process satisfies the following

(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t

Hansen and Lunde Realized Variance and Market Microstructure Noise 133

Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA

(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t

The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes

referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise

134 Journal of Business amp Economic Statistics April 2006

Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1

The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)

Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen

var(RV(m)

)= κ12ω4m+ 8ω2msum

i=1

σ 2im

minus (6κ minus 2)ω4 + 2msum

i=1

σ 4im (2)

and

RV(m) minus 2mω2

radicκ12ω4m

=radic

m

(RV(m)

2mω2minus 1

)drarr N(01)

as m rarrinfin

Here we ldquodrarrrdquo to denote convergence in distribution Thus

unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)

is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim

In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that

var(RV(m)

)= 2msum

i=1

σ 4im = 2

bminus a

m

int b

aσ 4(s)ds+ o

(1

m

)

whereint b

a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)

Next we consider the estimator of Zhou (1996) given by

RV(m)AC1

equivmsum

i=1

y2im +

msumi=1

yimyiminus1m +msum

i=1

yimyi+1m (3)

This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in

much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1

i=2 y2im +

summi=2 yimyiminus1m +summminus1

i=1 yimyi+1m that estimatesint bminusδmma+δ1m

σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)

and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5

Next we formulate results for RV(m)AC1

that are similar to thosefor RV(m) in Lemma 2

Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)

AC1)= IV If Assumption 3(b) also holds then

var(RV(m)

AC1

)= 8ω4m+ 8ω2msum

i=1

σ 2im

minus 6ω4 + 6msum

i=1

σ 4im +O(mminus2)

under CTS and BTS and

RV(m)AC1

minus IVradic8ω4m

drarr N(01) as m rarrinfin

An important result of Lemma 3 is that RV(m)AC1

is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)

AC1can be rewrit-

ten in a way that does not involve squared noise terms u2im

i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)

AC1 has a smaller asymptotic vari-

ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4

vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)

AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges

in distribution to N(01) and simply write RV(m)AC1

radic

8ω4mdrarr

N(01) In other words whereas RV(m)AC1

is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin

Hansen and Lunde Realized Variance and Market Microstructure Noise 135

In the absence of market microstructure noise (ω2 = 0) wenote that

var[RV(m)

AC1

]asymp 6msum

i=1

σ 4im

which shows that the variance of RV(m)AC1

is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)

It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem

minτ1τm

msumi=1

τ 2i subject to

msumi=1

τi = c

Thus if we set τi = σ 2im and c = IV then we see that

summi=1 σ 4

im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS

summi=1 σ 4

im = IV2mand specifically it holds that

IV2

mle

int b

aσ 4(s)ds

bminus a

m

The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by

var[RV(m)

AC1

]asymp 8ω4m+ 8ω2int b

aσ 2(s)ds

minus 6ω4 + 6bminus a

m

int b

aσ 4(s)ds

Next we compare RV(m)AC1

and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators

Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2

im = IVm (BTS) The MSEs aregiven by

MSE(RV(m)

)= IV2[

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21

m

](4)

and

MSE(RV(m)

AC1

)= IV2[

8λ2m+ 8λminus 6λ2 + 61

mminus2

1

m2

] (5)

The optimal sampling frequencies for RV(m) and RV(m)AC1

are

given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0

We denote the optimal sampling frequencies for RV(m) andRV(m)

AC1by mlowast

0 and mlowast1 These are approximately given by

mlowast0 asymp (2λ)minus23 and mlowast

1 asympradic

3(2λ)minus1

The expression for mlowast0 was derived in Bandi and Russell (2005)

and Zhang et al (2005) under more general conditions than

those used in Corollary 2 whereas the expression for mlowast1 was

derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such

that

mlowast1mlowast

0 asymp 3122minus13(λminus1)13 ge 10

which shows that mlowast1 is several times larger than mlowast

0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)

AC1permits more frequent sampling than does

the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1

canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast

1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies

Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)

AC1in terms

of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)

AC1and for analyzing their (lack of )

robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and

RV(m)AC1

are proportional to the IV and given by r0(λm)IV andr1(λm)IV where

r0(λm)equivradic

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2

m

and

r1(λm)equivradic

8λ2m+ 8λminus 6λ2 + 6

mminus 2

m2

Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)

AC1dominates the RV(m) except at

the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast

0 and mlowast1 For the AA returns we find that the opti-

mal sampling frequencies are mlowast0AA = 44 and mlowast

1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast

0 and mlowast1 show that RV(m)

AC1is less sensitive than

RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of

RV(m)AC1

to that of (the optimal) RV(mlowast0) and the relative RMSE of

RV(m) to that of (the optimal) RV(mlowast

1)

AC1 These panels show that

the RV(m)AC1

continues to dominate the ldquooptimalrdquo RV(mlowast0) for a

wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast

1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast

1 are not known withcertainty The result shows that a reasonably precise estimate

136 Journal of Business amp Economic Statistics April 2006

Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-

per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise

r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1

to RV (mlowast0 ) and RV

(mlowast1)

AC1 as defined by r0(λ m)r1(λ mlowast

1) and

r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure

noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 7: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 133

Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA

(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t

The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes

referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise

134 Journal of Business amp Economic Statistics April 2006

Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1

The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)

Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen

var(RV(m)

)= κ12ω4m+ 8ω2msum

i=1

σ 2im

minus (6κ minus 2)ω4 + 2msum

i=1

σ 4im (2)

and

RV(m) minus 2mω2

radicκ12ω4m

=radic

m

(RV(m)

2mω2minus 1

)drarr N(01)

as m rarrinfin

Here we ldquodrarrrdquo to denote convergence in distribution Thus

unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)

is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim

In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that

var(RV(m)

)= 2msum

i=1

σ 4im = 2

bminus a

m

int b

aσ 4(s)ds+ o

(1

m

)

whereint b

a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)

Next we consider the estimator of Zhou (1996) given by

RV(m)AC1

equivmsum

i=1

y2im +

msumi=1

yimyiminus1m +msum

i=1

yimyi+1m (3)

This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in

much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1

i=2 y2im +

summi=2 yimyiminus1m +summminus1

i=1 yimyi+1m that estimatesint bminusδmma+δ1m

σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)

and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5

Next we formulate results for RV(m)AC1

that are similar to thosefor RV(m) in Lemma 2

Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)

AC1)= IV If Assumption 3(b) also holds then

var(RV(m)

AC1

)= 8ω4m+ 8ω2msum

i=1

σ 2im

minus 6ω4 + 6msum

i=1

σ 4im +O(mminus2)

under CTS and BTS and

RV(m)AC1

minus IVradic8ω4m

drarr N(01) as m rarrinfin

An important result of Lemma 3 is that RV(m)AC1

is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)

AC1can be rewrit-

ten in a way that does not involve squared noise terms u2im

i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)

AC1 has a smaller asymptotic vari-

ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4

vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)

AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges

in distribution to N(01) and simply write RV(m)AC1

radic

8ω4mdrarr

N(01) In other words whereas RV(m)AC1

is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin

Hansen and Lunde Realized Variance and Market Microstructure Noise 135

In the absence of market microstructure noise (ω2 = 0) wenote that

var[RV(m)

AC1

]asymp 6msum

i=1

σ 4im

which shows that the variance of RV(m)AC1

is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)

It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem

minτ1τm

msumi=1

τ 2i subject to

msumi=1

τi = c

Thus if we set τi = σ 2im and c = IV then we see that

summi=1 σ 4

im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS

summi=1 σ 4

im = IV2mand specifically it holds that

IV2

mle

int b

aσ 4(s)ds

bminus a

m

The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by

var[RV(m)

AC1

]asymp 8ω4m+ 8ω2int b

aσ 2(s)ds

minus 6ω4 + 6bminus a

m

int b

aσ 4(s)ds

Next we compare RV(m)AC1

and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators

Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2

im = IVm (BTS) The MSEs aregiven by

MSE(RV(m)

)= IV2[

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21

m

](4)

and

MSE(RV(m)

AC1

)= IV2[

8λ2m+ 8λminus 6λ2 + 61

mminus2

1

m2

] (5)

The optimal sampling frequencies for RV(m) and RV(m)AC1

are

given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0

We denote the optimal sampling frequencies for RV(m) andRV(m)

AC1by mlowast

0 and mlowast1 These are approximately given by

mlowast0 asymp (2λ)minus23 and mlowast

1 asympradic

3(2λ)minus1

The expression for mlowast0 was derived in Bandi and Russell (2005)

and Zhang et al (2005) under more general conditions than

those used in Corollary 2 whereas the expression for mlowast1 was

derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such

that

mlowast1mlowast

0 asymp 3122minus13(λminus1)13 ge 10

which shows that mlowast1 is several times larger than mlowast

0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)

AC1permits more frequent sampling than does

the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1

canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast

1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies

Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)

AC1in terms

of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)

AC1and for analyzing their (lack of )

robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and

RV(m)AC1

are proportional to the IV and given by r0(λm)IV andr1(λm)IV where

r0(λm)equivradic

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2

m

and

r1(λm)equivradic

8λ2m+ 8λminus 6λ2 + 6

mminus 2

m2

Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)

AC1dominates the RV(m) except at

the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast

0 and mlowast1 For the AA returns we find that the opti-

mal sampling frequencies are mlowast0AA = 44 and mlowast

1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast

0 and mlowast1 show that RV(m)

AC1is less sensitive than

RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of

RV(m)AC1

to that of (the optimal) RV(mlowast0) and the relative RMSE of

RV(m) to that of (the optimal) RV(mlowast

1)

AC1 These panels show that

the RV(m)AC1

continues to dominate the ldquooptimalrdquo RV(mlowast0) for a

wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast

1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast

1 are not known withcertainty The result shows that a reasonably precise estimate

136 Journal of Business amp Economic Statistics April 2006

Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-

per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise

r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1

to RV (mlowast0 ) and RV

(mlowast1)

AC1 as defined by r0(λ m)r1(λ mlowast

1) and

r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure

noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 8: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

134 Journal of Business amp Economic Statistics April 2006

Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1

The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)

Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen

var(RV(m)

)= κ12ω4m+ 8ω2msum

i=1

σ 2im

minus (6κ minus 2)ω4 + 2msum

i=1

σ 4im (2)

and

RV(m) minus 2mω2

radicκ12ω4m

=radic

m

(RV(m)

2mω2minus 1

)drarr N(01)

as m rarrinfin

Here we ldquodrarrrdquo to denote convergence in distribution Thus

unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)

is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim

In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that

var(RV(m)

)= 2msum

i=1

σ 4im = 2

bminus a

m

int b

aσ 4(s)ds+ o

(1

m

)

whereint b

a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)

Next we consider the estimator of Zhou (1996) given by

RV(m)AC1

equivmsum

i=1

y2im +

msumi=1

yimyiminus1m +msum

i=1

yimyi+1m (3)

This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in

much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1

i=2 y2im +

summi=2 yimyiminus1m +summminus1

i=1 yimyi+1m that estimatesint bminusδmma+δ1m

σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)

and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5

Next we formulate results for RV(m)AC1

that are similar to thosefor RV(m) in Lemma 2

Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)

AC1)= IV If Assumption 3(b) also holds then

var(RV(m)

AC1

)= 8ω4m+ 8ω2msum

i=1

σ 2im

minus 6ω4 + 6msum

i=1

σ 4im +O(mminus2)

under CTS and BTS and

RV(m)AC1

minus IVradic8ω4m

drarr N(01) as m rarrinfin

An important result of Lemma 3 is that RV(m)AC1

is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)

AC1can be rewrit-

ten in a way that does not involve squared noise terms u2im

i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)

AC1 has a smaller asymptotic vari-

ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4

vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)

AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges

in distribution to N(01) and simply write RV(m)AC1

radic

8ω4mdrarr

N(01) In other words whereas RV(m)AC1

is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin

Hansen and Lunde Realized Variance and Market Microstructure Noise 135

In the absence of market microstructure noise (ω2 = 0) wenote that

var[RV(m)

AC1

]asymp 6msum

i=1

σ 4im

which shows that the variance of RV(m)AC1

is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)

It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem

minτ1τm

msumi=1

τ 2i subject to

msumi=1

τi = c

Thus if we set τi = σ 2im and c = IV then we see that

summi=1 σ 4

im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS

summi=1 σ 4

im = IV2mand specifically it holds that

IV2

mle

int b

aσ 4(s)ds

bminus a

m

The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by

var[RV(m)

AC1

]asymp 8ω4m+ 8ω2int b

aσ 2(s)ds

minus 6ω4 + 6bminus a

m

int b

aσ 4(s)ds

Next we compare RV(m)AC1

and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators

Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2

im = IVm (BTS) The MSEs aregiven by

MSE(RV(m)

)= IV2[

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21

m

](4)

and

MSE(RV(m)

AC1

)= IV2[

8λ2m+ 8λminus 6λ2 + 61

mminus2

1

m2

] (5)

The optimal sampling frequencies for RV(m) and RV(m)AC1

are

given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0

We denote the optimal sampling frequencies for RV(m) andRV(m)

AC1by mlowast

0 and mlowast1 These are approximately given by

mlowast0 asymp (2λ)minus23 and mlowast

1 asympradic

3(2λ)minus1

The expression for mlowast0 was derived in Bandi and Russell (2005)

and Zhang et al (2005) under more general conditions than

those used in Corollary 2 whereas the expression for mlowast1 was

derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such

that

mlowast1mlowast

0 asymp 3122minus13(λminus1)13 ge 10

which shows that mlowast1 is several times larger than mlowast

0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)

AC1permits more frequent sampling than does

the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1

canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast

1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies

Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)

AC1in terms

of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)

AC1and for analyzing their (lack of )

robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and

RV(m)AC1

are proportional to the IV and given by r0(λm)IV andr1(λm)IV where

r0(λm)equivradic

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2

m

and

r1(λm)equivradic

8λ2m+ 8λminus 6λ2 + 6

mminus 2

m2

Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)

AC1dominates the RV(m) except at

the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast

0 and mlowast1 For the AA returns we find that the opti-

mal sampling frequencies are mlowast0AA = 44 and mlowast

1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast

0 and mlowast1 show that RV(m)

AC1is less sensitive than

RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of

RV(m)AC1

to that of (the optimal) RV(mlowast0) and the relative RMSE of

RV(m) to that of (the optimal) RV(mlowast

1)

AC1 These panels show that

the RV(m)AC1

continues to dominate the ldquooptimalrdquo RV(mlowast0) for a

wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast

1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast

1 are not known withcertainty The result shows that a reasonably precise estimate

136 Journal of Business amp Economic Statistics April 2006

Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-

per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise

r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1

to RV (mlowast0 ) and RV

(mlowast1)

AC1 as defined by r0(λ m)r1(λ mlowast

1) and

r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure

noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 9: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 135

In the absence of market microstructure noise (ω2 = 0) wenote that

var[RV(m)

AC1

]asymp 6msum

i=1

σ 4im

which shows that the variance of RV(m)AC1

is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)

It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem

minτ1τm

msumi=1

τ 2i subject to

msumi=1

τi = c

Thus if we set τi = σ 2im and c = IV then we see that

summi=1 σ 4

im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS

summi=1 σ 4

im = IV2mand specifically it holds that

IV2

mle

int b

aσ 4(s)ds

bminus a

m

The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by

var[RV(m)

AC1

]asymp 8ω4m+ 8ω2int b

aσ 2(s)ds

minus 6ω4 + 6bminus a

m

int b

aσ 4(s)ds

Next we compare RV(m)AC1

and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators

Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2

im = IVm (BTS) The MSEs aregiven by

MSE(RV(m)

)= IV2[

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21

m

](4)

and

MSE(RV(m)

AC1

)= IV2[

8λ2m+ 8λminus 6λ2 + 61

mminus2

1

m2

] (5)

The optimal sampling frequencies for RV(m) and RV(m)AC1

are

given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0

We denote the optimal sampling frequencies for RV(m) andRV(m)

AC1by mlowast

0 and mlowast1 These are approximately given by

mlowast0 asymp (2λ)minus23 and mlowast

1 asympradic

3(2λ)minus1

The expression for mlowast0 was derived in Bandi and Russell (2005)

and Zhang et al (2005) under more general conditions than

those used in Corollary 2 whereas the expression for mlowast1 was

derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such

that

mlowast1mlowast

0 asymp 3122minus13(λminus1)13 ge 10

which shows that mlowast1 is several times larger than mlowast

0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)

AC1permits more frequent sampling than does

the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1

canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast

1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies

Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)

AC1in terms

of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)

AC1and for analyzing their (lack of )

robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and

RV(m)AC1

are proportional to the IV and given by r0(λm)IV andr1(λm)IV where

r0(λm)equivradic

4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2

m

and

r1(λm)equivradic

8λ2m+ 8λminus 6λ2 + 6

mminus 2

m2

Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)

AC1dominates the RV(m) except at

the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast

0 and mlowast1 For the AA returns we find that the opti-

mal sampling frequencies are mlowast0AA = 44 and mlowast

1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast

0 and mlowast1 show that RV(m)

AC1is less sensitive than

RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of

RV(m)AC1

to that of (the optimal) RV(mlowast0) and the relative RMSE of

RV(m) to that of (the optimal) RV(mlowast

1)

AC1 These panels show that

the RV(m)AC1

continues to dominate the ldquooptimalrdquo RV(mlowast0) for a

wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast

1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast

1 are not known withcertainty The result shows that a reasonably precise estimate

136 Journal of Business amp Economic Statistics April 2006

Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-

per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise

r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1

to RV (mlowast0 ) and RV

(mlowast1)

AC1 as defined by r0(λ m)r1(λ mlowast

1) and

r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure

noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 10: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

136 Journal of Business amp Economic Statistics April 2006

Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-

per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise

r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1

to RV (mlowast0 ) and RV

(mlowast1)

AC1 as defined by r0(λ m)r1(λ mlowast

1) and

r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure

noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 11: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 137

of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV

This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)

A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)

AC1 using both

λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)

and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)

AC1) The ver-

tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)

AC1are 941 and 307 Thus a ldquono-noiserdquo confidence

interval about RV(m)AC1

is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size

Figure 4 presents the volatility signature plots for RV(m)AC1

where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)

AC1when intraday returns are sampled more

frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)

AC1at the highest frequencies which shows

that the noise is time dependent in tick time For example the

MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer

Fact II The noise is autocorrelated

We provide additional evidence of this fact based on otherempirical quantities in the following sections

4 THE CASE WITH DEPENDENT NOISE

In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time

41 Dependence in Calender Time

To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process

Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0

The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t

tminusθ0ψ(t minus s)dB(s) where B(s) rep-

resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0

s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]

Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)

E(RV(m)

ACqmminus IV

)= 0

where

RV(m)ACqm

equivmsum

i=1

y2im +

qmsumh=1

msumi=1

(yiminushmyim + yimyi+hm)

A drawback of RV(m)ACqm

is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 12: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

138 Journal of Business amp Economic Statistics April 2006

Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction

Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV

(1 tick)ACNW30

that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 13: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 139

ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator

In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)

ACmore comparable across different frequencies m Thus we setqm = w

(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)

ACwin place of

RV(m)ACqm

Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms

When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)

ACqmcannot be consistent for IV This property is common

for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2

im) = 2σ 4im and

var(yimyi+hm)= σ 2imσ 2

i+hm asymp σ 4im such that

var[RV(m)

ACqm

] asymp 2msum

i=1

σ 4im +

qmsumh=1

(2)2msum

i=1

σ 4im

= 2(1+ 2qm)

msumi=1

σ 4im

which approximately equals

2(1+ 2qm)bminus a

m

int 1

0σ 4(s)ds

under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0

The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)

ACq Here we sample intraday returns

every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)

ACqof the four price series differ and have not leveled off

is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short

as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time

42 Time Dependence in Tick Time

When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)

The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns

Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that

ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1

such that

E(e2i )= α2(σ 2

i + σ 2iminus1)+ 2ω2

and

E(eiylowasti )= ασ 2

i

wheresumm

i=1 σ 2i = IV Thus

E[RV(1 tick)

]= IV + 2α(1+ α)IV + 2mω2

with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have

E(eieiminus1)=minusα2σ 2iminus1 minusω2

and

E(eiylowastiminus1)=minusασ 2

iminus1

such thatmsum

i=1

E[yiyi+1] = minusα2IV minus 2mω2 minus αIV

which shows that RV(1 tick)AC1

is almost unbiased for the IV

In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)

ACq with a q sufficiently large to capture the

time dependenceAssumption 4 and Theorem 2 are formulated for the case

with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 14: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

140 Journal of Business amp Economic Statistics April 2006

Figure 5 Volatility Signature Plots of RV (1 sec)ACq

(four upper panels) and RV (1 tick)ACq

(four lower panels) for Each of the Price Series Ask

Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms

q included in RV (1 sec)ACq

The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those

for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30

that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 15: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 141

signature plots for RV(1 tick)ACq

where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)

ACqagainst q for MSFT in the year 2000

In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)

ACq robust

to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)

ACq

this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator

Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns

ytiti+k equiv pti+k minus pti

Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity

RV(k tick)AC1

=sum

iisin0k2kNminusk

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

)

(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)

AC1 proposed by

Zhou (1996) can be expressed as

1

k

Nminus1sumi=0

(y2

titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k

) (6)

Thus for k = 2 we have

1

2

Nsumi=1

(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)

+ (yi + yi+1)(yi+2 + yi+3)

where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by

1

2

Nsumi=1

2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3

= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1

2(γminus3 + γ3)

where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can

show that this subsample estimator is approximately given by

RV(1 tick)ACNWk

equiv γ0 +ksum

j=1

(γminusj + γj)+ksum

j=1

k minus j

k(γminusjminusk + γj+k)

Thus this estimator equals the RV(1 tick)ACk

plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k

N )+O( 1k )+O( N

k2 ) (assuming constant volatility) This term

is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)

5 EMPIRICAL ANALYSIS

We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database

We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns

Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 16: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

142 Journal of Business amp Economic Statistics April 2006

Tabl

e1

Equ

ityD

ata

Sum

mar

yS

tatis

tics

Tran

sact

ions

day

Quo

tes

day

No

ofda

ysS

ymbo

lN

ame

Exc

hang

e20

0020

0120

0220

0320

0420

0020

0120

0220

0320

04

AA

Alc

oaN

YS

E99

7 (41

)1

479 (

50)

189

8 (50

)2

153 (

45)

315

0 (43

)1

384 (

33)

187

5 (44

)2

775 (

38)

530

9 (32

)7

560 (

34)

122

3A

XP

Am

eric

anE

xpre

ssN

YS

E1

584 (

45)

244

9 (54

)2

791 (

53)

337

1 (52

)2

752 (

44)

200

4 (42

)3

123 (

46)

374

5 (41

)7

293 (

43)

746

4 (34

)1

221

BA

Boe

ing

Com

pany

NY

SE

105

2 (39

)1

730 (

49)

230

6 (52

)2

961 (

48)

303

7 (47

)1

587 (

30)

249

2 (46

)3

552 (

43)

647

5 (42

)8

022 (

39)

122

1C

Citi

grou

pN

YS

E2

597 (

44)

312

9 (51

)3

997 (

49)

442

4 (46

)4

803 (

47)

307

6 (27

)3

994 (

42)

503

9 (40

)7

993 (

37)

931

6 (37

)1

221

CAT

Cat

erpi

llar

Inc

NY

SE

747 (

40)

126

0 (50

)1

673 (

53)

241

4 (54

)3

154 (

56)

103

9 (33

)2

002 (

43)

287

9 (40

)5

852 (

46)

818

5 (51

)1

221

DD

Du

Pon

tDe

Nem

ours

NY

SE

125

7 (48

)1

853 (

54)

210

0 (53

)2

963 (

51)

307

7 (49

)1

858 (

30)

291

5 (43

)3

418 (

39)

666

3 (41

)8

069 (

38)

122

0D

ISW

altD

isne

yN

YS

E1

304 (

39)

201

3 (53

)2

882 (

54)

344

8 (48

)3

564 (

46)

152

5 (25

)2

893 (

43)

475

0 (36

)7

768 (

33)

848

0 (34

)1

221

EK

Eas

tman

Kod

akN

YS

E77

5 (42

)1

128 (

50)

139

2 (45

)2

008 (

45)

194

1 (44

)1

181 (

35)

194

1 (44

)2

322 (

37)

491

0 (35

)5

715 (

31)

122

0G

EG

ener

alE

lect

ricN

YS

E3

191 (

41)

315

7 (52

)4

712 (

54)

482

8 (48

)4

771 (

44)

288

8 (34

)3

646 (

48)

555

9 (42

)8

369 (

31)

100

51(2

4)1

221

GM

Gen

eral

Mot

ors

NY

SE

988 (

37)

135

7 (50

)2

448 (

49)

287

4 (49

)2

855 (

47)

157

2 (35

)2

009 (

44)

424

6 (37

)6

262 (

35)

688

9 (34

)1

221

HD

Hom

eD

epot

Inc

NY

SE

196

1 (42

)2

648 (

51)

353

3 (52

)3

843 (

49)

364

2 (46

)2

161 (

31)

294

6 (51

)4

181 (

42)

724

6 (36

)8

005 (

31)

122

1H

ON

Hon

eyw

ell

NY

SE

112

2 (38

)1

453 (

52)

187

2 (47

)2

482 (

47)

266

8 (47

)1

378 (

33)

236

4 (47

)3

120 (

40)

538

5 (40

)6

542 (

39)

122

1H

PQ

Hew

lettndash

Pac

kard

NY

SE

190

0 (46

)2

321 (

51)

254

3 (46

)3

263 (

43)

370

8 (43

)2

394 (

45)

317

0 (43

)3

226 (

36)

653

6 (31

)8

700 (

27)

121

8IB

MIn

tB

usin

ess

Mac

hine

sN

YS

E2

322 (

55)

363

7 (63

)3

492 (

61)

411

1 (60

)4

553 (

55)

331

9 (53

)4

729 (

55)

668

8 (37

)7

790 (

49)

944

7 (51

)1

220

INT

CIn

tel

Cor

pN

AS

D14

982

(73)

156

37(8

3)15

194

(80)

109

18(7

1)9

078 (

67)

105

99(1

7)12

512

(32)

176

22(4

5)19

554

(39)

198

28(4

2)1

230

IPIn

tern

atio

nalP

aper

NY

SE

100

2 (40

)1

509 (

50)

197

1 (50

)2

569 (

47)

228

3 (44

)1

368 (

31)

234

6 (41

)3

134 (

37)

599

7 (40

)6

417 (

34)

122

1JN

JJo

hnso

namp

John

son

NY

SE

149

2 (49

)2

059 (

57)

254

1 (60

)3

272 (

53)

385

3 (48

)1

739 (

36)

232

2 (47

)3

091 (

42)

652

9 (39

)8

687 (

36)

122

1JP

MJ

PM

orga

nN

YS

E1

317 (

52)

255

5 (51

)2

973 (

52)

383

6 (49

)3

939 (

46)

183

9 (52

)3

384 (

46)

407

9 (39

)7

387 (

36)

880

8 (30

)1

221

KO

Coc

a-C

ola

NY

SE

137

6 (45

)1

662 (

51)

234

9 (52

)2

854 (

50)

332

0 (48

)1

635 (

32)

206

3 (48

)3

221 (

42)

624

0 (39

)7

498 (

35)

122

1M

CD

McD

onal

dsN

YS

E1

109 (

39)

177

9 (50

)2

226 (

49)

263

8 (45

)2

790 (

43)

157

8 (21

)2

334 (

42)

306

5 (37

)5

763 (

33)

733

1 (31

)1

221

MM

MM

inne

sota

Mng

Mfg

N

YS

E86

8 (44

)1

563 (

56)

205

4 (57

)3

092 (

58)

337

0 (48

)1

157 (

45)

246

2 (50

)3

253 (

48)

710

0 (52

)8

228 (

46)

122

0M

OP

hilip

Mor

risN

YS

E1

283 (

38)

194

2 (53

)3

047 (

54)

347

3 (50

)3

404 (

48)

236

5 (13

)3

266 (

34)

417

4 (40

)6

776 (

38)

772

1 (38

)1

220

MR

KM

erck

NY

SE

183

2 (41

)1

943 (

51)

257

4 (52

)3

639 (

50)

366

3 (45

)2

021 (

35)

238

1 (48

)3

262 (

41)

692

7 (43

)7

658 (

34)

122

1M

SF

TM

icro

soft

NA

SD

139

00(6

8)14

479

(86)

152

57(8

5)12

013

(73)

864

3 (66

)8

275 (

14)

133

41(4

0)18

234

(66)

198

33(4

5)19

669

(41)

122

9P

GP

roct

eramp

Gam

ble

NY

SE

151

8 (44

)1

881 (

54)

263

1 (52

)3

314 (

52)

366

8 (50

)2

584 (

27)

313

7 (43

)4

555 (

42)

753

5 (47

)8

397 (

45)

122

1S

BC

Sbc

Com

mun

icat

ions

NY

SE

160

3 (37

)2

267 (

52)

303

8 (52

)3

060 (

46)

336

4 (43

)2

042 (

24)

279

1 (48

)3

909 (

39)

647

8 (31

)8

346 (

26)

122

0T

ATamp

TC

orp

NY

SE

223

3 (29

)1

851 (

40)

222

4 (43

)2

353 (

44)

241

2 (39

)1

657 (

26)

223

8 (40

)2

931 (

32)

530

7 (30

)7

091 (

20)

122

1U

TX

Uni

ted

Tech

nolo

gies

NY

SE

767 (

41)

135

4 (51

)1

986 (

53)

275

8 (55

)3

107 (

55)

111

1 (46

)2

183 (

46)

307

5 (45

)6

657 (

49)

799

6 (53

)1

220

WM

TW

al-M

artS

tore

sN

YS

E1

946 (

43)

236

8 (54

)2

847 (

58)

286

0 (51

)4

394 (

50)

260

9 (27

)2

675 (

47)

397

1 (44

)5

610 (

39)

874

4 (40

)1

221

XO

ME

xxon

Mob

ilN

YS

E1

720 (

41)

222

7 (53

)3

467 (

54)

419

8 (48

)4

489 (

47)

197

9 (33

)2

712 (

43)

467

7 (37

)8

340 (

32)

995

0 (29

)1

219

NO

TE

S

ymbo

lsan

dna

mes

ofth

eeq

uitie

sus

edin

our

empi

rical

anal

ysis

The

third

colu

mn

isth

eex

chan

geth

atw

eex

trac

ted

data

from

and

the

follo

win

gco

lum

nsgi

veth

eav

erag

enu

mbe

rof

tran

sact

ions

quo

tatio

nspe

rda

yfo

rea

chof

the

five

year

sin

our

sam

ple

The

perc

enta

geof

obse

rvat

ions

for

whi

chth

etr

ansa

ctio

npr

ice

oron

eof

the

quot

edpr

ices

was

diffe

rent

from

the

prev

ious

one

isgi

ven

inpa

rent

hese

sT

hela

stco

lum

nis

the

tota

lnum

ber

ofda

tain

our

sam

ple

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 17: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 143

results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request

51 Empirical Implementation of Estimators

In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)

γh equiv m

mminus h

mminushsumi=1

yimyi+hm

for the theoretical quantity

γh equivmsum

i=1

yimyi+hm

because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)

AC1=summ

i=1 y2im + 2 m

mminus1

summminus1i=1 yimyi+1m

52 Estimation of Market MicrostructureNoise Parameters

Under the independent noise assumption used in Section 3we have from Lemma 2 that

ω2 equiv RV(m)

2m

prarr ω2 as m rarrinfin

This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2

t to determine the optimal sampling fre-quency mlowast

0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator

Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2

will overestimate ω2 Thus better estimators are given by

ω2 equiv RV(m) minus RV(13)

2(mminus 13)

and

ω2 equiv RV(m) minus IV

2m

where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need

not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators

Table 2 presents annual sample averages of ω2 ω2 and ω2

for the 5 years of our sample Here we use RV(1 tick)AC1

as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)

Fact III The noise is smaller than one might think

Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption

The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV

Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)

are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7

Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ

λequiv ω2IV

where ω2 equiv nminus1 sumnt=1 ω2

t and IV equiv nminus1 sumnt=1 RV(1 tick)

AC1t The

latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ

whenever a law of large numbers applies to ω2t and RV(1 tick)

AC1t

t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio

Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast

0) minus

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 18: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

144 Journal of Business amp Economic Statistics April 2006

Tabl

e2

Est

imat

esof

the

Noi

seV

aria

nce

for

Tran

sact

ion

Dat

a(a

nnua

lave

rage

s)

2000

2001

2002

2003

2004

Ass

etω

2

AA

178

69

951

040

447

098

117

409

150

100

201

040

055

090

011

024

AX

P6

181

892

612

710

540

562

570

750

600

840

230

230

330

060

10B

A1

174

610

681

292

026

045

324

118

069

162

070

044

060

010

014

C6

334

074

382

220

850

282

560

350

300

620

160

140

340

160

13C

AT1

672

724

834

263

minus03

20

282

07minus

025

029

080

minus00

40

080

410

010

03D

D1

130

662

676

231

050

058

228

068

048

087

035

024

053

020

016

DIS

163

81

179

120

55

632

731

945

393

442

582

311

511

131

190

800

62E

K9

122

654

133

82minus

084

020

361

minus03

10

371

640

100

381

24minus

003

028

GE

541

390

361

196

062

031

186

084

050

089

052

049

051

031

033

GM

639

018

225

222

minus01

40

361

78minus

001

043

092

013

025

052

001

014

HD

971

592

613

256

058

054

245

091

083

114

050

053

058

017

024

HO

N1

374

458

599

537

089

090

451

079

080

217

088

058

098

030

024

HP

Q8

212

322

467

163

201

937

113

502

542

481

081

011

420

890

78IB

M3

421

341

491

120

310

211

180

290

200

400

130

090

240

110

06IN

TC

232

186

208

209

171

186

077

042

054

055

034

039

046

030

032

IP2

015

104

91

156

431

167

092

234

068

051

117

040

026

067

007

016

JNJ

373

196

212

132

048

049

161

066

065

067

018

021

029

008

011

JPM

426

minus00

40

213

681

870

975

231

941

281

250

560

520

440

160

19K

O7

764

354

981

880

350

351

460

460

330

730

260

190

440

200

16M

CD

196

61

517

146

63

361

721

173

511

741

262

461

201

150

990

440

46M

MM

492

minus03

30

711

90minus

007

minus00

21

190

140

100

320

040

030

300

010

02M

O2

891

237

62

487

167

028

044

130

060

038

097

028

025

043

002

011

MR

K4

992

422

881

590

190

151

570

290

230

560

090

110

500

130

19M

SF

T2

251

942

060

730

520

600

250

070

130

360

250

270

370

290

29P

G6

303

283

151

850

720

450

850

150

120

310

090

040

270

070

06S

BC

992

596

662

199

031

049

361

131

087

176

064

061

094

052

052

T2

135

170

41

756

467

157

138

544

198

222

227

058

086

203

103

111

UT

X9

77minus

049

120

246

minus04

4minus

006

189

minus00

30

170

640

050

060

380

080

05W

MT

892

462

545

184

029

030

131

025

025

054

002

009

032

008

012

XO

M3

481

482

051

430

460

311

520

840

370

630

370

250

310

120

15

NO

TE

A

nnua

lave

rage

sof

estim

ates

ofω

2fo

rtr

ansa

ctio

nda

taus

ing

the

thre

edi

ffere

ntes

timat

ors

ω2equiv

RV

(m)

(2m

2equiv

(RV

(m)minus

RV

(13)

)(2

(mminus

13))

and

ω2equiv

(RV

(m)minus

IV)

(2m

)A

llnu

mbe

rsha

vebe

enm

ultip

lied

by10

0

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 19: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 145

Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 20: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

146 Journal of Business amp Economic Statistics April 2006

Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 21: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 147

Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000

Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast

0 mlowast1 ∆RMSE ω2 middot 100

AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259

NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)

AC1based on the listed value of λ The corresponding reduction in the RMSE

100[r 0(λmlowast0 ) minus r 1(λmlowast

1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using

mid-quotes where negative values are indicative of negative correlation between noise and efficient returns

r1(λmlowast1)]r0(λmlowast

0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2

AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast

0 = 44 and mlowast1 = 511 For a typical trading day spanning

65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast

0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-

duction of the RMSE (from using RV(mlowast

1)

AC1rather than RV(mlowast

0))to be 33

The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast

0 and mlowast1

will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)

AC1is relatively insensitive

to small deviations from mlowast1 such that a reasonable estimate

for λ does produce a more accurate estimator based on RVAC1

than one based on RVFor the quotation data almost all of our estimates of

ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)

AC1 This is obviously in

conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)

AC1] be positive

The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2

(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption

The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes

The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 22: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

148 Journal of Business amp Economic Statistics April 2006

in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)

AC1is

often very different from RV(1 tick)AC30

For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes

Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact

Fact IV The properties of the noise have changed over time

Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)

6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS

Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice

One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis

The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study

the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)

We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price

We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by

pti = transaction price at time ti

prevailing ask price at time tiprevailing bid price at time ti

where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can

be approximated by the vector autoregressive error correctionmodel

pti = αβ primeptiminus1 +kminus1sumj=1

jptiminusj +micro+ εti (7)

where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors

β = (β1β2)=

1 0minus 1

2 1

minus 12 minus1

(8)

The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime

1pti isthe difference between the transaction price and the mid-quoteand β prime

2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which

are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations

βperp = ι and αprimeperpι= 1

where ιequiv (111)prime

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 23: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 149

It follows from the Granger representation theorem that

pt = βperp(αprimeperpβperp)minus1αprimeperptsum

s=1

εs +C(L)(εt +micro)+A0

where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis

61 Common Stochastic Trend

There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is

plowastti equiv (αprimeperpβperp)minus1isum

j=1

αprimeperpεtj + initial value

This definition has the desired martingale property because thecorresponding intraday returns

ylowastti equivαprimeperpεti

αprimeperpβperp

are uncorrelated Thus the Granger representation can be ex-pressed as

pti =

plowasttiplowasttiplowastti

+C(L)εti +A0

This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)

It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby

RVplowast equivmsum

i=1

(ylowastti)2 =

summi=1(α

primeperpεti)2

(αprimeperpβperp)2

Naturally the parameters need to be estimated in practice sowe rely on

plowastti = (αprimeperpβperp)minus1

isumj=1

αprimeperpεtj

where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define

RVplowast equivmsum

i=1

(ylowasttj

)2 =summ

i=1(αprimeperpεti)

2

(αprimeperpβperp)2

62 Empirical Results

We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C

Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation

A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation

αprimeperp = (αperptr αperpask αperpbid)

Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-

parison we also report the average RVquantities RV(1 tick)

ACNW15

RV(1 tick)

ACNW30 and RV

(13) To get a sense of the dispersion of the

daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages

It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1

2 14 1

4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1

4 38 3

8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE

The estimated model allows us to compute the daily esti-mates

msumi=1

e2ti and

msumi=1

ylowastti eti

where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 24: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

150 Journal of Business amp Economic Statistics April 2006

Table 4 Cointegration Results for the Year 2004

Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15

RV (1 tick)ACNW 30

RV (13)

AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]

AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]

BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]

C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]

CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]

DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]

DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]

EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]

GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]

GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]

HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]

HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]

HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]

IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]

INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]

IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]

JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]

JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]

KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]

MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]

MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]

MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]

MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]

MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]

PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]

SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]

T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]

UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]

WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]

XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]

NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)

ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)

ACNWqequiv

1nsumn

t = 1 (RV (1 tick) trACNWqt

+RV (1 tick) bidACNWqt

+RV (1 tick) askACNWqt

)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics

not affect eti ) Table 5 reports daily averages for the differentprice series

Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so

there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices

63 Impulse Response Function

Define the two matrices

= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 25: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 151

Tabl

e5

Var

iatio

nsan

dC

ovar

iatio

nfo

rN

oise

and

Effi

cien

tPric

efo

rth

eYe

ar20

04

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

AA

230

79

147

89

173

minus30

minus75

minus81

minus86

[10

04

61]

[39

158

][6

62

92]

[31

210

][7

73

41]

[minus1

01

13]

[minus2

13minus

05]

[minus2

05minus

09]

[minus2

30minus

12]

AX

P6

73

14

12

04

6minus

08minus

16minus

17minus

19[2

91

33]

[17

54]

[19

76]

[07

41]

[21

89]

[minus2

80

4][minus

43

00]

[minus4

5minus

01]

[minus5

0minus

01]

BA

146

63

92

46

110

minus17

minus35

minus40

minus45

[63

284

][2

81

19]

[42

180

][1

89

6][5

12

26]

[minus7

00

9][minus

93

minus03

][minus

100

minus

08]

[minus1

20minus

05]

C1

014

97

33

57

80

4minus

20minus

22minus

23[4

32

06]

[26

72]

[33

133

][1

27

4][3

71

57]

[minus1

91

7][minus

57

minus01

][minus

62

minus03

][minus

66

minus02

]C

AT1

616

39

44

51

12minus

36minus

37minus

42minus

46[7

23

08]

[27

116

][3

51

92]

[15

84]

[45

218

][minus

88

minus07

][minus

86

minus03

][minus

93

minus08

][minus

104

minus

05]

DD

112

57

81

34

87

minus04

minus22

minus22

minus22

[52

206

][3

49

2][3

71

54]

[14

74]

[42

157

][minus

28

11]

[minus6

40

1][minus

63

minus02

][minus

61

minus01

]D

IS1

681

651

576

21

663

5minus

19minus

23minus

26[7

03

07]

[88

270

][6

13

11]

[23

121

][7

03

13]

[07

74]

[minus6

61

0][minus

69

minus01

][minus

82

04]

EK

230

89

128

74

152

minus50

minus67

minus73

minus79

[79

472

][3

81

72]

[51

277

][2

11

80]

[56

342

][minus

166

03

][minus

196

minus

02]

[minus2

05minus

04]

[minus2

18minus

03]

GE

87

87

80

40

83

22

minus24

minus25

minus26

[39

180

][5

21

28]

[33

155

][1

18

3][3

81

52]

[10

38]

[minus6

20

0][minus

66

minus03

][minus

68

minus02

]G

M1

284

57

53

98

1minus

15minus

35minus

37minus

39[5

42

36]

[23

80]

[32

152

][1

39

1][3

81

52]

[minus5

30

8][minus

96

minus04

][minus

94

minus06

][minus

103

minus

07]

HD

137

70

93

48

98

minus04

minus38

minus39

minus40

[60

267

][3

61

28]

[38

178

][1

71

01]

[43

203

][minus

33

15]

[minus9

5minus

03]

[minus9

5minus

05]

[minus1

00minus

02]

HO

N2

038

51

416

71

59minus

17minus

45minus

49minus

53[8

53

94]

[43

143

][6

12

86]

[21

147

][6

93

07]

[minus6

91

9][minus

145

02

][minus

142

minus

04]

[minus1

52

00]

HP

Q2

072

021

697

51

792

7minus

33minus

36minus

40[8

04

07]

[12

82

96]

[79

317

][2

71

42]

[81

310

][minus

03

59]

[minus1

28

08]

[minus1

11

04]

[minus1

26

07]

IBM

90

40

63

26

69

minus03

minus19

minus22

minus25

[36

177

][1

76

7][2

61

23]

[09

54]

[31

129

][minus

23

10]

[minus5

1minus

01]

[minus5

9minus

03]

[minus6

9minus

04]

INT

C2

363

558

26

68

0minus

13minus

83minus

83minus

82[1

07

421

][2

08

524

][3

41

43]

[23

125

][3

41

35]

[minus9

64

5][minus

160

minus

30]

[minus1

59minus

30]

[minus1

60minus

29]

IP1

195

17

93

58

5minus

14minus

26minus

28minus

31[4

62

50]

[21

107

][3

01

65]

[11

70]

[34

170

][minus

47

09]

[minus7

60

2][minus

73

minus01

][minus

77

minus01

]

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 26: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

152 Journal of Business amp Economic Statistics April 2006

Tabl

e5

(con

tinue

d)

RV

plowastsum ie

2 itr

sum ie2 i

ask

sum ie2 i

mid

sum ie2 i

bid

sum ieiylowast i

trsum ie

iylowast i

ask

sum ieiylowast i

mid

sum ieiylowast i

bid

JNJ

81

40

50

27

57

minus06

minus21

minus23

minus24

[33

166

][2

56

3][2

29

8][0

95

3][2

69

9][minus

24

05]

[minus5

5minus

01]

[minus5

7minus

03]

[minus5

6minus

02]

JPM

108

60

74

36

82

0minus

23minus

23minus

24[4

62

43]

[33

98]

[31

164

][1

19

2][3

71

71]

[minus2

41

3][minus

79

01]

[minus7

7minus

01]

[minus8

30

0]K

O9

55

36

32

76

3minus

02minus

20minus

20minus

20[4

21

85]

[31

87]

[32

113

][0

95

3][3

11

11]

[minus2

61

3][minus

59

00]

[minus5

8minus

01]

[minus5

60

0]M

CD

139

94

105

46

113

04

minus26

minus29

minus31

[60

314

][5

41

49]

[46

209

][1

51

23]

[52

220

][minus

30

26]

[minus1

16

05]

[minus1

07

00]

[minus1

07

06]

MM

M1

093

25

62

75

9minus

20minus

28minus

30minus

32[4

32

11]

[14

62]

[23

111

][1

05

9][2

61

14]

[minus5

2minus

01]

[minus7

1minus

05]

[minus7

5minus

08]

[minus7

7minus

07]

MO

148

49

84

48

89

minus23

minus48

minus49

minus49

[34

317

][2

08

1][2

71

90]

[10

116

][2

81

85]

[minus7

30

7][minus

131

minus

01]

[minus1

24minus

02]

[minus1

28

00]

MR

K1

306

19

04

79

7minus

06minus

35minus

37minus

40[4

23

84]

[21

152

][3

02

50]

[12

140

][3

12

36]

[minus4

31

8][minus

136

00

][minus

140

minus

02]

[minus1

42minus

01]

MS

FT

116

279

41

33

41

08

minus37

minus37

minus37

[50

219

][1

97

395

][1

77

7][1

36

3][1

77

9][minus

22

26]

[minus8

1minus

11]

[minus8

2minus

12]

[minus8

4minus

13]

PG

87

32

58

29

67

minus10

minus22

minus24

minus27

[34

164

][1

35

9][2

31

10]

[10

60]

[31

127

][minus

30

04]

[minus5

2minus

02]

[minus5

2minus

04]

[minus6

4minus

04]

SB

C1

361

249

64

31

020

9minus

26minus

27minus

27[5

62

64]

[74

174

][3

61

77]

[10

95]

[41

199

][minus

27

34]

[minus9

60

5][minus

91

00]

[minus9

10

3]T

165

194

129

50

142

10

minus21

minus23

minus25

[62

332

][1

05

314

][5

72

39]

[15

109

][5

82

74]

[minus2

63

8][minus

84

15]

[minus8

40

8][minus

101

13

]U

TX

119

46

76

33

84

minus16

minus24

minus26

minus29

[45

247

][1

79

0][3

11

56]

[11

66]

[35

158

][minus

52

04]

[minus6

8minus

01]

[minus6

5minus

04]

[minus7

7minus

02]

WM

T1

113

77

34

38

1minus

04minus

33minus

36minus

38[5

32

22]

[18

67]

[38

132

][2

08

3][4

11

43]

[minus3

31

0][minus

74

minus08

][minus

81

minus11

][minus

83

minus11

]X

OM

78

45

55

28

58

05

minus20

minus20

minus21

[34

160

][2

66

9][2

21

11]

[08

63]

[23

120

][minus

09

18]

[minus5

4minus

01]

[minus6

0minus

02]

[minus6

2minus

02]

NO

TE

A

nnua

lave

rage

sof

the

real

ized

varia

nce

ofef

ficie

ntpr

ice

and

nois

epr

oces

ses

corr

espo

ndin

gto

diffe

rent

pric

ese

ries

The

effic

ient

pric

ean

dth

eno

ise

proc

esse

sw

ere

dedu

ced

from

daily

estim

ates

ofa

coin

tegr

atio

nve

ctor

auto

regr

essi

onm

odel

T

henu

m-

bers

inth

esq

uare

dbr

acke

tsar

eth

e5

and

95

quan

tiles

for

the

diffe

rent

stat

istic

s

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 27: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 153

Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices

As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation

Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1

(9)

where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 28: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

154 Journal of Business amp Economic Statistics April 2006

Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that

dpti+h

dplowastti= dpti+h

dεtitimes dεti

d(αprimeperpεti)times d(αprimeperpεti)

dplowastti

= (C+Ch)timesαperp1

αprimeperpαperptimes αprimeperpβperp

= δ(C+Ch)αperp = ι+ δChαperp

where

δ equiv αprimeperpβperpαprimeperpαperp

Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)

Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ

7 SUMMARY AND CONCLUDING REMARKS

We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context

More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling

We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of

Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)

AC1yields a more accurate approximation than

that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies

Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices

Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)

AC1to the

standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)

AC1 Among the estimators

that we have analyzed in this article RV(1 tick)ACNW30

appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)

ACNW10

may be a better estimator than RV(1 tick)ACNW30

although the formeris more likely to be biased

Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 29: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 155

Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles

noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-

ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 30: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

156 Journal of Business amp Economic Statistics April 2006

the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators

ACKNOWLEDGMENTS

The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged

APPENDIX A PROOFS

As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations

Proof of Lemma 1

First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that

msumi=1

y2im le

msumi=1

δ2(bminus a)2mminus2 = δ2(bminus a)2

m

which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1

Proof of Theorem 1

From the identities RV(m) = summi=1[ylowast2

im + 2ylowastimeim + e2im]

and E(summ

i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2

im) = 2ρm + mE(e2im) The result of

the theorem now follows from the identity

E(e2im)= E

[u(iδm)minus u((iminus 1)δm)

]2 = 2[π(0)minus π(δm)]given Assumption 2

Proof of Corollary 1

Because m = (bminus a)δm we have that

limmrarrinfinm[π(0)minus π(δm)] = lim

mrarrinfin(bminus a)π(0)minus π(δm)

δm

= minus(bminus a)π prime(0)

under the assumption that π prime(0) is well defined

Proof of Lemma 2

The bias follows directly from the decomposition y2im =

ylowast2im + e2

im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =

E(u2im) + E(u2

iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that

var(RV(m)

)= var

(msum

i=1

ylowast2im

)+ var

(msum

i=1

e2im

)

+ 4 var

(msum

i=1

ylowastimeim

)

because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(

summi=1 ylowast2

im)=summi=1 var(ylowast2

im)=2summ

i=1 σ 4im where the last equality follows from the Gaussian

assumption For the second sum we find that

E(e4im) = E(uim minus uiminus1m)4 = E(u2

im + u2iminus1m minus 2uimuiminus1m)2

= E(u4im + u4

iminus1m + 4u2imu2

iminus1m + 2u2imu2

iminus1m)+ 0

= 2micro4 + 6ω4

and

E(e2ime2

i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2

= E(u2im + u2

iminus1m minus 2uimuiminus1m)

times (u2i+1m + u2

im minus 2ui+1muim)

= E(u2im + u2

iminus1m)(u2i+1m + u2

im)+ 0

= micro4 + 3ω4

where we have used Assumption 3(a)ndash(c) Thus var(e2im) =

2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2

im e2i+1m) =

micro4 minus ω4 Because cov(e2im e2

i+hm) = 0 for |h| ge 2 it followsthat

var

(msum

i=1

e2im

)=

msumi=1

var(e2im)+

msumij=1i =j

cov(e2im e2

jm)

= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)

= 4mmicro4 minus 2(micro4 minusω4)

The last sum involves uncorrelated terms such that

var

(msum

i=1

eimylowastim

)=

msumi=1

var(eimylowastim)= 2ω2msum

i=1

σ 2im

By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)

AC1in our proof of Lemma 3 and that 2

summi=1 σ 4

im +2ω2 summ

i=1 σ 2im minus 4ω4 = O(1)

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 31: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 157

Proof of Lemma 3

First note that RV(m)AC1

= summi=1 Yim + Uim + Vim + Wim

where

Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)

Vim equiv ylowastim(ui+1m minus uiminus2m)

and

Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)

because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)

AC1are given from those

of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2

im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)

AC1] =summ

i=1 σ 2im Note that

E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)

AC1is given

by

var[RV(m)

AC1

] = var

[msum

i=1

Yim +Uim + Vim +Wim

]

= (1)+ (2)+ (3)+ (4)+ (5)

Because all other sums are uncorrelated the five parts aregiven by (1) = var(

summi=1 Yim) (2) = var(

summi=1 Uim) (3) =

var(summ

i=1 Vim) (4) = var(summ

i=1 Wim) and (5) = 2 timescov(

summi=1 Vim

summi=1 Wim) We derive the expressions of these

five terms as follow

1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-

tion 1 it follows that E[ylowast2imylowast2

jm] = σ 2imσ 2

jm for i = j and

E[ylowast2imylowast2

jm] = E[ylowast4im] = 3σ 4

im for i = j such that

var(Yim) = 3σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m minus [σ 2

im]2

= 2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m

The first-order autocorrelation of Yim is

E[YimYi+1m]= E

[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)

times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]

= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0

= 2E[ylowast2imylowast2

i+1m] = 2σ 2imσ 2

i+1m

such that cov(YimYi+1m) = σ 2imσ 2

i+1m whereas cov(Yim

Yi+hm)= 0 for |h| ge 2 Thus

(1) =msum

i=1

(2σ 4im + σ 2

imσ 2iminus1m + σ 2

imσ 2i+1m)

+msum

i=2

σ 2imσ 2

iminus1m +mminus1sumi=1

σ 2imσ 2

i+1m

= 2msum

i=1

σ 4im + 2

msumi=1

σ 2imσ 2

iminus1m + 2msum

i=1

σ 2imσ 2

i+1m

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m

= 6msum

i=1

σ 4im minus 2

msumi=1

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2msum

i=1

σ 2im(σ 2

i+1m minus σ 2im)minus σ 2

1mσ 20m minus σ 2

mmσ 2m+1m

= 6msum

i=1

σ 4im minus 2

msumi=2

σ 2im(σ 2

im minus σ 2iminus1m)

+ 2mminus1sumi=1

σ 2im(σ 2

i+1m minus σ 2im)

minus σ 21mσ 2

0m minus σ 2mmσ 2

m+1m minus 2σ 21m(σ 2

1m minus σ 20m)

+ 2σ 2mm(σ 2

m+1m minus σ 2mm)

= 6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2

im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by

E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+1m minus uim)(ui+2m minus uiminus1m)]

= E[uiminus1mui+1mui+1muiminus1m] + 0

= ω4

and

E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)

times (ui+2m minus ui+1m)(ui+3m minus uim)]

= E[uimui+1mui+1muim] + 0

= ω4

whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4

3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2

im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(

summi=1 Vim)= 2ω2 summ

i=1 σ 2im

4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that

E(W2im)= 2ω2(σ 2

iminus1m +σ 2im +σ 2

i+1m) The first-order autoco-variance equals

cov(WimWi+1m) = E[minusu2im(ylowast2

im + ylowast2i+1m)]

= minusω2(σ 2im + σ 2

i+1m)

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 32: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

158 Journal of Business amp Economic Statistics April 2006

whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus

(4) =msum

i=1

[2ω2(σ 2

iminus1m + σ 2im + σ 2

i+1m)

minusmsum

i=2

ω2(σ 2im + σ 2

iminus1m)minusmminus1sumi=1

ω2(σ 2im + σ 2

i+1m)

]

= ω2msum

i=1

(σ 2iminus1m + σ 2

i+1m)

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im +ω2[σ 2

0m minus σ 2mm + σ 2

m+1m minus σ 21m]

+ω2[σ 21m + σ 2

0m + σ 2mm + σ 2

m+1m]

= 2ω2msum

i=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

5 The autocovariances between the last two terms are givenby

E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)

times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]

showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-

variances are 0 From this we conclude that

(5) = 2

[2

msumi=1

ω2σ 2im minusω2(σ 2

1m + σ 2mm)

]

= 4ω2msum

i=1

σ 2im minus 2ω2(σ 2

1m + σ 2mm)

Adding up the five terms we find that

6msum

i=1

σ 4im minus 2

mminus1sumi=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m + 8ω4mminus 6ω4

+ 2ω2msum

i=1

σ 2im + 2ω2

msumi=1

σ 2im + 2ω2[σ 2

0m + σ 2m+1m]

+ 4ω2msum

i=1

σ 2im minus 2ω2[σ 2

1m + σ 2mm]

= 8ω4m+ 8ω2msum

i=1

σ 2im minus 6ω4 + 6

msumi=1

σ 4im + Rm

where

Rm equiv minus2msum

i=1

(σ 2i+1m minus σ 2

im)2 minus 2(σ 41m + σ 4

mm)

+ σ 21mσ 2

0m + σ 2mmσ 2

m+1m

+ 2ω2(σ 20m minus σ 2

1m + σ 2m+1m minus σ 2

mm)

Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2

im| = | int timtiminus1m

σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand

|σ 2im minus σ 2

iminus1m| =∣∣∣∣int tim

timinus1m

σ 2(s)minus σ 2(sminus δ)ds

∣∣∣∣

leint tim

timinus1m

|σ 2(s)minus σ 2(sminus δ)|ds

le δ suptiminus1mlesletim

|σ 2(s)minus σ 2(sminus δ)|

le δ2ε = O(mminus2)

Thussumm

i=1(σ2i+1m minus σ 2

im)2 le m middot (δ εm )2 = O(mminus3) which

proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)

AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim

uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2

im + ξ(1)iminus1m + ξ

(2)im + ξ

(3)i+1m where

ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m

ξ(2)im equiv ylowastimylowastim minus σ 2

im + ylowastimylowastiminus1m minus ylowastimuiminus2m

+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim

and

ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m

+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m

Thus if we define ξim equiv (ξ(1)im + ξ

(2)im + ξ

(3)im) (and use the con-

ventions ξ(2)0m = ξ

(3)0m = ξ

(3)1m = ξ

(1)mm = ξ

(1)m+1m = ξ

(2)m+1m = 0)

then it follows that [RV(m)AC1

minus IV] = summ+1i=0 ξim where ξim

Fim m+1i=0 is a martingale difference sequence that is squared-

integrable because

E(ξ2im)=

ω2σ 20 +ω4 ltinfin for i = 0

2ω4 + σ 20 σ 2

1 +ω2σ 20 + 2ω2σ 2

1 + σ 41 ltinfin

for i = 1

2σ 4im + 4σ 2

imσ 2iminus1m + 4σ 2

imω2

+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m

σ 4m + 4σ 2

mσ 2mminus1 + 6ω2σ 2

m + 4ω2σ 2mminus1 + 5ω4 ltinfin

for i = m

σ 2mσ 2

m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin

for i = m+ 1

Because mminus12[RV(m)AC1

minus IV] = mminus12 summ+1i=0 ξim we can ap-

ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 33: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 159

dition

m+1sumi=0

E[mminus1ξ2

im1|mminus12ξim|gtε∣∣Fiminus1m

] prarr 0 as m rarrinfin

Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2

im] lt infinand supi P(1|ξim|gtε

radicm = 0) rarr 0 for all ε gt 0 it follows

that

E

∣∣∣∣∣mminus1m+1sumi=0

ξ2im1|mminus12ξim|gtε minus 0

∣∣∣∣∣

le mminus1m+1sumi=0

∣∣E[ξ2

im1|mminus12ξim|gtε]minus 0

∣∣

rarr 0 as m rarrinfin

The Lindeberg condition now follows because convergencein L1 implies convergence in probability

Proof of Corollary 2

The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by

Rm = 0minus 2

(IV2

m2+ IV2

m2

)+ IV2

m2+ IV2

m2+ 0 =minus2IV2

m2

Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)

AC1)partm prop 4λ2 minus

3mminus2 + 2mminus3

Proof of Theorem 2

Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that

E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)

= [π(hδm)minus π((h+ 1)δm)

]minus [

π((hminus 1)δm)minus π(hδm)]

such thatqmsum

h=1

E(eimei+hm)

= [π(qmδm)minus π((qm + 1)δm)

]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore

0 = E[ylowastim

(ui+qmm minus uiminusqmminus1m

)]= E

[ylowastim

(ei+qmm + middot middot middot + eiminusqmm

)]and similarly

0 = E[eim

(ylowasti+qmm + middot middot middot + ylowastiminusqmm

)]

which implies that

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[ylowastimei+hm + ylowastimeiminushm]

and

ρm =msum

i=1

E[ylowastimeim] = minusmsum

i=1

qmsumh=1

E[eimylowasti+hm + eimylowastiminushm]

For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that

E

[ qmsumh=1

msumi=1

yimyiminushm + yimyi+hm

]

=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)

ACqmis unbiased

APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS

In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance

σ 2 equiv nminus1nsum

t=1

IVt

which may be estimated by

σ 2 = nminus1nsum

t=1

IVt

where we use RV(1 tick)ACNW30t

equiv (RV(1 tick) trACNW30t

+ RV(1 tick) bidACNW30t

+RV(1 tick) ask

ACNW30t)3 as our choice for IVt because it is likely to

be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)

Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ

where ξ = nminus1 sumnt=1 log IVt σ 2

ξequiv var(ξ ) and c1minusα2 is the ap-

propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that

P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P

[exp(l+ω2)le σ 2 le exp(u+ω2)

]

such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 34: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

160 Journal of Business amp Economic Statistics April 2006

and use the NeweyndashWest estimator

ω2 equiv 1

nminus 1

nsumt=1

η2t + 2

qsumh=1

(1minus h

q+ 1

)1

nminus h

nminushsumt=1

ηtηt+h

where q = int[4(n100)29]

and subsequently set σ 2ξequiv ω2n An approximate confidence

interval for σ 2 is now given by

CI(σ 2)equiv exp(log(σ 2

)plusmn σξc1minusα2

)

which we recenter about log(σ 2) (rather than ξ+ ω2) because

this ensures that the sample average of IVt is inside the confi-

dence interval σ 2 isin CI(σ 2)

APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION

To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)

prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ

Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated

1 Regress pti on (pprimetiminus1β ιpprimetiminus1

pprimetiminusk+1)prime by least

squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-

ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)

3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1

pprimetiminusk+1)prime by

least squares and obtain (α 1 kminus1)

Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times

(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times

αprimeι] = 1ς[αprime

ιminus αprimeι] = 0 and ιprimeαperp = 1

ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1

In the impulse response analysis we use the estimator equiv1m

summi=1(εti minus ε)(εti minus ε)prime

[Received February 2004 Revised September 2005]

REFERENCES

Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416

(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research

Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553

Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158

(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905

Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania

Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76

Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179

(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-

rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation

of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators

Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376

Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858

Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter

Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness

Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321

Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University

Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business

Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen

Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235

Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280

(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265

(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48

(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30

(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252

(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press

Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-

kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-

namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics

Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum

Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123

Page 35: Realized Variance and Market Microstructure Noiseecon.duke.edu/~get/browse/courses/201/spr10/DOWNLOADS/... · Realized Variance and Market Microstructure Noise Peter R. H ANSEN

Hansen and Lunde Realized Variance and Market Microstructure Noise 161

Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland

Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress

de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327

Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90

(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605

Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics

Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business

Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement

French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29

Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics

Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95

Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100

Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal

Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36

Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38

Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press

Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003

(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889

(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544

(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121

Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861

Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579

Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308

Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306

(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415

Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198

(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339

(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business

Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499

Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris

Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307

Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946

Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254

(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press

Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714

Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475

Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege

Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276

Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681

Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508

Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361

Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group

Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208

Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming

Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708

OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-

rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School

(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577

(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237

Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara

Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics

Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag

Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139

Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-

ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial

Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-

change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-

sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy

Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics

Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411

Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52

(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123