
MIDAS Predicting Volatility at Different Frequencies

Wensi Shi Supervisor: Lars Forsberg

Uppsala University

2010‐5‐18

Abstract

Keywords: realized volatility, MIDAS regression, realized power, absolute return, intra-day data

I compare various MIDAS (mixed data sampling) regression models for predicting volatility at horizons from one week to one month, using different regressors computed from the Shanghai Composite Index. The regressors are of two main types: realized power (built from 5-min absolute returns) and quadratic variation (built from squared returns). Realized power performs best at all forecast horizons. I also compare the effect of the number of lags in the regression, from 1 to 200; the fit changes little beyond 50 lags. At the 3-week and 1-month horizons, the fit as a function of the number of lags shows a wave-like pattern for all regressors, which suggests a seasonal effect in the lagged variables with the same period as the forecast horizon. Finally, the out-of-sample and in-sample results for RV and RAV are quite similar, although in some cases the out-of-sample fit is better.


1 Introduction

The study of volatility forecasting started with Engle's (1982) ARCH class of models, which successfully captures the return variance with a simple parametric model. The ARCH/GARCH models of Engle and Bollerslev forecast future variance from past squared returns. Other researchers have looked for variables other than squared returns that are related to future volatility and useful for forecasting.

Ding et al. (1993) suggested that absolute returns might capture the low-frequency components of volatility better than squared returns. Daily ranges are also good predictors according to Alizadeh et al. (2002) and Gallant et al. (1999). Andersen and Bollerslev (1998) focus on data-driven models of realized volatility computed from high-frequency intra-day data. The mixed data sampling (MIDAS) approach introduced by Ghysels, Santa-Clara and Valkanov (2002a,b) provides a way to identify the best predictor among these variables at different frequencies and forecast horizons.

In general, the MIDAS model is a robust, simple and parsimonious framework for forecasting future realized volatility at different horizons from data sampled at different frequencies. My work uses intra-day data to predict volatility at daily, weekly and monthly horizons, because these are the horizons used most often in option pricing, portfolio management and hedging applications. Using the Chinese stock market, I study how 5-min data perform in MIDAS regressions when volatility measures computed from intra-day data are used as predictors. I compare the results for two regressors, RV (realized volatility) and RAV (realized absolute volatility), both computed from high-frequency returns as daily measures. According to Ghysels and Santa-Clara (2003), realized power dominates the other variables, and according to Barndorff-Nielsen and Shephard (2001), absolute volatility also predicts well.

To fix the notation, I write the daily return between time $t-1$ and time $t$ as

$$r_{t,t-1} = \log(P_t) - \log(P_{t-1}),$$

where $P_t$ is the price at time $t$. At a higher frequency, with $m$ observations per day (for example 5-min intra-day data), the interval return is

$$r_{t,t-1/m} = \log(P_t) - \log(P_{t-1/m}).$$

For a fixed day $t$, the sequence of $m$ intra-day returns is

$$r_j(t) = \log(P_{j\delta}) - \log(P_{(j-1)\delta}), \quad j = 1, 2, \ldots, m,$$

where $\delta$ is the length of one intra-day interval. RV (realized volatility) on day $t$ is defined as

$$Q^{(m)}_{t,t-1} = \sum_{j=1}^{m} r_j(t)^2,$$

and RAV (realized absolute variation) as

$$P^{(m)}_{t,t-1} = \sum_{j=1}^{m} |r_j(t)|.$$

The notation for RV and RAV was proposed by Barndorff-Nielsen and Shephard (2001).
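As a minimal sketch of these two definitions, the following R snippet computes daily RV and RAV from a vector of intra-day log returns. The function name `realized_measures` and the toy return vector are hypothetical and only serve as illustration; the sketch assumes that every day contributes exactly `m` returns in time order.

```r
# Hypothetical helper (not part of the thesis code): daily RV and RAV
# from a vector of intra-day log returns with m returns per day.
realized_measures <- function(r, m = 48) {
  stopifnot(length(r) %% m == 0)
  # one column per trading day, one row per intra-day interval
  r_mat <- matrix(r, nrow = m)
  list(
    rv  = colSums(r_mat^2),     # Q: sum of squared intra-day returns
    rav = colSums(abs(r_mat))   # P: sum of absolute intra-day returns
  )
}

# Toy example: 2 days with m = 4 intervals each
r_toy <- c(0.01, -0.02, 0.00, 0.01, -0.03, 0.01, 0.02, 0.00)
out <- realized_measures(r_toy, m = 4)
out$rv    # 0.0006 0.0014
out$rav   # 0.04 0.06
```

Because RV squares each return while RAV only takes its absolute value, a single large return moves RV much more than RAV.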

I am interested in using intra-day high-frequency data to predict daily volatility. The whole study is based on the Shanghai Composite Index from 26 July 1999 to 26 April 2010, more than 10 years of data. The Shanghai Composite Index includes all the shares (A shares and B shares) traded on the Shanghai Stock Exchange. The base day is 19 December 1990, the base value is the total market capitalization of the constituents on that day, and the index was launched on 15 July 1991 (see footnote 1).

I compare different regressors (absolute and squared returns, in the form of intra-day returns and of daily volatility) at different forecast horizons, from one day to one month. The regressions show that the absolute return outperforms the others, doing even better than the daily realized power computed from absolute returns. Although the estimated weights show no particularly important lags with a large weight relative to the others, the MIDAS regressions are reasonable and fit well when the absolute return is used. Section 2 introduces the MIDAS regression models, Section 3 gives brief descriptive statistics for the variables RV and RAV, and Section 4 presents the regression results for the different regressors and frequencies.

2 The MIDAS model

The MIDAS regression model with daily regressors is written as

$$V^{(Hm)}_{t+H,t} = \mu_H + \phi_H \sum_{k=0}^{k^{\max}} b_H(k,\theta)\, X^{(m)}_{t-k,t-k-1} + \varepsilon_{Ht}.$$

In this paper, the index $t$ refers to daily sampling and $H$ is the forecast horizon. For instance, weekly volatility uses $H=5$ and monthly volatility uses $H=20$, because there are 5 trading days in a week and, with 4 weeks per month, 20 trading days in a month. At every horizon, $m$ is the number of intra-day observations per day. In my high-frequency sample there are 48 five-minute records per trading day, from the market opening at 9:30 to the close at 15:00, with a 90-min break at noon. On the left-hand side, $V^{(Hm)}_{t+H,t}$ is the future volatility to be forecast, for example weekly or monthly volatility. I use two measures of daily volatility: realized volatility, based on the quadratic variation of returns, and absolute volatility, based on absolute returns. The superscript $Hm$ in $V^{(Hm)}_{t+H,t}$ indicates that $Hm$ intra-day records are summed to compute the future volatility. The realized volatility is calculated as

$$Q^{(Hm)}_{t+H,t} = \sum_{j=1}^{Hm} \left[ r_{(t+H)-(j-1)/m,\,(t+H)-(j-2)/m} \right]^2,$$

where the returns are computed as $r_{t,t-1/m} = \log(P_t) - \log(P_{t-1/m})$. Because the log transformation puts less weight on extreme observations, it can improve both the in-sample and the out-of-sample forecasts of variance, as reported in previous papers including Andersen et al. (2003).

On the right-hand side, $X^{(m)}_{t-k,t-k-1}$ is a volatility measure, calculated in one of two ways: quadratic or absolute. Note that $X^{(m)}_{t-k,t-k-1}$ and $V^{(Hm)}_{t+H,t}$ can be sampled at different frequencies.

1 A detailed description is given at http://en.wikipedia.org/wiki/SSE_Composite_Index

The regressor $X^{(m)}_{t-k,t-k-1}$ is multiplied by a weight function $b_H(k,\theta)$, which has two properties. First, the weights are normalized to sum to one, which ensures that the scale parameter $\phi_H$ can be estimated. Second, the weights are non-negative, which guarantees a non-negative volatility process. There are other ways to specify $b_H(k,\theta)$, such as the "exponential Almon lag" in Ghysels et al. (2004). In my work, $b_H(k,\theta)$ is based on the Beta function and is parameterized by $k$ and $\theta$. Here $k$ indexes the lags of $X^{(m)}_{t-k,t-k-1}$, and $\theta = [\theta_1; \theta_2]$ holds the two parameters that shape the weights. The function produces one weight for each lag of $X^{(m)}_{t-k,t-k-1}$:

$$b_H(k;\theta) = \frac{f\left(\frac{k}{k^{\max}}, \theta_1; \theta_2\right)}{\sum_{j=1}^{k^{\max}} f\left(\frac{j}{k^{\max}}, \theta_1; \theta_2\right)},$$

where

$$f(z,a,b) = \frac{z^{a-1}(1-z)^{b-1}}{\beta(a,b)}$$

and $\beta(a,b)$ is defined through the Gamma function as

$$\beta(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}.$$

This specification of $b_H(k,\theta)$ was introduced in Ghysels et al. (2002b, 2004) and has several useful characteristics. 1) It provides positive coefficients, so that every forecasting variable $X^{(m)}_{t-k,t-k-1}$ has a positive weight. 2) When $\theta_1 = 1$ and $\theta_2 > 1$, the weights decay slowly, so that more recent volatility values have a heavier weight in the regression. 3) When $\theta_1 = 1$ and $\theta_2 = 1$, it produces equal weights, so that every forecasting variable $X^{(m)}_{t-k,t-k-1}$ has the same weight in the MIDAS regression. Figure 1 shows the shape of the Beta weights for different values of $\theta_2$ with $\theta_1 = 1$ fixed.
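The Beta weight scheme can be sketched in a few lines of R. The function name `beta_lag_weights` is hypothetical and this is not the `beta_weights` function called in the appendix, whose definition is not reproduced in the text. The sketch uses `dbeta` from base R, which evaluates the same Beta density $f(z,a,b)$, and it evaluates the weights on the grid $k/k^{\max}$ for $k = 1, \ldots, k^{\max}$ as in the normalizing sum.

```r
# Hypothetical sketch of the normalized Beta-lag weights b_H(k; theta),
# evaluated for k = 1, ..., k_max.
beta_lag_weights <- function(k_max, theta1 = 1, theta2 = 1) {
  z <- (1:k_max) / k_max
  # dbeta(z, a, b) = z^(a-1) * (1-z)^(b-1) / B(a, b)
  f <- dbeta(z, theta1, theta2)
  f / sum(f)   # normalize so that the weights add up to one
}

w_equal <- beta_lag_weights(5, theta1 = 1, theta2 = 1)  # equal weights
w_decay <- beta_lag_weights(5, theta1 = 1, theta2 = 3)  # declining weights
```

With $\theta_1 = \theta_2 = 1$ every lag receives weight $1/k^{\max}$, and with $\theta_1 = 1$, $\theta_2 > 1$ the weights decline in $k$, which matches characteristics 2) and 3). One caveat of this sketch: for $\theta_2 < 1$ the density is unbounded at $z = 1$, so the last grid point would need separate handling.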


Throughout the paper, I focus on predicting $V^{(Hm)}_{t+H,t}$, the future realized volatility, at horizons from one day ($H=1$) and one week ($H=5$) to one month ($H=20$), because the week-to-month horizons matter most for option pricing and portfolio management. To compare high-frequency data with daily volatility, I construct the regressors in two ways. The first uses the 5-min data directly as regressors, with $k$ indexing the lagged 5-min observations at every forecast horizon. The second first computes daily volatility by summing the 5-min data within each day, and then combines the daily values with the weights to obtain

$$\sum_{k=0}^{k^{\max}} b_H(k,\theta)\, X^{(m)}_{t-k,t-k-1},$$

which is used to predict future volatility at the different horizons. In my work, every predicted quantity $V^{(Hm)}_{t+H,t}$ is the sum of future squared returns, namely $Q^{(Hm)}_{t+H,t}$.

There are two daily regressors. The first is the past sum of squared returns, $Q^{(m)}_{t,t-1}$, which is the usual regressor in autoregressive conditional volatility models and is advocated by Andersen et al. (2001, 2002, 2003). The second is the sum of high-frequency absolute returns, also called "realized absolute power" variation, defined as

$$P^{(m)}_{t,t-1} = \sum_{j=1}^{m} \left| r_{t-(j-1)/m,\,t-(j-2)/m} \right|.$$

Realized power variation was suggested by Barndorff-Nielsen and Shephard (2003b, 2004) and Woerner (2002).
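To make the alignment of the two sides of the regression concrete, here is a minimal R sketch that builds the $H$-day-ahead target from daily RV and the matrix of lagged daily regressors. The function name `build_midas_data` is hypothetical, and the sketch assumes that `k_max` counts the lags starting from the current day $t$, which is a simplification of the $k = 0, \ldots, k^{\max}$ indexing in the text.

```r
# Hypothetical helper: align the H-day-ahead target with k_max daily lags.
# rv: daily realized volatility; x: daily regressor (RV or RAV).
build_midas_data <- function(rv, x, H, k_max) {
  n <- length(rv)
  stopifnot(length(x) == n, n >= k_max + H)
  t_idx <- k_max:(n - H)   # days with a full lag window and a full target
  # target for day t: sum of daily RV over days t+1, ..., t+H
  y <- sapply(t_idx, function(t) sum(rv[(t + 1):(t + H)]))
  # row for day t: x_t, x_{t-1}, ..., x_{t-k_max+1} (most recent first)
  X <- do.call(rbind, lapply(t_idx, function(t) x[t:(t - k_max + 1)]))
  list(y = y, X = X)
}

# Toy example with 10 "days"
d <- build_midas_data(rv = 1:10, x = 1:10, H = 2, k_max = 3)
d$y[1]     # 9, i.e. rv[4] + rv[5]
d$X[1, ]   # 3 2 1
```

Each row of `X` can then be combined with a weight vector to form the weighted regressor on the right-hand side of the MIDAS regression.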

3 Distributional properties of realized volatility and returns

My data set consists of daily returns and realized volatilities for the Shanghai Composite Index of the Chinese stock market from 26 July 1999 to 26 April 2010. All the raw data come from the Chinese data company Biaopuyonghua. First, I calculate the log return from the high-frequency data as

$$r_{t,t-1/m} = \log(P_t) - \log(P_{t-1/m}).$$

The log transformation helps to reduce the effect of extreme values. Then the daily realized volatility is constructed by summing all the 5-min squared returns in a day:

$$Q^{(m)}_{t,t-1} = \sum_{j=1}^{m} r_j(t)^2.$$

The daily absolute volatility is formed by summing the 5-min absolute returns in a day:

$$P^{(m)}_{t,t-1} = \sum_{j=1}^{m} |r_j(t)|.$$

After excluding holidays and weekends, the sample contains 124,464 trading records over 2,593 days, with 48 records per day. Time series plots of the log return, realized volatility and absolute volatility are given below in Figures 2 to 5.

Figure 2: The time series plot of RV (realized volatility)

Figure 3: The plot of the square root of RV (realized volatility)

Figure 4: The plot of the RAV series

The figures above show that volatility is higher from 2007 to 2010. All three daily volatility measures have a similar shape and a few extreme values, and their histograms show heavy-tailed distributions. The RAV series and the square root of RV display the volatility in more detail, with less weight on extreme values. In the RV plot, the squaring of returns makes the extreme values stand out more than in the other plots. I think this is why the absolute return explains future volatility best among the regressors.


Figure 5 shows the log return, calculated as $r_{t,t-1} = \log(P_t) - \log(P_{t-1})$. The plot shows a wider range between 2007 and 2010, at the end of the series. The histogram shows that the data are concentrated symmetrically around zero but are not normally distributed.

Figure 5: The plot of log returns

Table 1 gives descriptive statistics for RV, RAV and the return. The mean and the median of each series are close to each other. Both the ranges and the variances of the series indicate that the data are tightly concentrated.

Table 1: Descriptive statistics of RV, RAV and return. RV, the square root of RV and RAV are daily volatility measures summed from intra-day data.

            RV                  Square root of RV          RAV                 Log return
            $Q^{(m)}_{t,t-1}$   $\sqrt{Q^{(m)}_{t,t-1}}$   $P^{(m)}_{t,t-1}$   $r_{t,t-1/m}$
Mean        1.78×10^-4          0.0114                     0.06                4.99×10^-6
Median      9.24×10^-5          0.00961                    0.05                -7.18×10^-6
Maximum     3.87×10^-3          0.0623                     0.36                0.088
Minimum     3.72×10^-6          0.00193                    0.01                -0.043
Variance    6.31×10^-8          4.75×10^-5                 0.014               5.08×10^-6


4 The result of regressions

4.1 The results of daily volatility forecast with high‐frequency sample

Two MIDAS regressions with high-frequency regressors are compared in the daily volatility forecast:

$$Q^{(Hm)}_{t+H,t} = \mu_H + \phi_H \sum_{k=0}^{k^{\max}} b_H(k,\theta) \left[ r^{(m)}_{t-k/m,\,t-(k-1)/m} \right]^2 + \varepsilon_{Ht} \qquad (1)$$

$$Q^{(Hm)}_{t+H,t} = \mu_H + \phi_H \sum_{k=0}^{k^{\max}} b_H(k,\theta) \left| r^{(m)}_{t-k/m,\,t-(k-1)/m} \right| + \varepsilon_{Ht} \qquad (2)$$

Here $Q^{(Hm)}_{t+H,t}$ is the quadratic variation to be predicted, defined as

$$Q^{(Hm)}_{t+H,t} = \sum_{j=1}^{Hm} \left[ r_{(t+H)-(j-1)/m,\,(t+H)-(j-2)/m} \right]^2.$$

In models (1) and (2), $r_{t,t-1/m} = \log(P_t) - \log(P_{t-1/m})$ is the return used in all the models above, and $P_t$ is the price at time $t$.

When past daily realized volatility and realized power are used as regressors, the regression models become

$$Q^{(Hm)}_{t+H,t} = \mu_H + \phi_H \sum_{k=0}^{k^{\max}} b_H(k,\theta)\, Q^{(m)}_{t-k,t-k-1} + \varepsilon_{Ht} \qquad (3)$$

$$Q^{(Hm)}_{t+H,t} = \mu_H + \phi_H \sum_{k=0}^{k^{\max}} b_H(k,\theta)\, P^{(m)}_{t-k,t-k-1} + \varepsilon_{Ht} \qquad (4)$$
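A regression of this form is nonlinear in the weight parameter, so it has to be estimated numerically. The following R sketch illustrates one way to do this by nonlinear least squares on simulated stand-in data. It is not the `bnls_restr` routine used in the appendix, whose definition is not shown. It assumes $\theta_1 = 1$ is held fixed (as the appendix calls `beta_weights(agx,1,a[3])` suggest) and, as an additional assumption of this sketch, restricts $\theta_2 > 1$ through the reparameterization $\theta_2 = 1 + \exp(\cdot)$. All names are hypothetical.

```r
set.seed(1)

# Beta-lag weights with theta1 fixed at 1 (assumption of this sketch)
beta_w <- function(k_max, theta2) {
  f <- dbeta((1:k_max) / k_max, 1, theta2)
  f / sum(f)
}

# Simulated stand-in data: n observations, k_max lagged daily regressors per row
n <- 400
k_max <- 10
X <- matrix(rexp(n * k_max), nrow = n, ncol = k_max)
y <- 0.5 + 2 * as.vector(X %*% beta_w(k_max, 4)) + rnorm(n, sd = 0.05)

# Sum of squared errors as a function of (mu, phi, log(theta2 - 1))
sse <- function(par) {
  theta2 <- 1 + exp(par[3])   # keeps theta2 > 1, i.e. declining weights
  fitted <- par[1] + par[2] * as.vector(X %*% beta_w(k_max, theta2))
  sum((y - fitted)^2)
}

start <- c(mean(y), 1, 0)
est <- optim(start, sse)      # Nelder-Mead by default
mu_hat     <- est$par[1]
phi_hat    <- est$par[2]
theta2_hat <- 1 + exp(est$par[3])
mse        <- est$value / n   # in-sample mean squared error
```

Because the weights sum to one, $\phi_H$ acts as the overall slope on the weighted regressor, while $\theta_2$ only controls how quickly the weights decline across lags.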

In this part, I use 5-min data to forecast daily volatility directly, which means that for $H = 1$ the measures $Q^{(Hm)}_{t+H,t}$ and $P^{(Hm)}_{t+H,t}$ are each summed from the 48 five-minute observations of a trading day. I compare the results for different numbers of lagged days and look for the lag length with the smallest MSE. The lag lengths form the grid 1, 5, 10, 15, ..., 260, 53 values in total. A lag of 1 day means that there are only 48 regressors $r^{(m)}_{t-k/m,\,t-(k-1)/m}$ in models (1) and (2), and $k^{\max} = 1$ in models (3) and (4) (see endnote i).
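The lag grid and the corresponding number of intra-day regressors can be written out directly in R. The variable names below are illustrative only.

```r
# Lag lengths in days: 1, 5, 10, ..., 260
lag_days <- c(1, seq(5, 260, by = 5))
m <- 48                        # five-minute returns per trading day
n_regressors <- lag_days * m   # intra-day regressors in models (1) and (2)

length(lag_days)     # 53
n_regressors[1]      # 48
```

For models (3) and (4), the number of regressors is simply the lag length in days, since each day contributes one daily volatility value.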

The in-sample results use the whole data period, from 26 July 1999 to 26 April 2010. Because all four models have the same left-hand side, they can be compared by the MSE shown in Figure 6. According to the left panel, model (1) gives the better result. The panel also shows that the fit follows the same trend for both regressors as the number of lags increases: it starts at 1 lagged day and improves dramatically at 5 lagged days. There is a drop at 145 lagged days, which may be due to half-year reports in the stock market, whereas the results for lags of more than one year show no significant annual-report effect. The right panel, for lagged daily volatility, shows a similar trend to the left one, and both have the large drop at 145. However, using daily volatility gives a smaller MSE. Comparing the two panels, RAV (realized absolute variation) outperforms the others.

In all the regressions, the estimated values of $\mu_H$ and $\phi_H$ change little. Detailed results are given in the appendix tables.

Figure 6: In-sample results at the 1-day forecast horizon with $|r|$, $r^2$, RAV and RV

The two panels show the results of the four regression models for the period from 26 July 1999 to 26 April 2010. In these models H=1.

Figure 7 shows the out-of-sample MSE for all four regressors at the 1-day forecast horizon. The out-of-sample exercise uses a ten-year data set, from 26 July 1999 to 26 July 2009. I use the first 9 years to estimate the parameters and then calculate the difference between the realized and the predicted volatilities in the 10th year. In the left panel, RV and RAV give very similar results, and the absolute 5-min return performs much better than the squared 5-min return. The right panel shows that RAV performs better than RV. Both panels are steady once the number of lagged days reaches 10, and neither shows the drop at about 145 lagged days seen in the in-sample results. Compared with the in-sample results, the out-of-sample MSE is larger. Overall, RAV performs best at the 1-day forecast horizon.
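The estimate-then-evaluate logic of the out-of-sample exercise can be sketched as follows. For brevity this sketch treats the weighted regressor `xw` as already computed (that is, the weight parameter is taken as given) and fits only the intercept and slope by ordinary least squares, which is a simplification of the full MIDAS estimation. The function name `oos_mse` and the toy data are hypothetical.

```r
# Hypothetical helper: fit on the first n_train observations,
# then compute the MSE of the forecasts on the remaining ones.
oos_mse <- function(y, xw, n_train) {
  n <- length(y)
  stopifnot(length(xw) == n, n_train < n)
  train <- data.frame(y = y[1:n_train], xw = xw[1:n_train])
  fit <- lm(y ~ xw, data = train)   # estimation sample only
  test_idx <- (n_train + 1):n
  pred <- predict(fit, newdata = data.frame(xw = xw[test_idx]))
  mean((y[test_idx] - pred)^2)      # forecast errors on held-out data
}

# Toy check: an exact linear relation gives an out-of-sample MSE of (numerically) zero
xw_toy <- c(1, 3, 2, 5, 4, 6, 8, 7, 9, 10)
y_toy  <- 1 + 2 * xw_toy
mse_toy <- oos_mse(y_toy, xw_toy, n_train = 8)
```

The key point is that the parameters are never re-estimated on the evaluation period, so the resulting MSE measures genuine forecast performance rather than in-sample fit.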


Figure 7: Out-of-sample results at the 1-day forecast horizon with $|r|$, $r^2$, RAV and RV

The two panels show the out-of-sample results. I use the data from 26 July 1999 to 26 July 2008 to estimate the model parameters and calculate the MSE (mean square error) over the period from 26 July 2008 to 26 July 2009. The forecast horizon is 1 day, that is, H=1 in the models. The left panel shows the results for all the regressors, but RV and RAV cannot be distinguished clearly because their difference is small relative to the scale of the y axis. The right panel compares RV and RAV, with RAV performing better.

4.2 The comparison of regression fitness at different horizons with fixed regressor

In general, there are two ways to forecast weekly volatility in my work. One uses past daily volatility, namely realized volatility and realized power. The other uses the 5-min data directly, which gives the regression a large number of lags.

To predict weekly future volatility, I set $H=5$ in equations (3) and (4). The regressor on the right-hand side of model (3) is

$$Q^{(m)}_{t-k,t-k-1} = \sum_{j=1}^{m} \left[ r_{(t-k)-(j-1)/m,\,(t-k)-(j-2)/m} \right]^2,$$

with $r_{t,t-1/m} = \log(P_t) - \log(P_{t-1/m})$. Since there are 48 five-minute records in one day, $m = 48$ in the models. The left-hand side is the sum of the daily volatilities,

$$Q^{(Hm)}_{t+H,t} = \sum_{j=1}^{H} Q^{(m)}_{t+j,t+j-1}.$$

In equation (4), the regressor is computed by summing intra-day absolute returns,

$$P^{(m)}_{t-k,t-k-1} = \sum_{j=1}^{m} \left| r_{(t-k)-(j-1)/m,\,(t-k)-(j-2)/m} \right|,$$

and the left-hand side is the same as in equation (3).

I first choose one week of daily lags, which means $k^{\max} = 5$ in equations (3) and (4), with 5 daily volatility values on the right-hand side. In equations (1) and (2), which use the 5-min data, this corresponds to $k^{\max} = 48 \times 5 = 240$. I then enlarge the number of lags to 50 daily volatilities, following Ghysels et al. (2003), so that $k^{\max} = 50$ in (3) and (4) and $k^{\max} = 48 \times 50 = 2400$ in (1) and (2). Finally, I use 200 lags on the right-hand side, so that $k^{\max} = 200$ in (3) and (4), which corresponds to $48 \times 200 = 9600$ five-minute observations in (1) and (2). I estimate the parameters from 9 years of data and then compute the MSE of the four models at lag lengths 10, 20, 30, ..., 200, in steps of 10. The detailed results are given in the appendix, and the figures below summarize them.

In this part, I examine how the in-sample fit varies across forecast horizons. Figures 8 to 11 cover the four regressors: absolute return, squared return, daily realized volatility and daily realized power. For each regressor, two plots show the fit in terms of MSE and R-squared. All the figures show a similar pattern.

In terms of MSE, shorter forecast horizons have a smaller mean square error. Monthly volatility sums returns over the longest period, so more regression error accumulates, and this leads to the largest mean square error. R-squared is therefore probably the better measure for comparing fit across forecast horizons.
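The reason MSE is hard to compare across horizons while R-squared is not can be seen in a small R sketch: MSE depends on the scale of the target, whereas R-squared is scale-free. The function name `fit_stats` and the toy numbers are hypothetical.

```r
# Hypothetical helper: MSE and R-squared of a set of fitted values
fit_stats <- function(y, fitted) {
  resid <- y - fitted
  c(mse = mean(resid^2),
    r2  = 1 - sum(resid^2) / sum((y - mean(y))^2))
}

y_toy <- c(1, 2, 3, 4, 5)
f_toy <- c(1.1, 1.9, 3.2, 3.8, 5.0)

s1 <- fit_stats(y_toy, f_toy)
# Scaling both target and fit by 10 mimics a target that aggregates more returns
s10 <- fit_stats(10 * y_toy, 10 * f_toy)
```

In this toy case the MSE grows by a factor of 100 while R-squared is unchanged, which is why a larger MSE at the monthly horizon does not by itself indicate a worse fit.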

The R-squared plots in Figures 8 and 9 show that the monthly horizon has the highest value and the 2-week horizon the second highest. The 1-week and 2-week horizons become steady once the number of lags reaches 30, and the 2-week horizon is fitted better than the 1-week horizon in all the figures. The 3-week and 4-week horizons show a seasonal pattern in the plots. The absolute return and the squared return perform best at the monthly horizon and perform quite similarly at the 2-week and 3-week horizons. Although the smallest MSE occurs at the 1-week horizon, R-squared shows the opposite: it is lowest at the 1-week horizon. The squared return gives results similar to the absolute return, but with a large dispersion at the 3-week horizon. Both daily volatilities, RV and RAV, fit well from 2 weeks to 4 weeks. At the 3-week horizon, the period of the seasonal effect is 15 days, the same as the forecast horizon, so the MSE is lowest at lag lengths 10, 25, 40, and so on. At the 1-month horizon the period is 20 days, and the error is smallest at 10, 30 and 50 lags. Comparing MSE and R-squared, RAV has the smallest in-sample MSE at all horizons. The detailed values are given in the appendix.

Figure 8: The In-Sample MSE Fitness of 4 Forecast Horizons With regressor $|r|$

Figure 9: The In-Sample Fit of 4 Forecast Horizons With regressor $r^2$

Figure 10: The In-Sample Fit of 4 Forecast Horizons With regressor RV

Figure 11: The In-Sample Fit of 1 Month Forecast Horizons With regressor RAV

4.3 The out-of-sample comparison of different regressors with fixed forecast horizons

In this part, I focus on the out-of-sample performance of the different regressors in all the models.

Figure 12 summarizes how the out-of-sample fit changes as the number of lags increases. Comparing the four regressors at all forecast horizons, RAV performs best. In all the plots, RV and RAV have very similar MSE values. In the first two plots, the MSE becomes steady once the number of lags reaches 20. The squared 5-min return performs worst, with the largest MSE. At the 3-week and 1-month horizons, the third and fourth plots show a seasonal pattern for all regressors. The last two plots show the seasonal pattern of the absolute 5-min return, which is too small to be seen in the first four plots. At the 3-week horizon, the plots show a period of 30 days for both the absolute and the squared return, although the smallest period is in fact 15 days, the same as the forecast horizon. At the monthly horizon, the period is 20 days. However, models (1) and (2), which use the 5-min returns, have their best-fitting seasonal lags at opposite points: at the 3-week horizon, for example, the absolute 5-min return attains its smallest MSE at lags 20, 35, 50, ..., while the squared return attains its largest periodic MSE at the same points.

When the number of lags reaches 130, the MSE of the absolute 5-min return (see the last two plots) declines slightly at the 3-week and 1-month forecast horizons. Figure 13 shows the performance of RV and RAV.

Figure 12: The MSE Out-of-Sample Fit of Weeks Forecast with different lags

The results are obtained using the sample from 26 July 1999 to 26 April 2010 and are shown as MSE values.

Figure 13 plots RV and RAV at all horizons. Figure 12 compares the 5-min data with daily volatility and shows that the daily volatilities perform better. Figure 13 shows more clearly how the daily volatilities work in MIDAS models. In all the plots, RAV performs better than RV, and the two follow the same trend. At the 3-week and 1-month horizons, RV is not steady as the number of lagged days increases, whereas RAV is. Both show a seasonal pattern at these horizons with the same period as the forecast horizon: 15 days at 3 weeks and 20 days at 1 month.

In general, Figures 8 and 9 show that the MSE of all regressors increases with the forecast horizon. Increasing the number of lags does not significantly improve the fit, especially when RAV is used, so the 50 lags used in Ghysels et al. (2004) can be enough for prediction. In the out-of-sample results, RAV gives the best regression results.

Figure 13: The Out-of-Sample MSE Fitness of all Forecast Horizons with RV and RAV

The results are obtained using the sample from 26 July 1999 to 26 April 2010 and are shown as MSE values.

5 Conclusions

MIDAS regression is a very good method for forecasting future volatility. My approach is to compare forecasting models with two different measures of volatility and with different frequencies and lag lengths. The main focus of this paper is forecasting with a high-frequency sample, using 5-min data directly. Because the MIDAS framework can be applied in a wide range of empirical investigations, I can compare high-frequency measures with daily measures; and because it allows different measures of volatility, I can also compare different regressors. Across all the comparisons in this paper, realized power has the best fit and is simple, robust and parsimonious.

There are several findings on the predictability of daily to monthly realized volatility in the Chinese market. First, the squared return outperforms the absolute return in forecasting daily volatility from high-frequency data with lags of 1 to 260 days. Although the fit is not very good, the MIDAS method is quite robust, in that the results differ little across frequencies and forecast horizons. With 50 lags the regression reaches a steady fit, so a short lag length is useful in the daily forecast regressions, since longer lags improve the fit only very slowly. Second, the intra-day data used in MIDAS forecasts are reliable for realized power and the absolute return, and using daily realized power is more reliable than using the 5-min data directly or the other regressors. Third, at the 3-week and 4-week horizons, all the regressors show a seasonal effect with the same period as the forecast horizon. My suggestion is therefore to try different lag lengths near 50 (which is long enough for a good fit) and pick the one with the best fit. Last, for a fixed regressor, the fit differs across forecast horizons. The conclusion is that the 2-week to 1-month horizons outperform the 1-week horizon, even though the 1-week horizon has the lowest MSE in all the models.


References

[1] Alizadeh, S., M. Brandt and F. X. Diebold (2002), “Range-based estimation of stochastic volatility models”, Journal of Finance, 57, 1047-1091.

[2] Andersen, T. and T. Bollerslev (1998), “Answering the Skeptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts”, International Economic Review, 39, 885-905.

[3] Andersen, T., T. Bollerslev, F. X. Diebold and P. Labys (2001), “The Distribution of Realized Exchange Rate Volatility”, Journal of the American Statistical Association, 96, 42-55.

[4] Andersen, T. G., T. Bollerslev and F. X. Diebold (2002), “Parametric and Nonparametric Volatility Measurement”, in L. P. Hansen and Y. Ait-Sahalia (eds.), Handbook of Financial Econometrics, Amsterdam: North-Holland, forthcoming.

[5] Andersen, T., T. Bollerslev, F. X. Diebold and P. Labys (2003), “Modeling and Forecasting Realized Volatility”, Econometrica, 71, 579-625.

[6] Barndorff-Nielsen, O. and N. Shephard (2001), “Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics (with discussion)”, Journal of the Royal Statistical Society, Series B, 63, 167-241.

[7] Barndorff-Nielsen, O. and N. Shephard (2002a), “Econometric analysis of realized volatility and its use in estimating stochastic volatility models”, Journal of the Royal Statistical Society, Series B, 64, 253-280.

[8] Barndorff-Nielsen, O. and N. Shephard (2003a), “How accurate is the asymptotic approximation to the distribution of realized volatility?”, in D. W. K. Andrews, J. Powell, P. Ruud and J. Stock (eds.), Identification and Inference for Econometric Models, A Festschrift for Tom Rothenberg, Cambridge University Press.

[9] Barndorff-Nielsen, O. and N. Shephard (2003b), “Realised power variation and stochastic volatility”, Bernoulli, 9, 243-265.

[10] Barndorff-Nielsen, O. and N. Shephard (2004), “Power and bipower variation with stochastic volatility and jumps (with discussion)”, Journal of Financial Econometrics, 2, 1-48.

[11] Ding, Z., C. W. J. Granger and R. F. Engle (1993), “A long memory property of stock market returns and a new model”, Journal of Empirical Finance, 1, 83-106.

[12] Engle, R. F. and G. Gallo (2003), “A Multiple Indicator Model for Volatility Using Intra Daily Data”, Discussion Paper, NYU and University di Firenze.

[13] Engle, R. F. (1982), “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation”, Econometrica, 50, 987-1007.

[14] Gallant, A. R., C.-T. Hsu and G. Tauchen (1999), “Using Daily Range Data to Calibrate Volatility Diffusions and Extract the Forward Integrated Variance”, Review of Economics and Statistics, 81, 617-631.

[15] Ghysels, E., P. Santa-Clara and R. Valkanov (2002a), “There is a Risk-return Tradeoff after all”, Journal of Financial Economics.

[16] Ghysels, E., P. Santa-Clara and R. Valkanov (2002b), “The MIDAS Touch: Mixed Data Sampling Regression”.

[17] Ghysels, E., P. Santa-Clara, A. Sinko and R. Valkanov (2004), “MIDAS Regressions: Results and New Directions”.

[18] Woerner, J. (2002), “Variational sums and power variation: a unifying approach to model selection and estimation in semimartingale models”, Discussion Paper, Oxford University.


Appendix

Table 1: The detailed in-sample results of regressions (1) and (2) at the 1-day forecast horizon (H=1)

Table 2: The in-sample results at the 1-day forecast horizon with RV and RAV

Table 3: The out-of-sample results at the 1-day forecast horizon with RAV, squared return, RV and absolute return

Table 4: The in-sample results at the 1-week forecast horizon with regressors $|r|$, $r^2$, RAV, RV

Table 5: The in-sample results at the 2-week forecast horizon with regressors $|r|$, $r^2$, RAV, RV

Table 6: The in-sample results at the 3-week forecast horizon with regressors $|r|$, $r^2$, RAV, RV

Table 7: The in-sample results at the 1-month forecast horizon with regressors $|r|$, $r^2$, RAV, RV

Table 8: The out-of-sample MSE results with RV and RAV at all forecast horizons

Table 9: The out-of-sample MSE results with $|r|$, $r^2$ at all forecast horizons

R code

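##########OUT OF SAMPLE###################
# The helper functions datageneration_x, datageneration_x_2, datageneration_y,
# datageneration_y_2, bnls_restr and beta_weights, and the data objects
# return_diff_l, rv and rav, are not reproduced in this listing.
# Arguments of every function below:
#   l_day: vector of lag lengths in days
#   ylag:  forecast horizon in days

# Out-of-sample MSE with 5-min absolute returns as regressors
outresult_absr=function(l_day,ylag){
  MSE=c()
  ret=c(0,return_diff_l)
  for(j in 1:length(l_day)){
    agy=ylag*48        # horizon in 5-min intervals
    agx=l_day[j]*48    # number of lagged 5-min returns
    error=c()
    # estimation sample: first 104016 five-minute returns (2167 days)
    x=datageneration_x(ret[1:104016],agy,agx)
    y=datageneration_y(ret[1:104016],agy,agx)
    a=bnls_restr(x,y)  # a[1], a[2], a[3]: intercept, slope, Beta weight parameter
    for(i in 1:11664){ # 243 days x 48 intervals
      hv=a[1]+a[2]*(t(as.matrix(ret[(104015+i):(104016+i-agx)]))%*%
        as.matrix(beta_weights(agx,1,a[3])))
      v=sum(ret[(11663+i):(11664+i+agy)])
      error[i]=hv-v    # forecast minus comparison value
    }
    MSE[j]=sum(error^2)/243
  }
  return(MSE)
}

# Out-of-sample MSE with daily RV as regressor
outresult_rv=function(l_day,ylag){
  MSE=c()
  for(j in 1:length(l_day)){
    agy=ylag
    agx=l_day[j]
    error=c()
    # estimation sample: first 2167 days
    x=datageneration_x(rv[1:2167],agy,agx)
    y=datageneration_y_2(rv[1:2167],agy,agx)
    a=bnls_restr(x,y)
    for(i in 1:243){   # evaluation sample: the following 243 days
      hv=a[1]+a[2]*(t(as.matrix(rv[(2166+i):(2167+i-agx)]))%*%
        as.matrix(beta_weights(agx,1,a[3])))
      v=sum(rv[(2167+i):(2166+i+agy)])  # realized volatility over the next agy days
      error[i]=hv-v
    }
    MSE[j]=sum(error^2)/243
  }
  return(MSE)
}

# Out-of-sample MSE with 5-min squared returns as regressors
outresult_sqrr=function(l_day,ylag){
  MSE=c()
  ret=c(0,return_diff_l)
  for(j in 1:length(l_day)){
    agy=ylag*48
    agx=l_day[j]*48
    error=c()
    x=datageneration_x_2(ret[1:104016],agy,agx)
    y=datageneration_y(ret[1:104016],agy,agx)
    a=bnls_restr(x,y)
    for(i in 1:11664){
      hv=a[1]+a[2]*(t(as.matrix(ret[(104015+i):(104016+i-agx)]))%*%
        as.matrix(beta_weights(agx,1,a[3])))
      v=sum(ret[(11663+i):(11664+i+agy)])
      error[i]=hv-v
    }
    MSE[j]=sum(error^2)/243
  }
  return(MSE)
}

# Out-of-sample MSE with daily RAV as regressor (the target is still RV)
outresult_rav=function(l_day,ylag){
  MSE=c()
  for(j in 1:length(l_day)){
    agy=ylag
    agx=l_day[j]
    error=c()
    x=datageneration_x(rav[1:2167],agy,agx)
    y=datageneration_y_2(rv[1:2167],agy,agx)
    a=bnls_restr(x,y)
    for(i in 1:243){
      hv=a[1]+a[2]*(t(as.matrix(rav[(2166+i):(2167+i-agx)]))%*%
        as.matrix(beta_weights(agx,1,a[3])))
      v=sum(rv[(2167+i):(2166+i+agy)])
      error[i]=hv-v
    }
    MSE[j]=sum(error^2)/243
  }
  return(MSE)
}

i In my R code, this setting corresponds to “agy=48, agx=48”.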
