
17 November 2008. Vijayamohan: CDS MPhil: Time Series 6

Time Series Econometrics 6

Vijayamohanan Pillai N

Box-Jenkins Methodology: ARIMA Modelling


ARIMA Modelling: Box-Jenkins Methodology:

AR models: first introduced by Yule (1926) and later generalized by Walker (1931);

MA models: first used by Slutzky (1937).

Wold (1938): theoretical foundation of combined ARMA processes.

George Udny Yule (1871 – 1951), Scottish statistician

Evgeny Evgenievich Slutzky (1880 – 1948), Russian statistician

Herman Ole Andreas Wold (1908 – 1992), Swedish statistician


George Box and Gwilym Jenkins (1970; 1976): comprehensively put together all the threads of ARIMA modelling.

The term ‘time series/ARIMA modelling’ = ‘Box-Jenkins methodology’.





Objective of Box-Jenkins methodology: obtain a parsimonious model: one that describes all the features of the data of interest using as few parameters (or as simple a model) as possible.

Ockham’s razor: Lex parsimoniae:

‘Entia non sunt multiplicanda praeter necessitatem’
(Entities are not to be multiplied beyond necessity).

William of Ockham (1285 – 1347/49): English Franciscan philosopher


Variance of estimators is inversely proportional to the number of degrees of freedom:

$\widehat{Var}(\hat{\beta}) = \hat{\sigma}_u^2 (X'X)^{-1}$, where $\hat{\sigma}_u^2 = \sum_t \hat{u}_t^2 / (T - k)$.

Larger k → higher SE → smaller t-value → not rejecting a false null: Type 2 error.


Basic steps:

1. Identification of a tentative model (successive differencing to achieve stationarity);
2. Estimation of the model; and
3. Diagnostic checking.

Involves more of a judgmental procedure than the use of any clear-cut rules: trial and error.
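As a rough illustration of the differencing used in step 1, the sketch below applies first differences d times to a plain Python list. The function name `difference` and the sample series are illustrative assumptions, not part of the original slides.

```python
def difference(series, d=1):
    """Apply first differencing d times: x_t - x_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# Hypothetical series with a quadratic trend: t^2 for t = 1..6
y = [1, 4, 9, 16, 25, 36]
d1 = difference(y, 1)  # a linear trend remains
d2 = difference(y, 2)  # constant, so no trend remains
print(d1, d2)
```

Each round of differencing costs one observation, which is one reason to difference only as often as stationarity requires.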


Box-Jenkins Methodology (flowchart):

1. Plot the series.
2. Does the series appear stationary?
   No: apply transformation/differencing and plot the series again.
   Yes: obtain ACF and PACF.
3. Does the correlogram (ACF) decay to zero?
   No: apply transformation/differencing and return to step 1.
   Yes: go to identification.
4. Identification (model selection: check ACF and PACF).
   Is there a sharp cut-off in ACF? Yes: MA.
   No: is there a sharp cut-off in PACF? Yes: AR. No: ARMA.
5. Estimate parameter values.
6. Diagnosis: are the residuals white noise?
   No: modify the model and return to step 4.
   Yes: forecast.

Identification of ARMA models:

Process | ACF | PACF
White noise, ARIMA(0,0,0) | No significant spikes | No significant spikes
DSP (difference-stationary process), ARIMA(0,1,0) | Slow decay | One significant spike

Autoregressive processes ARIMA(p,0,0)
AR(1), φ1 > 0 | Exponential decay: +ve spikes | 1 +ve spike at lag 1
AR(1), φ1 < 0 | Oscillatory decay, starts with –ve spike | 1 –ve spike at lag 1
AR(2), φ1, φ2 > 0 | Exponential decay, +ve spikes | 2 +ve spikes at lags 1 and 2
AR(2), φ1 < 0, φ2 > 0 | Oscillating exponential decay | 1 –ve spike at lag 1 and 1 +ve spike at lag 2
AR(p) | Decays toward zero; coefficients may oscillate | Spikes up to lag p


Process | ACF | PACF

Moving average processes ARIMA(0,0,q)
(The signs imply the Box-Jenkins convention, y_t = ε_t − θ1 ε_{t−1} − …, so a positive θ1 gives a negative spike.)
MA(1), θ1 > 0 | 1 –ve spike at lag 1 | Exponential decay of –ve spikes
MA(1), θ1 < 0 | 1 +ve spike at lag 1 | Oscillatory exponential decay of +ve and –ve spikes
MA(2), θ1, θ2 > 0 | 2 –ve spikes at lags 1 and 2 | Exponential decay of –ve spikes
MA(2), θ1, θ2 < 0 | 2 +ve spikes at lags 1 and 2 | Oscillating exponential decay of +ve and –ve spikes
MA(q) | Spikes up to lag q | Exponential/oscillating decay




Process | ACF | PACF

Hybrid processes ARIMA(p,0,q)
ARIMA(1,0,1), φ1 > 0, θ1 > 0 | Exponential decay of +ve spikes |
ARIMA(1,0,1), φ1 > 0, θ1 < 0 | | Oscillatory exponential decay of +ve and –ve spikes
ARIMA(1,0,1), φ1 < 0, θ1 > 0 | Oscillatory exponential decay | Exponential decay of –ve spikes
ARIMA(1,0,1), φ1 < 0, θ1 < 0 | Oscillatory exponential decay of –ve and +ve spikes | Oscillating exponential decay of –ve and +ve spikes
ARIMA(p,d,q) | Decay (either direct or oscillatory) beginning at lag q | Decay (either direct or oscillatory) beginning at lag p
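The ACF and PACF patterns tabulated above can be checked numerically. The following is a minimal sketch, assuming a simulated AR(1) with coefficient 0.6 and standard normal errors; the function names `sample_acf`, `sample_pacf` and `simulate_ar1`, the seed and the sample size are illustrative choices, not taken from the slides. The PACF is obtained from the ACF by the Durbin-Levinson recursion.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag (r_0 = 1 is omitted)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def sample_pacf(acf):
    """Partial autocorrelations from r_1..r_K via the Durbin-Levinson recursion."""
    pacf, prev = [], []
    for k in range(1, len(acf) + 1):
        if k == 1:
            phi_kk, cur = acf[0], [acf[0]]
        else:
            num = acf[k - 1] - sum(prev[j] * acf[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * acf[j] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
        prev = cur
    return pacf

def simulate_ar1(phi, n, seed=0):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal e_t."""
    rng = random.Random(seed)
    y, last = [], 0.0
    for _ in range(n):
        last = phi * last + rng.gauss(0.0, 1.0)
        y.append(last)
    return y

y = simulate_ar1(phi=0.6, n=2000)
acf = sample_acf(y, 5)
pacf = sample_pacf(acf)
print([round(r, 3) for r in acf])
print([round(r, 3) for r in pacf])
```

For an AR(1) with a positive coefficient, the printed ACF should decay roughly geometrically while the PACF should show one clear spike at lag 1, in line with the AR(1) row of the table.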




Using ACF and PACF:

Apply tests of significance on ACF and PACF (i.e., for stationarity):

For individual ACF/PACF coefficients, the 95% confidence interval is:

$\pm 1.96 / \sqrt{T}$

H0: the autocorrelation (or partial autocorrelation) coefficient at lag k = 0 is rejected if the sample ACC/PACC falls outside this region for any k.


ACF and PACF for a sample of 100 observations:

Lag    1      2      3      4      5      6      7      8      9
ACF    0.321  0.301  0.215  0.142  -0.02  -0.01  0.005  0.001  0.011
PACF   0.321  0.155  0.131  0.102  0.085  0.051  0.008  0.014  0.005

95% confidence interval = $\pm 1.96/\sqrt{100} = \pm 1.96/10$ = (−0.196, +0.196).

ACF fast declining and PACF one spike at lag 1 → AR(1)
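The significance check on this example can be sketched in a few lines. The ACF and PACF values are the ones listed for the 100-observation sample; the helper name `significant_lags` is an illustrative assumption.

```python
import math

T = 100
acf = [0.321, 0.301, 0.215, 0.142, -0.02, -0.01, 0.005, 0.001, 0.011]
pacf = [0.321, 0.155, 0.131, 0.102, 0.085, 0.051, 0.008, 0.014, 0.005]

def significant_lags(coeffs, n_obs, z=1.96):
    """Lags whose coefficient lies outside the band +/- z / sqrt(n_obs)."""
    band = z / math.sqrt(n_obs)
    return [lag for lag, r in enumerate(coeffs, start=1) if abs(r) > band]

acf_lags = significant_lags(acf, T)
pacf_lags = significant_lags(pacf, T)
print(acf_lags)   # [1, 2, 3]
print(pacf_lags)  # [1]
```

Several significant but declining ACF coefficients together with a single significant PACF coefficient is the pattern read as AR(1) here.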



ACF and PACF for a sample of 100 observations:

Lag    1      2      3      4      5      6      7      8      9
ACF    0.321  0.155  0.131  0.102  0.085  0.051  0.008  0.014  0.005
PACF   0.321  0.301  0.215  0.142  -0.02  -0.01  0.005  0.001  0.011

95% confidence interval = $\pm 1.96/\sqrt{100} = \pm 1.96/10$ = (−0.196, +0.196).

ACF one spike at lag 1 and PACF fast declining → MA(1)




2. Estimation:

AR model: straightforward: OLS method;
MA model: ML method;
ARMA model: ML method for the MA component.
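To show why the AR case is straightforward, here is a minimal sketch of OLS for an AR(1) with a constant: regress y_t on y_{t-1} using the usual simple-regression formulas. The function names, the simulated data (true coefficient 0.6) and the seed are assumptions made for illustration only.

```python
import random

def ols_ar1(y):
    """Regress y_t on a constant and y_{t-1}; return (constant, slope)."""
    x, z = y[:-1], y[1:]          # regressor y_{t-1} and regressand y_t
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    slope = sxz / sxx
    return mz - slope * mx, slope

def simulate_ar1(phi, n, seed=1):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal e_t."""
    rng = random.Random(seed)
    y, last = [], 0.0
    for _ in range(n):
        last = phi * last + rng.gauss(0.0, 1.0)
        y.append(last)
    return y

const, phi_hat = ols_ar1(simulate_ar1(phi=0.6, n=2000))
print(round(const, 3), round(phi_hat, 3))
```

An MA term has no such regression form because the lagged errors are unobserved, which is why the slides turn to ML for MA and ARMA models.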



3. Diagnostic checking: ‘Model adequacy’ tests:

(a) ‘Residual analysis’ and
(b) ‘Trial overfitting’ (Box and Jenkins)

(i) ‘Trial overfitting’:

If an ARMA(p, q) model is chosen, also estimate an ARMA(p+1, q) model and an ARMA(p, q+1) model and test for significance of the additional parameters.



(ii) ‘Residual analysis’ (Refer: My working paper on ‘Electricity Demand Analysis and Forecasting’)

Residuals of an adequate model = white noise; purely random;

Plot of residuals; check for outliers;

No serial correlation; examine ACF; PACF;

Ljung-Box (1978) portmanteau statistic:

$Q^* \sim \chi^2(k - p - q)$ for an ARMA(p, q), if correctly specified (k = number of residual autocorrelations included).

Small Q* value or large p-value: model adequacy.
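The slides do not spell out the formula for Q*; the sketch below uses the standard Ljung-Box form, Q* = T(T+2) Σ r_j² / (T − j) summed over j = 1, …, k, and compares it with a 5% chi-square critical value. The residual autocorrelations are hypothetical numbers chosen for illustration, and the function name `ljung_box` is an assumption.

```python
# 5% critical values of the chi-square distribution for 1..10 degrees of freedom
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
                  6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307}

def ljung_box(resid_acf, n_obs, p, q):
    """Return (Q*, degrees of freedom) from residual autocorrelations r_1..r_k."""
    k = len(resid_acf)
    q_star = n_obs * (n_obs + 2) * sum(
        r * r / (n_obs - lag) for lag, r in enumerate(resid_acf, start=1))
    return q_star, k - p - q

# Hypothetical residual autocorrelations from an AR(1) fitted to 100 observations
resid_acf = [0.05, -0.03, 0.02, 0.01, -0.04]
q_star, df = ljung_box(resid_acf, n_obs=100, p=1, q=0)
adequate = q_star < CHI2_CRIT_5PCT[df]  # small Q*: no evidence against white noise
print(round(q_star, 3), df, adequate)
```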



Residual analysis

Problem with Q*: not reliable (the same problem as with DW when a lagged endogenous variable is included; hence Durbin’s h):

Residual autocorrelations are biased towards zero when lagged dependent variables are included as regressors in the model; e.g., DW ≈ 2, even when residuals are serially correlated.

Lagrange Multiplier F-test (Harvey 1981):

Small F-value or large p-value → no autocorrelation.

Harvey 1981, The Econometric Analysis of Time Series, Philip Allan, Deddington.



Model Selection Criteria:

If an MA(2) model provides the same fit as an AR(10) model, select the first.

Adding more lags (p, q) necessarily reduces the residual sum of squares: R² ↑: goodness of fit ↑; but degrees of freedom ↓:

Trade-off between ‘Goodness of fit’ and ‘Parsimony’:

Various ‘Information Criteria’ that trade off the two:




Theory of estimation:

Information content (of a parameter) in a random sample is represented by the variance of its (unbiased) estimator: small variance → large information.



‘Information Criteria’: in general

$IC(k) = T \ln \hat{\sigma}^2 + k\{f(T)\}$, for k = 1, …, p+q,

where $\hat{\sigma}^2$ is the estimated error variance, RSS/(T – k); and
k{f(T)} = ‘penalty’ function for increasing the order of the model, for the loss of degrees of freedom.

We choose the ARMA model with the lowest IC.



‘Information Criteria’: in general, $IC(k) = T \ln \hat{\sigma}^2 + k\{f(T)\}$, for k = 1, …, p+q:

(1) f(T) = 2: Akaike (1974) IC: $AIC = T \ln \hat{\sigma}^2 + 2k$
(2) f(T) = ln T: Schwarz (1978) Bayesian IC: $BIC = T \ln \hat{\sigma}^2 + k \ln T$
(3) f(T) = 2 ln(ln T): Hannan-Quinn (1979) IC: $HQIC = T \ln \hat{\sigma}^2 + 2k \ln(\ln T)$



‘Information Criteria’: AIC and BIC compared

BIC gives more weight to k than AIC if T > 7: an increase in k requires a larger reduction in $\hat{\sigma}^2$ under BIC than under AIC.

As ln T > 2 for T > 7 (that is, T > e² ≈ 7.39), BIC selects a model at least as parsimonious as the one AIC selects.




‘Information Criteria’:

(3) f(T) = 2 ln(ln T): Hannan-Quinn (1979) IC: $HQIC = T \ln \hat{\sigma}^2 + 2k \ln(\ln T)$

For HQIC, the weight on k is greater than 2 if T > 15 (that is, ln(ln T) > 1, or T > e^e ≈ 15.15).
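The sample-size thresholds quoted for the penalty weights can be verified directly. This is a small sketch of the general form IC(k) = T ln σ̂² + k f(T) with the three penalty functions; the names `information_criterion`, `PENALTIES` and `first_T_exceeding` are illustrative assumptions.

```python
import math

def information_criterion(sigma2_hat, n_obs, k, penalty):
    """IC(k) = T ln(sigma2_hat) + k f(T) for a given penalty function f."""
    return n_obs * math.log(sigma2_hat) + k * penalty(n_obs)

PENALTIES = {
    "AIC": lambda T: 2.0,
    "BIC": lambda T: math.log(T),
    "HQIC": lambda T: 2.0 * math.log(math.log(T)),
}

def first_T_exceeding(penalty, threshold=2.0, start=3):
    """Smallest integer T >= start whose per-parameter penalty exceeds threshold."""
    T = start
    while penalty(T) <= threshold:
        T += 1
    return T

print(first_T_exceeding(PENALTIES["BIC"]))   # 8
print(first_T_exceeding(PENALTIES["HQIC"]))  # 16
```

So for integer sample sizes the BIC penalty per parameter exceeds the AIC penalty from T = 8 onwards, and the HQIC penalty exceeds 2 from T = 16 onwards.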



Model selection for the Growth of the Annual Index of Output: 1955 – 1980

Lag     1       2        3        4
ACF     0.339   –0.24    –0.524   –0.294
PACF    0.339   –0.402   –0.368   –0.081
SE = 0.196

Model      | RSS    | k | LM F_AC | AIC    | BIC
AR(1)      | 0.7951 | 1 | 4.226   | –88.67 | –87.41
AR(2)      | 0.6715 | 2 | 3.003   | –91.06 | –88.55
MA(1)      | 0.7495 | 1 | 0.057   | –90.21 | –88.95
MA(2)      | 0.7377 | 2 | 1.732   | –88.62 | –86.1
ARMA(1,1)  | 0.7501 | 2 | 2.301   | –88.19 | –85.67
ARMA(2,1)  | 0.5117 | 3 | 4.398** | –96.03 | –92.36
ARMA(1,2)  | 0.6050 | 3 | 7.888** | –91.78 | –88
ARMA(2,2)  | 0.5153 | 4 | 3.719   | –93.95 | –88.92

Note: ** = Significant at 5 % level. Source: Franses, Philip Hans, 1998, Time Series Models for Business and Economic Forecasting, CUP: Table 3.2.
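As a check on how the AIC and BIC columns relate to RSS and k, the sketch below recomputes three rows. Two assumptions are inferred from the reported numbers rather than stated in the source: T = 26 annual observations, and σ̂² = RSS/T (no degrees-of-freedom correction). Under those assumptions the AR(1), AR(2) and MA(1) entries are reproduced to two decimals.

```python
import math

T = 26  # assumed number of observations (1955-1980), inferred from the table

def aic_bic(rss, k, n_obs=T):
    """AIC and BIC in the form T ln(sigma2_hat) + k f(T)."""
    sigma2 = rss / n_obs  # assumption: no degrees-of-freedom correction
    base = n_obs * math.log(sigma2)
    return base + 2 * k, base + k * math.log(n_obs)

# (RSS, k) pairs taken from the table rows
rows = {"AR(1)": (0.7951, 1), "AR(2)": (0.6715, 2), "MA(1)": (0.7495, 1)}
results = {name: aic_bic(rss, k) for name, (rss, k) in rows.items()}
for name, (aic, bic) in results.items():
    print(f"{name}: AIC = {aic:.2f}, BIC = {bic:.2f}")
```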


Box-Jenkins Methodology: ARIMA Modelling: Examples

[Figure: series Y_t with its ACF and PACF]


EQ( 1) Modelling y_t by OLS (using Data1)
The estimation sample is: 2 to 200

              Coefficient   Std.Error   t-value   t-prob
y_{t-1}          0.605620     0.05675      10.7    0.000
Constant        0.0120266     0.06819     0.176    0.860

sigma                0.961773   RSS                  182.226355
R^2                   0.36634   F(1,197) = 113.9 [0.000]**
log-likelihood       -273.607   DW                         1.93
no. of observations       199   no. of parameters             2

AR 1-2 test:      F(2,195) = 1.4581 [0.2352]
ARCH 1-1 test:    F(1,195) = 0.70516 [0.4021]
Normality test:   Chi^2(2) = 0.70709 [0.7022]
hetero test:      F(2,194) = 0.59821 [0.5508]
hetero-X test:    F(2,194) = 0.59821 [0.5508]
RESET test:       F(1,196) = 2.2481 [0.1354]


[Figure: ACF and PACF of the residuals]


EQ( 2) Modelling y_t by OLS (using Data1)
The estimation sample is: 3 to 200

              Coefficient   Std.Error   t-value   t-prob
y_{t-1}          0.639012     0.07156      8.93    0.000
y_{t-2}        -0.0551546     0.07156    -0.771    0.442
Constant        0.0125556     0.06861     0.183    0.855

sigma                0.965223   RSS                  181.672753
R^2                  0.368264   F(2,195) = 56.84 [0.000]**
log-likelihood        -272.43   DW                         2.01
no. of observations       198   no. of parameters             3

AR 1-2 test:      F(2,193) = 1.1629 [0.3148]
ARCH 1-1 test:    F(1,193) = 0.46776 [0.4948]
Normality test:   Chi^2(2) = 1.2041 [0.5477]
hetero test:      F(4,190) = 2.3548 [0.0553]
hetero-X test:    F(5,189) = 2.0145 [0.0784]
RESET test:       F(1,194) = 2.5648 [0.1109]
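Read as a trial overfit of the AR(1) in EQ( 1), the second equation adds one lag, and the check is whether that extra coefficient is significant. The sketch below recomputes the t-ratio from the reported coefficient and standard error; using 1.96 as an approximate two-sided 5% critical value is an assumption made for illustration.

```python
def t_ratio(coefficient, std_error):
    """t-statistic for H0: coefficient = 0."""
    return coefficient / std_error

# Values reported for the second lag in EQ( 2)
t_lag2 = t_ratio(-0.0551546, 0.07156)
significant = abs(t_lag2) > 1.96  # approximate two-sided 5% normal critical value
print(round(t_lag2, 3), significant)
```

An insignificant extra lag, as here, is evidence in favour of keeping the more parsimonious AR(1).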



[Figure: series X_t with its ACF and PACF]


EQ(13) Modelling X_t by RALS (using Data1)
The estimation sample is: 3 to 200

              Coefficient   Std.Error   t-value   t-prob
Constant       0.00838088     0.05187     0.162    0.872
Uhat_1          -0.326969     0.06741     -4.85    0.000

sigma                 0.96844   RSS                  183.823765
no. of observations       198   no. of parameters             2
mean(X_t)          0.00750231   var(X_t)                1.03984

Roots of error polynomial:
     real      imag    modulus
  0.32697   0.00000    0.32697

ARCH 1-1 test:    F(1,194) = 0.98408 [0.3224]
Normality test:   Chi^2(2) = 0.57691 [0.7494]


[Figure: ACF and PACF of the residuals]

[Figure: series Y_t with its ACF and PACF]


EQ(10) Modelling y_t by RALS (using Data1)
The estimation sample is: 3 to 200

              Coefficient   Std.Error   t-value   t-prob
y_{t-1}          0.544175     0.08824      6.17    0.000
Constant        0.0103062     0.05609     0.184    0.854
Uhat_1          -0.218511      0.1026     -2.13    0.034

sigma                0.961309   RSS                  180.202435
no. of observations       198   no. of parameters             3

Roots of error polynomial:
     real      imag    modulus
  0.21851   0.00000    0.21851

ARCH 1-1 test:    F(1,193) = 0.84109 [0.3602]
Normality test:   Chi^2(2) = 0.90110 [0.6373]
hetero test:      F(2,193) = 1.7595 [0.1749]
hetero-X test:    F(2,193) = 1.7595 [0.1749]


[Figure: ACF and PACF of the residuals]
