Time Series Econometrics
Stationary Time Series Models
(We’ll see non-stationary models later in the course)
Univariate Time Series Analysis
ARIMA Models
Stationary Series
Review from earlier lectures:
A series is covariance stationary when
Mean: E(Yt) = u
Variance: Var(Yt) = E(Yt – u)² = σ²
Covariance: Cov(Yt, Yt-k) = constant for all t and each k ≠ 0.
Shocks to a stationary series dissipate over time and the long-term forecast of the series converges to its unconditional mean. The series is said to exhibit mean reversion.
A stationary series also:
• has a finite variance that is time invariant
• has a theoretical covariance between values of yt that depends only on how far apart in time they are
A White Noise Process
A white noise process is one with (virtually) no discernible structure. A definition of a white noise process is
Mean: E(Yt) = u
Variance: Var(Yt) = E(Yt – u)² = σ²
Covariance: Cov(Yt, Yt-k) = 0 for all t and k ≠ 0.
ARIMA Models
ARMA models Introduced by Box and Jenkins (1976)
An approach for analysing stationary time series data; if the series is I(1), the approach can be applied to the first difference of the data.
AR = Autoregressive
I = Integrated
MA = Moving Average
A first order autoregressive process AR(1) can be expressed as yt = α0 + α1yt-1 + et
et = white noise error term
Similarly an AR(2) process can be expressed as yt = α0 + α1yt-1 + α2yt-2 + et
In general an AR(p) process is of the form yt = α0 + α1yt-1 + α2yt-2 + … + αpyt-p + et
yt is then said to follow an AR(p) process
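For those following along in Python, here is a minimal numpy sketch of simulating an AR(p) process (the function name and defaults are my own, not from the lecture):

```python
import numpy as np

def simulate_ar(alphas, a0=0.0, n=500, sigma=1.0, seed=0):
    """Simulate y_t = a0 + alphas[0]*y_{t-1} + ... + alphas[p-1]*y_{t-p} + e_t."""
    rng = np.random.default_rng(seed)
    p = len(alphas)
    y = np.zeros(n + p)                  # first p entries are zero start-up values
    e = rng.normal(0.0, sigma, n + p)    # white noise error term
    for t in range(p, n + p):
        y[t] = a0 + sum(a * y[t - i - 1] for i, a in enumerate(alphas)) + e[t]
    return y[p:]                         # drop the start-up values

# AR(1) with alpha1 = 0.7; long-run mean should be near a0/(1 - alpha1) = 1/0.3
y = simulate_ar([0.7], a0=1.0, n=5000)
```

A long simulated series should hover around the unconditional mean α0/(1 − α1), which previews the mean-reversion result proved later in these notes.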
Autoregressive Processes
Moving Average Process
MA(1): yt = α + β0et + β1et-1
et is a white noise (stationary process) and yt is a moving average of this random process
Maybe something like the change in a stock price in hourly data. Change should be mostly random but the previous shock may be taking time to dissipate fully (Not in a fully efficient market….)
MA(2): yt = α + β0et + β1et-1 + β2et-2
MA(q): yt = α + β0et + β1et-1 + … + βqet-q
An MA(q) process is stationary, as it is a linear combination of stationary variables; however, an MA(q) process is not necessarily a white noise process.
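An MA(q) series can be simulated the same way; a small numpy sketch (my own function, with β0 included as the first element of `betas` to match the notation above):

```python
import numpy as np

def simulate_ma(betas, alpha=0.0, n=500, sigma=1.0, seed=0):
    """Simulate y_t = alpha + betas[0]*e_t + betas[1]*e_{t-1} + ... + betas[q]*e_{t-q}."""
    rng = np.random.default_rng(seed)
    q = len(betas) - 1
    e = rng.normal(0.0, sigma, n + q)    # white noise shocks
    y = np.array([alpha + sum(b * e[t - j] for j, b in enumerate(betas))
                  for t in range(q, n + q)])
    return y

# MA(1) with beta0 = 1, beta1 = 0.5: Var(y) = sigma^2*(1 + 0.5^2) = 1.25
y = simulate_ma([1.0, 0.5], n=20000)
```

Because yt is a finite linear combination of white-noise shocks, its mean and variance are constant by construction, matching the stationarity claim above.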
Generally our series may be composed of both AR and MA components [when choosing our models we will include both and then try to reduce toward a succinct model].
For a series that exhibits both AR and MA characteristics we can combine the AR(p) and MA(q) models, to obtain an ARMA(1,1) model for example
yt = w + α1yt-1 + β0et + β1et-1
or more generally an ARMA(p,q) model:
yt = w + α1yt-1 + α2yt-2 + … + αpyt-p + β0et + β1et-1 + … + βqet-q
Autoregressive Moving Average Process (ARMA)
We can use backwards substitution to transform an AR(1) process into an MA(∞) process [or MA(∞) to AR(1)]
An AR process as an MA(∞) process
For an AR(1) process with no constant let yt = α1yt-1 + et then,
y1 = α1y0 + e1 and, y2 = α1y1 + e2
= α1(α1y0 + e1) + e2
= α1²y0 + α1e1 + e2
By continued substitution:
yt = α1^t y0 + Σ(i=0 to t-1) α1^i et-i
As t→∞ and |α1| < 1, the α1^t y0 term vanishes, so
yt = Σ(i=0 to ∞) α1^i et-i
which is an MA(∞) process.
Note: |α1| < 1 since we said at the start the series is stationary [This isn’t true for non-stationary series!!!]
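The backward substitution can be checked numerically: build yt by the AR(1) recursion, then rebuild the final value from the MA weights on the past shocks (a quick numpy sketch, values chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha1, n = 0.8, 200
e = rng.normal(size=n)

# AR(1) recursion with no constant, starting from y_0 = e_0
y = np.zeros(n)
y[0] = e[0]
for t in range(1, n):
    y[t] = alpha1 * y[t - 1] + e[t]

# MA representation of the same value: y_t = sum_{i=0..t} alpha1^i * e_{t-i}
T = n - 1
ma_sum = sum(alpha1 ** i * e[T - i] for i in range(T + 1))
# With |alpha1| < 1 the weights alpha1^i die out, which is why the
# infinite-order MA representation converges for a stationary series.
```

The two constructions agree exactly in finite samples; for large t the α1^t y0 term is what becomes negligible when |α1| < 1.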
Proof that an AR(1) process, yt = α0 + α1yt-1 + et, is stationary if │α1│<1
{Lots of maths – try to follow the logic though! (On Monday we showed that a series with a unit root was non-stationary)}
Requirement 1: Mean must be constant
Stationarity Conditions for an AR(1) process
Let yt = α0 + α1yt-1 + et. Given the initial value y0, it follows that y1 will be given by
y1 = α0 + α1y0 + e1
Then: y2 = α0 + α1y1 + e2
= α0 + α1(α0 + α1y0 + e1) + e2
= α0 + α1α0 + α1²y0 + α1e1 + e2
y3 = α0 + α1y2 + e3
= α0 + α1(α0 + α1α0 + α1²y0 + α1e1 + e2) + e3
= α0 + α1α0 + α1²α0 + α1³y0 + α1²e1 + α1e2 + e3
= [α0 + α1α0 + α1²α0] + [α1³y0] + [α1²e1 + α1e2 + e3]
Continuing the substitution we can write
yt = α0 Σ(i=0 to t-1) α1^i + α1^t y0 + Σ(i=0 to t-1) α1^i et-i
i.e. a solution to the AR(1) process in terms of its initial condition, y0.
(Could use β instead of α…..)
Stationarity Conditions for an AR(1) process
yt = α0 Σ(i=0 to t-1) α1^i + α1^t y0 + Σ(i=0 to t-1) α1^i et-i
Consider the mean of the AR(1) process, if |α1| < 1 and allowing t→∞.
Note: Σ(i=0 to ∞) α1^i = 1/(1 – α1), and the α1^t y0 term goes to zero.
Then:
yt = α0/(1 – α1) + Σ(i=0 to ∞) α1^i et-i
Taking expectations, for sufficiently large values of t the second term goes to zero (each et has mean zero):
E(yt) = α0/(1 – α1)
The mean of yt is finite and time independent.
Recall: This was the first requirement for stationarity.
Requirement 2: Variance must be constant / non-time dependent
Stationarity Conditions for an AR(1) process
Now we want to check that the variance is not time dependent.
Variance
Allow u = α0/(1 – α1) [i.e. u is the mean of yt as shown earlier]
Var(yt) = E(yt – u)²
Set α0 = 0 (for simplicity) => u = 0!
Then: Var(yt) = E(yt – 0)² = E[(yt)(yt)]
Recall: AR(1) = MA(∞), so yt = et + α1et-1 + α1²et-2 + …
Var(yt) = E[(et + α1et-1 + α1²et-2 + …)(et + α1et-1 + α1²et-2 + …)]
Multiplying this out:
Var(yt) = E(et² + α1²et-1² + α1⁴et-2² + … + cross products)
E(cross products) = 0, since white noise errors are uncorrelated across time.
Left with:
Var(yt) = E(et²) + E(α1²et-1²) + E(α1⁴et-2²) + …
Note: E(ei²) = σ²
Var(yt) = σ² + α1²σ² + α1⁴σ² + … = σ²(1 + α1² + α1⁴ + …)
Since |α1| < 1, Var(yt) [or γ0 for short!] can be written as
Var(yt) = σ²/(1 – α1²)
Hence Var(yt) is independent of time.
Recall this was our second requirement for stationarity.
Requirement 3: Covariance between observations k periods apart must be the same
Stationarity Conditions for an AR(1) process
Now to show the covariance is independent of the time period.
Covariance
γs = Cov(yt, yt-s) = E[(yt – u)(yt-s – u)]
Again set α0 = 0 (for simplicity), so u = 0:
Cov(yt, yt-s) = E[(et + α1et-1 + α1²et-2 + …)(et-s + α1et-s-1 + α1²et-s-2 + …)]
= E(α1^s et-s² + α1^(s+2) et-s-1² + …)
= α1^s σ² + α1^(s+2) σ² + …
= σ² α1^s (1 + α1² + α1⁴ + …)
= σ² α1^s/(1 – α1²)
Hence the covariance γs = Cov(yt, yt-s) is time independent.
Recall this was our last requirement for stationarity.
Overall we have proved that for an AR(1) process yt = α0 + α1yt-1 + et
as t→∞ and │α1│<1 yt is stationary
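The three results can be checked by simulation: for a long AR(1) series the sample mean and variance should settle at the theoretical values just derived (a minimal numpy sketch, with parameter values chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
a0, a1, sigma, n = 1.0, 0.5, 1.0, 100000
e = rng.normal(0.0, sigma, n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = a0 + a1 * y[t - 1] + e[t]   # AR(1) recursion

ys = y[1000:]                          # drop start-up so y0 = 0 doesn't matter
# Theory: mean = a0/(1 - a1) = 2.0, variance = sigma^2/(1 - a1^2) = 1.333...
```

Dropping the burn-in mirrors the proof: the α1^t y0 term dies out, after which the moments are those of the stationary distribution.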
ARIMA Models
(Can think of these as ARMA models for non-stationary data)
An Autoregressive Integrated Moving Average (ARIMA) Process
Using an ARMA(p,q) model requires that the series is stationary. If the series is not stationary, researchers typically difference the variable as necessary and then build an ARMA model on the differenced variable.
An ARMA(p,q) model in the variable differenced d times is equivalent to an ARIMA(p,d,q) model on the original data.
[In other words if the model is ARIMA(2,1,2) it means you difference the series once and then use an ARMA(2,2) Model!]
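The differencing step is one line in numpy; the sketch below shows that differencing a random walk (an I(1) series) recovers the stationary shocks:

```python
import numpy as np

rng = np.random.default_rng(0)
e = rng.normal(size=5000)
y = np.cumsum(e)      # random walk y_t = y_{t-1} + e_t: an I(1) series

dy = np.diff(y)       # first difference: recovers the stationary shocks
# An ARIMA(p,1,q) model for y is just an ARMA(p,q) model fitted to dy.
```

Note that differencing costs one observation per pass: dy has one fewer element than y.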
The Box-Jenkins Approach to building an ARIMA Model(same for ARMA)
Box and Jenkins (1976) were the first to approach the task of estimating an ARMA model in a systematic manner. There are 3 steps to their approach:1. Identification2. Estimation3. Model diagnostic checking
Building ARMA Models - The Box Jenkins Approach
Step 1: Identification
Involves determining the order of the model. The Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) can be used to identify the most appropriate ARIMA specification.
The Autocovariance
Cov(yt, yt-k) is known as γk
Cov(yt, yt-k) = E[(yt – u)(yt-k – u)]
= E(yt yt-k) when u = 0
The Autocorrelation = Cor(yt,yt-k) is known as ρk
ρk = γk/ γ0 = Covariance / Variance [remember γ0 = Var(yt)]
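The sample version of ρk is straightforward to compute; a minimal numpy sketch (my own helper, not a lecture function):

```python
import numpy as np

def acf(y, k):
    """Sample autocorrelation rho_k = gamma_k / gamma_0 at lag k."""
    d = np.asarray(y, dtype=float) - np.mean(y)
    # gamma_k is estimated from products of deviations k periods apart;
    # the common 1/n factors cancel in the ratio
    return np.sum(d[k:] * d[:len(d) - k]) / np.sum(d ** 2)
```

By construction acf(y, 0) = 1, matching ρ0 = γ0/γ0 = 1.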
Step 1: Identification
The Partial Correlation Coefficient
Measures the correlation between an observation k periods ago and the current observation, after controlling for observations at intermediate lags (i.e. all lags < k).
So the partial autocorrelation φkk measures the correlation between yt and yt-k after removing the effects of yt-k+1, yt-k+2, …, yt-1.
At lag 1, the ACF = PACF. Plotting the ACF (ρk) against k and the PACF (φkk) against k can help to reveal the appropriate ARIMA specification for the data series.
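One way to see what "controlling for intermediate lags" means: φkk is the coefficient on yt-k in an OLS regression of yt on its first k lags. A numpy sketch of that regression approach (my own helper):

```python
import numpy as np

def pacf_k(y, k):
    """Partial autocorrelation at lag k: the coefficient on y_{t-k} in an OLS
    regression of y_t on a constant and y_{t-1}, ..., y_{t-k}."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Column j holds y_{t-j} for t = k..n-1
    X = np.column_stack([np.ones(n - k)] +
                        [y[k - j:n - j] for j in range(1, k + 1)])
    beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return beta[-1]
```

For an AR(1) series, pacf_k at lag 1 estimates α1 while higher lags come out near zero, which is exactly the cutoff pattern described below.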
The ACF of an AR(1) process
We know from earlier:
γ0 = Var(yt) = σ²/(1 – α1²)
γs = Cov(yt, yt-s) = σ² α1^s/(1 – α1²)
Hence
ρ0 = γ0/γ0 = 1
ρ1 = γ1/γ0 = α1
ρ2 = γ2/γ0 = α1²
So in general ρs = γs/γ0 = α1^s
Recall that for yt to be stationary requires that |α1| < 1.
If 0 < α1 < 1 the ACF decays exponentially.
If -1 < α1 < 0 the ACF shows oscillating decay.
The PACF of an AR(1) and MA(q) process
The PACF is useful for telling the difference between an AR process and an ARMA process.
In the case of an AR(p), there are direct connections between yt and yt-k only for k ≤ p.
For an AR(1), the PACF φkk = 0 for k > 1. For an AR(p), the PACF φkk = 0 for k > p. So for an AR(p), the theoretical PACF will be zero after lag p.
In the case of an MA(q), the process can be written as an AR(∞), so there are direct connections between yt and all its previous values.
For an MA(q), the theoretical PACF will be geometrically declining.
The ACF of an MA(1) process
It can easily be shown that the ACF of an MA(q) process is 0 for k > q; for an MA(1) the ACF cuts off after lag 1.
In summary: The ACF and PACF for AR, MA and ARMA processes
An autoregressive process has
• a geometrically decaying ACF
• number of spikes of PACF = AR order
A moving average process has
• number of spikes of ACF = MA order
• a geometrically decaying PACF
An ARMA (combination) process has
• a geometrically decaying ACF
• a geometrically decaying PACF
See A&H page 242, Table 13.1
Examples: ACF and PACF plots for AR processes
ACF and PACF for a Non-stationary Model (i.e. a unit coefficient): yt = yt-1 + ut
ACF: Autocorrelations decline towards 0 only very slowly as the number of lags increases.
PACF: The first partial correlation is high but all the rest are not significantly different from 0.
ACF and PACF for a slowly decaying AR(1) Model: yt = 0.9yt-1 + ut
Because the coefficient (0.9) is close to 1, this is difficult to distinguish from the unit root process.
ACF: Autocorrelations decline towards 0 more quickly than in the case of a unit root.
PACF: The first partial correlation is high but all the rest are not significantly different from 0.
ACF and PACF for a more rapidly decaying AR(1) Model: yt = 0.6yt-1 + ut
ACF: Autocorrelations decline towards 0 very quickly (within 4 periods!). The lower the coefficient, the quicker the ACF reaches 0.
PACF: The first partial correlation is high but all the rest are not significantly different from 0.
ACF and PACF with a negative coefficient AR(1) Model: yt = -0.9yt-1 + ut
ACF: Autocorrelations decline towards 0 but alternate between positive and negative values.
PACF: The first partial correlation is high but all the rest are not significantly different from 0.
ACF and PACF for an AR(2) Model: yt = 0.5yt-1 + 0.25yt-2 + ut
ACF: Autocorrelations decline towards 0.
PACF: The first and second partial correlations are different from 0; all the rest are not significantly different from 0.
Examples: ACF and PACF plots for MA processes
ACF and PACF for an MA(1) Model: yt = 0.9ut-1 + ut
ACF is significant for one lag only.
PACF alternates between positive and negative values but falls to 0 as the lag increases.
(Recall: For the AR process it was the ACF that persisted and the PACF that was 0 after 1 lag!)
ACF and PACF for an MA(1) Model: yt = -0.9ut-1 + ut
ACF is significant for one lag only.
PACF alternates between positive and negative values but falls to 0 as the lag increases.
(Recall: For the AR process it was the ACF that persisted and the PACF that was 0 after 1 lag!)
ACF and PACF for an MA(2) Model: yt = 0.5ut-1 - 0.25ut-2 + ut
ACF is now significant for two periods. The first is positive because the coefficient on ut-1 is positive; the second is negative because the sign on ut-2 is negative.
PACF alternates between positive and negative values but falls to 0 as the lag increases.
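The MA cutoff in the ACF is easy to reproduce by simulation (a numpy sketch using the MA(1) from the earlier example slide; the acf helper is my own):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20000
u = rng.normal(size=n + 1)
y = u[1:] + 0.9 * u[:-1]            # MA(1): y_t = u_t + 0.9*u_{t-1}

def acf(x, k):
    """Sample autocorrelation at lag k."""
    d = x - x.mean()
    return np.sum(d[k:] * d[:len(x) - k]) / np.sum(d ** 2)

# Theory: rho_1 = 0.9/(1 + 0.9^2) ≈ 0.497 and rho_k = 0 for k >= 2,
# so the sample ACF should cut off sharply after lag 1.
```

Contrast this with the AR(1) plots above, where the ACF dies out geometrically rather than cutting off.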
Examples: ACF and PACF plots for an ARMA process [i.e. has AR terms and MA terms!]
ACF and PACF for an ARMA(1,1): yt = 0.5yt-1 + 0.5ut-1 + ut
Distinguishing the process from the correlogram is not as straightforward here!
The fact that the ACF is different from 0 for a few periods suggests an AR element.
The fact that the PACF is different from 0 for a few periods suggests an MA element.
Recall from Earlier: Box-Pierce Test [we also saw the Ljung-Box Q test]
The Box-Pierce statistic tests the joint hypothesis that all ρk up to lag m are simultaneously equal to zero. The test statistic is approximately distributed as a χ² distribution with m df:
B.P. = n Σ(k=1 to m) ρk²
n = sample size
m = lag length
If B.P. > χ²m(α) then reject H0: ρ1 = ρ2 = … = ρm = 0
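The statistic is a one-liner once the sample autocorrelations are in hand; a numpy sketch (my own helper function):

```python
import numpy as np

def box_pierce(y, m):
    """Box-Pierce statistic B.P. = n * sum_{k=1..m} rho_k^2, where rho_k is the
    sample autocorrelation at lag k; approx. chi-squared(m) under H0."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = y - y.mean()
    c0 = np.sum(d ** 2)
    rho = np.array([np.sum(d[k:] * d[:n - k]) / c0 for k in range(1, m + 1)])
    return n * np.sum(rho ** 2)

# Compare against the chi-squared(m) critical value,
# e.g. 18.31 for m = 10 at the 5% level.
```

White noise should give a small value; an autocorrelated series (say an AR(1)) should give a very large one and lead to rejection of H0.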
Step 2: Estimation of the parameters
Since the model is stationary we should be able to use OLS (provided there are no other issues).
Step 3: Model checking
Box and Jenkins suggest 2 methods:
• Deliberate over-fitting: if we think it’s ARMA(1,2), try some others like ARMA(2,3), ARMA(2,2) etc.
• Residual diagnostics
Building ARMA Models - The Box Jenkins Approach (cont’d)
• Identification would typically not be done using the ACF and PACF. Rather they are used to identify starting points.
• We want to form a parsimonious model.
• This gives motivation for using information criteria, which embody 2 factors:
• a term which is a function of the RSS
• some penalty for adding extra parameters
• The object is to choose the number of parameters which minimises the information criterion. Use the AIC and SBC.
• To properly compare the AIC and SBC across models, they should be based on model estimations using the same number of observations.
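The two-factor structure (an RSS term plus a parameter penalty) is visible in one common textbook form of the criteria; a small sketch (exact definitions vary across textbooks and software, so treat this as illustrative):

```python
import numpy as np

def aic(rss, T, k):
    # Akaike criterion, RSS form: fit term plus penalty 2k (k = no. of parameters)
    return T * np.log(rss / T) + 2 * k

def sbc(rss, T, k):
    # Schwarz Bayesian criterion: same fit term, heavier penalty k*ln(T)
    return T * np.log(rss / T) + k * np.log(T)
```

Since ln(T) > 2 once T > 7, the SBC penalises extra parameters more heavily than the AIC and so tends to select the more parsimonious model.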
Some More Recent Developments in ARMA Modelling
Diagnostic Checking
Check the residuals for serial correlation using the LM test described earlier in the course
Test the residuals for normality. More on this later!
That’s all for today! Some more advanced info on ARIMA is available at: http://www.stats.uwo.ca/faculty/aim/vita/pdf/Advances1.pdf (but not necessary for this course).
As usual there are some questions on Blackboard for next week. We’re halfway through the course now and things will be speeding up, so if you haven’t kept up to date with the questions it will become tougher.
So far only ½ the class have sent me any….