CHAPTER 3
STATIONARY LINEAR TIME SERIES PROCESSES
© Richard T Baillie, November, 2003
Introduction
This chapter is concerned with the theory of stationary, univariate time series observed at
discrete intervals of time. An understanding of these basic models is necessary to appreciate
more complicated dynamic econometric models that will be developed later. The time domain
population characteristics are derived for various univariate time series processes, including the
autocovariance, the autocorrelation function and the Wold decomposition. The emphasis here is
to provide the basic concepts and techniques, which are essential for all the models to be
developed later. There will inevitably be quite a lot of algebraic methods and tricks that are
worth knowing for the future.
A time series process $y_t$ is assumed to measure a quantitative variable associated with an underlying ordered sequence of points in time.
Hence this theory is designed to deal with economic and financial series such as quarterly GNP,
monthly inflation, weekly money supplies, daily exchange rates, hourly IBM stock prices, etc.
Clearly, the methodology and issues are relevant to any science where the analysis of time series
data is important; and many if not most subjects use such data.
Stationarity
A time series is considered to be generated as a realization of the stochastic process $\{y(t) : t = 0, \pm 1, \pm 2, \ldots\}$. The time series process is said to be Strongly Stationary, or Strictly Stationary, if the joint distribution of the set of random variables $y_{t_1}, y_{t_2}, \ldots, y_{t_s}$ is the same as the joint distribution of the set $y_{t_1+k}, y_{t_2+k}, \ldots, y_{t_s+k}$ for all $s$-tuples $(t_1, t_2, \ldots, t_s)$ and all integers $k$. Hence the joint distribution of the process is independent of time, and only depends on the intervals between the time points and not on the location of the time points relative to the time origin.
If the joint distribution of $y_{t_1}, y_{t_2}, \ldots, y_{t_s}$ is multivariate Normal, then $y_t$ is said to be a Gaussian process and is fully defined by its mean vector and covariance matrix. In this case, if $E(y_t) = \mu$ is constant and if $\mathrm{Cov}(y_t, y_{t-k})$ is only a function of the lag between the observations, i.e. $k$, then the process can be shown to be strictly stationary.
Although of considerable theoretical interest, strict stationarity is of limited practical use, since it is not readily testable from an empirical realization of a time series. For this reason, other ways of describing time series have to be considered.
In particular, a time series process is said to be Weakly Stationary, or Covariance Stationary, if: (i) the mean, $E(y_t)$, is constant; (ii) the variance, $\mathrm{Var}(y_t)$, is constant; and (iii) the autocovariance function $\mathrm{Cov}(y_t, y_{t-k})$ is only a function of the lag $k$ and is independent of time. In this case the autocovariance function is expressed as,

(1) $\gamma_k = \mathrm{Cov}(y_t, y_{t-k}), \quad k = 0, 1, 2, \ldots$

(2) $\gamma_k = E(y_t y_{t-k}) - E(y_t)E(y_{t-k}),$

and turns out to be an extremely important way to characterize the behavior of a theoretical time series process. The empirical counterpart from a sample realization is equally important for analyzing empirical data.
The Autocovariance Function, $\gamma_k$, is equivalent to the usual covariance operator, and measures the degree of association between the process at time $t$ and at $k$ lags in the past. Hence, the autocovariance function measures the time dependence, or internal memory structure, of the process.
Analogously to regular statistical work, the covariance function is more usefully replaced by the correlation coefficient, which is independent of the scale of measurement. Hence the Autocorrelation Function at lag $k$ is defined as,

(3) $\rho_k = \dfrac{\mathrm{Cov}(y_t, y_{t-k})}{\left[\mathrm{Var}(y_t)\,\mathrm{Var}(y_{t-k})\right]^{1/2}}, \qquad k = 0, 1, 2, \ldots$

where $\rho_0 = 1$ and $|\rho_k| \le 1$ for integer $k$. For a weakly stationary process with $\mathrm{Var}(y_t) = \mathrm{Var}(y_{t-k}) = \gamma_0$, then

(4) $\rho_k = \dfrac{\gamma_k}{\gamma_0}, \qquad k = 0, 1, 2, \ldots$

Also, for all real valued processes $y_t$, the autocovariance function is symmetric, so that $\gamma_k = \gamma_{-k}$ and $\rho_k = \rho_{-k}$. A graph of the autocorrelation function $\rho_k$ on the vertical axis against the lag $k$ on the horizontal axis is known as a Correlogram.
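Since the correlogram is used repeatedly below, a minimal sketch of its empirical counterpart may help; the function name, the simulated white noise input, and the lag choice are illustrative assumptions rather than anything from the text.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations r_k = c_k / c_0, where
    c_k = (1/T) sum_{t=k+1}^{T} (y_t - ybar)(y_{t-k} - ybar)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    c0 = np.dot(d, d) / T
    return np.array([np.dot(d[k:], d[:T - k]) / T / c0
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
y = rng.standard_normal(500)        # white noise, so r_k should be near 0 for k >= 1
print(np.round(sample_acf(y, 5), 3))
```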
The concept of weak stationarity implies that for any realization of a stationary time series process, different pieces or sections of the series will have an identical data generating process, and the numerical values of the sections will differ only because of different random innovations. An important corollary of stationarity is that

(5) $\lim_{k \to \infty} \rho_k = 0,$

so that the degree of dependency, or memory, of the time series process goes to zero as the lag between the observations increases. It should be noted that a series of i.i.d. Cauchy random variables is strongly but not weakly stationary; while a series of random variables with mean of zero, variance of unity, but different fourth moments is weakly but not strongly stationary. For most practical purposes the concept of weak stationarity is the definition that will be used.
Ergodicity:
A stationary process is ergodic if there is a tendency to independence as the lag between observations increases. Hence ergodicity implies that as $k \to \infty$, then $\mathrm{Cov}(y_t, y_{t-k}) \to 0$, so there is limited memory. The ergodic property is especially important for establishing consistency of sample estimates of their corresponding population quantities.
3. Linear Time Series Processes
A first building block of stationary time series processes is the important concept of White Noise. The simplest type of discrete time series process is described as white noise and is generally represented by the random variable $\varepsilon_t$. In all the models, the concept of white noise plays a crucial role in the generation of the process. White noise is also known as an innovation, shock or disturbance term of an underlying time series process. There are various definitions, which require different strengths of assumptions.
Weak Definition of White Noise: The variable $\varepsilon_t$ is defined to have a zero mean, a constant variance $\sigma^2$, and to be serially uncorrelated, so that

$E(\varepsilon_t) = 0,$

$\mathrm{Var}(\varepsilon_t) = \sigma^2,$

and

$\gamma_k = \mathrm{Cov}(\varepsilon_t, \varepsilon_{t-k}) = 0 \quad \text{for } k \ne 0.$

It is important to note that the weak assumption of white noise merely requires the $\varepsilon_t$ process to be zero mean, serially uncorrelated, and with constant unconditional variance; i.e., homoskedastic.
Martingale Definition of White Noise: The variable $\varepsilon_t$ is defined to be unpredictable in its conditional mean, so that

(6) $E(\varepsilon_t \mid \Omega_{t-1}) = E_{t-1}\,\varepsilon_t = 0,$

where $\Omega_{t-1}$ is a sigma field, or information set, available at time $t-1$. The martingale assumption implies a zero conditional mean and also that the innovations, or white noise process, is serially uncorrelated.
Strong Definition of White Noise: The distribution of the variable $\varepsilon_t$ is specified, so that

$\varepsilon_t \sim i.i.d.(0, \sigma^2),$

and sometimes the precise nature of the distribution is given, such as $\varepsilon_t \sim NID(0, \sigma^2)$. The white noise process acts as the "forcing term" in more complicated time series models, such as ARMA, to be considered later. White noise is also commonly referred to as a disturbance or error in regression analysis, and as an innovation, or random shock, in macroeconomics.
The Wold Decomposition
The famous Wold Decomposition first appeared in the book by Herman Wold (1954), "A Study in the Analysis of Stationary Time Series". Wold showed that any stationary time series process can be uniquely represented as the sum of two mutually uncorrelated processes $\xi_t$ and $\eta_t$, where $\xi_t$ is a moving average process of infinite order, and $\eta_t$ is a purely deterministic process, such as a sine wave with fixed period. Then,

(7) $y_t = \xi_t + \eta_t,$
where,

(8) $\xi_t = \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j} = \psi(L)\varepsilon_t,$

and $\psi(L) = \sum_{j=0}^{\infty} \psi_j L^j$ with $\psi_0 = 1$, $E(\varepsilon_t) = 0$, $E(\varepsilon_t^2) = \sigma^2$, $E(\varepsilon_t\varepsilon_s) = 0$ for $s \ne t$, and $\sum_{j=0}^{\infty} \psi_j^2 < \infty$, which is known as the squared summability condition.
If $y_t$ is a sequence of random variables such that $\sup_t E|y_t| < \infty$, and if $\sum_{j=-\infty}^{\infty} |\psi_j| < \infty$, then $y_t$ is said to converge absolutely with probability one. If $\sup_t E|y_t|^2 < \infty$, then $y_t$ is said to converge in mean square to the same limit. Brockwell and Davis (1988, pages 83-84) provide more detail on the probability aspects of convergence of the $y_t$ process.
In most econometric work it is usual to neglect the second term of equation (7) on the grounds that purely deterministic components such as fixed period harmonics are very unlikely to occur in economics and finance applications. However, their presence is sometimes regarded as necessary in applications in electrical engineering, oceanography and cardiac rhythm data, where regular cycles are expected to occur. In all the following material, it is assumed that a stationary time series process can be completely represented by the infinite moving average process. The Wold decomposition is therefore in terms of an Infinite Moving Average Representation, whose coefficients, in recent econometric terminology, particularly in macroeconomics, are named the Impulse Response Weights.
If a time series possesses an infinite order moving average representation, from the Wold Decomposition, then it will also have a corresponding Infinite Autoregressive Representation given by

(9) $\pi(L)\, y_t = \varepsilon_t,$

where $\pi(L) = 1 - \sum_{j=1}^{\infty} \pi_j L^j$. The representations $y_t = \psi(L)\varepsilon_t$ and $\pi(L)y_t = \varepsilon_t$ are mutual inverses since

(10) $\pi(L)\,\psi(L) \equiv 1.$
There are several useful algebraic tricks for getting from one representation to another.
Subsequent examples will illustrate some of the more widely used methods. It is important to
note that the Wold decomposition implies the use of either the infinite order moving average, or
the infinite autoregressive representations to describe any stationary time series process.
However, from a practical perspective a lot of time series analysis attempts to replace these
infinite order representations with finite parameter models, which describe some salient features
of the process. The simplest first order processes are:
Autoregressive of order one, denoted as AR(1),

$y_t = \phi y_{t-1} + \varepsilon_t.$

Moving Average of order one, denoted as MA(1),

$y_t = \varepsilon_t - \theta\varepsilon_{t-1},$

and the combined Autoregressive Moving Average, denoted as ARMA(1,1),

$y_t = \phi y_{t-1} + \varepsilon_t - \theta\varepsilon_{t-1}.$
Higher order processes, i.e. involving further lagged terms are discussed in a later section.
Clearly, autoregressive models are closely related to regressions where lagged dependent variables fulfill the role of explanatory variables. Models with moving average terms are like regressions with unobservables as explanatory variables. The most general linear time series process considered in this section is the ARMA(p,q) model,

(11) $y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1\varepsilon_{t-1} - \cdots - \theta_q\varepsilon_{t-q}.$

It turns out that quite low order ARMA models are capable of representing a wide variety of behavior of time series processes. The ARMA(p,q) process can be expressed in lag operator form as,

(12) $\phi(L)\, y_t = \theta(L)\,\varepsilon_t,$

where,

(13) $\phi(L) = (1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p)$ and $\theta(L) = (1 - \theta_1 L - \theta_2 L^2 - \cdots - \theta_q L^q).$
It is also frequently convenient to use infinite order processes, namely the infinite order moving average representation,

(14) $y_t = \mu + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}, \quad \text{with } \psi_0 = 1,$

and the infinite order autoregressive process,

(15) $y_t = \mu + \sum_{j=1}^{\infty} \pi_j y_{t-j} + \varepsilon_t.$

The infinite order moving average process is thus a linear combination of past uncorrelated innovations $\varepsilon_t$, and the weights $\psi_j$ represent the relative importance of previous shocks, or innovations. Both representations are linear in parameters and in terms of past $y_t$'s. Also, both representations depend on an infinite number of parameters, sometimes referred to as the $\psi$ weights and $\pi$ weights. In both the above representations the process is allowed to have a possibly non-zero mean $\mu$. In many cases $\mu$ is omitted on the understanding that the stationary time series process $y_t$ has been expressed in deviation form.
The Finite Moving Average Process: MA(q)
The model is based on the notion that the infinite moving average representation of a stationary time series $y_t$ can be truncated after $q$ lags. The model was originally proposed by Yule in the 1930's, well before the work of Wold in 1954 on the general decomposition of stationary time series into infinite moving averages and harmonic components. The general moving average process of order $q$, which is denoted by MA(q), is

(16) $y_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \cdots + \theta_q\varepsilon_{t-q} = \sum_{j=0}^{q}\theta_j\varepsilon_{t-j} = \theta(L)\varepsilon_t,$

where $\theta(L) = \sum_{j=0}^{q}\theta_j L^j$, $\theta_0 = 1$, the $\theta_j$ are real numbers and $\sum_{j=0}^{q}\theta_j^2 < \infty$. An important special case is the MA(1) process,

(17) $y_t = \varepsilon_t + \theta\varepsilon_{t-1}.$
The variance, or autocovariance at lag zero, is defined as,

$\gamma_0 = E(y_t^2) = E(\varepsilon_t + \theta\varepsilon_{t-1})^2 = E(\varepsilon_t^2 + 2\theta\varepsilon_t\varepsilon_{t-1} + \theta^2\varepsilon_{t-1}^2) = (1 + \theta^2)\sigma^2.$

Similarly,

$\gamma_1 = E(y_t y_{t-1}) = E[(\varepsilon_t + \theta\varepsilon_{t-1})(\varepsilon_{t-1} + \theta\varepsilon_{t-2})] = E(\varepsilon_t\varepsilon_{t-1} + \theta\varepsilon_{t-1}^2 + \theta\varepsilon_t\varepsilon_{t-2} + \theta^2\varepsilon_{t-1}\varepsilon_{t-2}) = \theta\sigma^2.$

Also,

$\gamma_2 = E(y_t y_{t-2}) = E[(\varepsilon_t + \theta\varepsilon_{t-1})(\varepsilon_{t-2} + \theta\varepsilon_{t-3})] = E(\varepsilon_t\varepsilon_{t-2} + \theta\varepsilon_{t-1}\varepsilon_{t-2} + \theta\varepsilon_t\varepsilon_{t-3} + \theta^2\varepsilon_{t-1}\varepsilon_{t-3}) = 0.$

Similarly, $\gamma_k = 0$ for $k > 2$. The autocorrelation function for the MA(1) process is then

(18) $\rho_0 = 1, \quad \rho_1 = \dfrac{\theta}{(1 + \theta^2)}, \quad \text{and} \quad \rho_k = 0 \ \text{for } k \ge 2.$
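As an informal check of (18), the following sketch simulates an MA(1) and compares the sample first lag autocorrelation with $\theta/(1+\theta^2)$; the parameter value, seed and sample size are illustrative assumptions.

```python
import numpy as np

theta, T = 0.5, 100_000             # illustrative parameter and sample size
rng = np.random.default_rng(1)
eps = rng.standard_normal(T + 1)
y = eps[1:] + theta * eps[:-1]      # y_t = eps_t + theta*eps_{t-1}

d = y - y.mean()
r1 = np.dot(d[1:], d[:-1]) / np.dot(d, d)
print(round(r1, 3), theta / (1 + theta**2))   # both close to 0.4
```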
Frequently the autocorrelation function is graphed against its lag $k$; such a graph is called a correlogram. For the MA(1) process it takes the form given in Figure 1. Consider now the general MA(q) process,

(19) $y_t = \sum_{j=0}^{q}\theta_j\varepsilon_{t-j},$

where $\theta_0 = 1$; and since $E(\varepsilon_t) = 0$, it follows that $E(y_t) = 0$. The variance of the MA(q) process is given by
(20) $\gamma_0 = E(y_t^2) = \sigma^2\sum_{j=0}^{q}\theta_j^2.$

More generally,

$\gamma_k = E(y_t y_{t-k}) = E\left[\left(\sum_{i=0}^{q}\theta_i\varepsilon_{t-i}\right)\left(\sum_{j=0}^{q}\theta_j\varepsilon_{t-k-j}\right)\right]$

$= E[(\theta_0\varepsilon_t + \theta_1\varepsilon_{t-1} + \cdots + \theta_k\varepsilon_{t-k} + \cdots + \theta_q\varepsilon_{t-q}) \times (\theta_0\varepsilon_{t-k} + \theta_1\varepsilon_{t-k-1} + \cdots + \theta_q\varepsilon_{t-k-q})]$

$= \sigma^2\left(\theta_0\theta_k + \theta_1\theta_{k+1} + \cdots + \theta_{q-k}\theta_q\right),$

so that

$\gamma_k = \sigma^2\sum_{i=0}^{q-k}\theta_i\theta_{i+k}.$

Then

(21) $\rho_k = \dfrac{\sum_{i=0}^{q-k}\theta_i\theta_{i+k}}{\sum_{i=0}^{q}\theta_i^2}, \quad k = 0, 1, \ldots, q,$

$\rho_k = 0, \quad k \ge q+1.$
The above illustrates an important property of the finite order MA(q) process; namely that its "memory" only lasts for q periods or lags. Hence the autocorrelation function is zero beyond lag q, so that the process is uncorrelated with its behavior q+1 or more periods ago. For example, in the MA(2) process,
(22) $y_t = \varepsilon_t + \left(\tfrac{1}{4}\right)\varepsilon_{t-1} - \left(\tfrac{1}{8}\right)\varepsilon_{t-2},$

it follows that: $\theta_0 = 1$, $\theta_1 = \tfrac{1}{4}$, and $\theta_2 = -\tfrac{1}{8}$. Then from the more general results,

$\gamma_0 = \left(1 + \tfrac{1}{16} + \tfrac{1}{64}\right)\sigma^2 = \left(\tfrac{69}{64}\right)\sigma^2,$

$\gamma_1 = \sum_{i=0}^{1}\theta_i\theta_{i+1}\,\sigma^2 = (\theta_0\theta_1 + \theta_1\theta_2)\sigma^2 = \left(\tfrac{1}{4} - \tfrac{1}{32}\right)\sigma^2 = \left(\tfrac{7}{32}\right)\sigma^2,$

$\gamma_2 = \theta_2\sigma^2 = -\tfrac{1}{8}\sigma^2,$

$\gamma_k = 0, \quad \text{for } k \ge 3.$

Hence, $\rho_1 = \left(\tfrac{7}{32}\right)\big/\left(\tfrac{69}{64}\right) = \tfrac{14}{69}$, $\rho_2 = -\tfrac{8}{69}$, and $\rho_3 = \rho_4 = \cdots = 0.$
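The MA(2) results above can be verified by simulation; this is only a sketch with an assumed seed and sample size.

```python
import numpy as np

t1, t2, T = 1/4, -1/8, 200_000
rng = np.random.default_rng(2)
e = rng.standard_normal(T + 2)
y = e[2:] + t1 * e[1:-1] + t2 * e[:-2]   # y_t = eps_t + (1/4)eps_{t-1} - (1/8)eps_{t-2}

d = y - y.mean()
c0 = np.dot(d, d)
for k in (1, 2, 3):
    print(k, round(np.dot(d[k:], d[:-k]) / c0, 3))
# approximately 14/69 = 0.203, -8/69 = -0.116, and 0
```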
Invertibility of MA Processes
The general finite order MA(q) moving average process given by (19) will always be stationary provided all the coefficients are finite and $\sum\theta_j^2 < \infty$. However, the issue of invertibility arises with moving average processes and is concerned with the identifiability of the MA process. The moving average process (19) is said to be invertible if all the roots of $\theta(L)$ lie outside the unit circle. In general, $2^q$ different MA(q) processes will be observationally equivalent in the sense that they will possess identical autocorrelation structures. For example, the MA(1) processes,

$y_t = \varepsilon_t + \theta\varepsilon_{t-1} = (1 + \theta L)\varepsilon_t$

and

$y_t = \varepsilon_t + \frac{1}{\theta}\,\varepsilon_{t-1} = \left(1 + \frac{1}{\theta}L\right)\varepsilon_t,$

both have the same autocorrelation structure of $\rho_1 = \theta/(1+\theta^2)$ and $\rho_j = 0$ for $j \ge 2$. Since
$\rho_1\theta^2 - \theta + \rho_1 = 0,$

it is clear that two values of $\theta$ can be obtained which are consistent with this equation; however only one will correspond to a $\theta(L)$ with its root outside the unit circle, and hence to the invertible process. If $\rho_1 = .4$, then

$\theta^2 - 2.5\theta + 1 = 0$

$(\theta - .5)(\theta - 2) = 0,$

so that $\theta = .5$ is the invertible solution, while $\theta = 2$ is the noninvertible solution. Both these MA(1) processes have the same autocovariance structure.
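The two observationally equivalent values of $\theta$ can also be recovered numerically by solving the quadratic $\rho_1\theta^2 - \theta + \rho_1 = 0$; a brief sketch, using np.roots simply as a convenient solver:

```python
import numpy as np

rho1 = 0.4
roots = np.roots([rho1, -1.0, rho1])     # rho1*theta^2 - theta + rho1 = 0
print(roots)                             # 2.0 and 0.5
print([t for t in roots if abs(t) < 1])  # invertible solution: theta = 0.5
```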
More generally, for the ARMA(p,q) model $\phi(L)y_t = \theta(L)\varepsilon_t$, the infinite autoregressive representation $\pi(L)y_t = \varepsilon_t$ has to converge in some sense for invertibility. The usual requirement is that $\sum_{j=0}^{\infty}|\pi_j| < \infty$, and a sufficient condition is for all the roots of $\theta(L)$ to lie outside the unit circle. This restriction is somewhat arbitrary but is very convenient for avoiding model multiplicity. The infinite order moving average representation $y_t = \psi(L)\varepsilon_t$ for stationary $y_t$ then gives rise to the infinite autoregressive representation from the relationship $\pi(L) = \psi(L)^{-1}$, so that $\lim_{j\to\infty}\pi_j = 0$. Hence there is the sensible requirement that the weight on observations a long time in the past declines as the lag length increases.
The Autoregressive Process of Order 1: AR(1)

Probably the most widely used model in time series and dynamic econometric work is the first order autoregression, which is very useful for describing many of the issues in dynamic models. The process, denoted by AR(1), is

(25) $y_t = \phi y_{t-1} + \varepsilon_t.$
By successive substitution, the process can be expressed as

$y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2 y_{t-2},$

and then

$y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \phi^3 y_{t-3},$

$\vdots$

$y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \cdots + \phi^k\varepsilon_{t-k} + \phi^{k+1} y_{t-k-1}.$
For stationarity it is necessary that $E(y_t^2)$ is constant. Now

$E\left(y_t - \sum_{j=0}^{k}\phi^j\varepsilon_{t-j}\right)^2 = \phi^{2(k+1)}\, E\left(y_{t-k-1}^2\right) \to 0$

as $k \to \infty$, in which case the process $y_t$ is said to be convergent in the Mean Square (MS) sense. A sufficient condition for this is that $|\phi| < 1$. Hence $\sum_{j=0}^{\infty}\phi^j\varepsilon_{t-j}$ is mean square convergent and $y_t = \sum_{j=0}^{\infty}\phi^j\varepsilon_{t-j}$ is valid in the mean square sense and with probability one. It is interesting to examine the properties of the process when $|\phi| > 1$ and consequently lies in the non-stationary region. In probability terms, the process does not converge in $L^2$. However, the AR(1) process can still be written as
$y_t = \frac{1}{\phi}\,y_{t+1} - \frac{1}{\phi}\,\varepsilon_{t+1}$

$= -\frac{1}{\phi}\,\varepsilon_{t+1} - \frac{1}{\phi^2}\,\varepsilon_{t+2} - \cdots - \frac{1}{\phi^{k+1}}\,\varepsilon_{t+k+1} + \frac{1}{\phi^{k+1}}\,y_{t+k+1}.$
However, this process does not appear to be sensible, since there does not seem any physical way $y_t$ can be influenced by future random innovations. Hence $\phi$ is restricted to lie in the interval $|\phi| < 1$. From the representation,

$y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \cdots + \phi^k\varepsilon_{t-k} + \phi^{k+1} y_{t-k-1},$

the process can be expressed as,

(26) $y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \cdots + \phi^{t-1}\varepsilon_1 + \phi^t y_0.$

The initial value can be assumed fixed at $y_0 = 0$, or the process can be assumed to have been generated for a very large number of periods, so that the effect of the initial observation is negligible. This shows that the infinite moving average representation, or impulse response weights, are given by,

(27) $y_t = \sum_{j=0}^{\infty}\phi^j\varepsilon_{t-j},$
so that the impact of a shock $j$ periods ago has coefficient $\phi^j$, and for a stationary process, $-1 < \phi < 1$, the impact of the shock is seen to die away at an exponential rate the further back in time is examined. It can also be seen that $E(y_t) = 0$, while the variance of $y_t$ can be found as,

$E(y_t^2) = E\left[\left(\varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \cdots + \phi^k\varepsilon_{t-k} + \cdots\right)^2\right] = \left(1 + \phi^2 + \phi^4 + \cdots + \phi^{2k} + \cdots\right)\sigma^2 = \frac{\sigma^2}{(1-\phi^2)}.$
The autocovariance function can be similarly found as,

$\gamma_k = E(y_t y_{t-k}) = E\{[\varepsilon_t + \phi\varepsilon_{t-1} + \cdots + \phi^k\varepsilon_{t-k} + \phi^{k+1}\varepsilon_{t-k-1} + \cdots] \times [\varepsilon_{t-k} + \phi\varepsilon_{t-k-1} + \phi^2\varepsilon_{t-k-2} + \cdots]\}$

$= \phi^k\left(1 + \phi^2 + \phi^4 + \cdots\right)\sigma^2 = \frac{\phi^k\sigma^2}{(1-\phi^2)}.$
The autocorrelation function is then

(28) $\rho_k = \frac{\gamma_k}{\gamma_0} = \phi^k,$

so that all the main properties of the process, i.e. the infinite order moving average representation weights, $\psi_k$, and the autocorrelations, decay at an exponential (geometric) rate.
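A short simulation can illustrate the geometric decay $\rho_k = \phi^k$; the value $\phi = 0.9$, the seed and the sample size are illustrative assumptions.

```python
import numpy as np

phi, T = 0.9, 100_000
rng = np.random.default_rng(3)
e = rng.standard_normal(T)
y = np.empty(T)
y[0] = e[0]
for t in range(1, T):
    y[t] = phi * y[t - 1] + e[t]    # y_t = phi*y_{t-1} + eps_t

d = y - y.mean()
c0 = np.dot(d, d)
for k in range(1, 6):
    # sample autocorrelation versus theoretical phi^k
    print(k, round(np.dot(d[k:], d[:-k]) / c0, 3), round(phi**k, 3))
```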
At this stage it is intuitively interesting to examine the case of $\phi = 1$; the so called unit root, or random walk, model for $y_t$. Clearly the impulse response weights in equation (27) will not decay, and all past innovations or shocks receive the same weight as the most recent one. It should also be obvious that the variance of the process is undefined when $\phi = 1$, which indicates the need for different types of investigation of this process. Figures 1 through 3 show sample realizations of 2,000 observations generated by AR(1) models with $\phi = 0.3$, $\phi = 0.6$ and $\phi = 0.9$. Clearly, as the value of the autoregressive parameter increases, the series takes on the appearance of having less random structure, and distinct patterns become evident, with one high observation likely to be followed by a further high observation. Similarly, a small observation is more likely to be followed by a further small observation. Figure 4 shows a similar realization from the AR(1) process with $\phi = 1$. This process is now a unit root process, $y_t = y_{t-1} + \varepsilon_t$, and clearly has quite different behavior to the stationary AR(1) model with $-1 < \phi < 1$. The series realization is marked by slow drifts in the mean of the series through different levels and is visually entirely different to the stationary AR model. The unit root process is non-stationary and will be discussed in more detail later.
The AR(p) Process
The autoregressive process of order p is denoted by AR(p) and represents the current value of the process as a linear combination of the last p lagged values of the process. The AR(p) process can be regarded as a p'th order linear difference equation with the addition of a stochastic disturbance, $\varepsilon_t$,

(29) $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t,$

or,

(30) $\phi(L)\, y_t = \varepsilon_t,$

where $\phi(L) = (1 - \phi_1 L - \cdots - \phi_p L^p)$ and $\varepsilon_t$ is white noise. When analyzing and using the AR(p) model, it is very important to know the following:
model, it is very important to know the following:
(i) under what conditions will $y_t$ be stationary?

(ii) what is the autocorrelation function of $y_t$?

(iii) will $y_t$ have a unique moving average representation?
All of the above issues can be answered by considering the solution of the difference equation $\phi(L)y_t = 0$. On taking the auxiliary equation $\phi(m) = 0$ and assuming that $\phi(m)$ has p distinct roots $\xi_1, \xi_2, \ldots, \xi_p$, then

(31) $\phi(m) = \prod_{i=1}^{p}\left(1 - \frac{m}{\xi_i}\right),$

so that the general solution for $y_t$ is given by

(32) $y_t = \sum_{i=1}^{p} c_i\left(\frac{1}{\xi_i}\right)^t,$
where the $c_i$ are unknown constants to be determined from initial boundary conditions. For $y_t$ to be stationary, any solution for $y_t$ must be stable and independent of $t$. Thus it is necessary that $|1/\xi_j| < 1$, and hence $|\xi_j| > 1$, for all $j$. With this condition it then follows that $\lim_{t\to\infty}(1/\xi_j)^t = 0$, which ensures that $y_t$ is stationary. The condition is generally expressed by saying that all the roots of $\phi(L)$ must lie outside the unit circle. The Particular Solution for $y_t$ is given by

(33) $y_t = \phi(L)^{-1}\varepsilon_t = \prod_{i=1}^{p}\left(1 - \frac{L}{\xi_i}\right)^{-1}\varepsilon_t.$
On expanding as partial fractions,

$y_t = \phi(L)^{-1}\varepsilon_t = \sum_{i=1}^{p} a_i\left(1 - \frac{L}{\xi_i}\right)^{-1}\varepsilon_t,$

and since each $|\xi_i| > 1$, it follows that each binomial expansion of $\left(1 - L/\xi_i\right)^{-1}$ will be valid and will converge. On collecting terms,

(34) $y_t = \phi(L)^{-1}\varepsilon_t = \sum_{i=1}^{p} a_i\sum_{j=0}^{\infty}\left(\frac{L}{\xi_i}\right)^j\varepsilon_t = \sum_{i=0}^{\infty}\psi_i L^i\,\varepsilon_t = \sum_{i=0}^{\infty}\psi_i\,\varepsilon_{t-i}.$
The full solution for $y_t$ is then obtained by adding the particular solution to the general solution to obtain,

(35) $y_t = \sum_{i=1}^{p} c_i\left(\frac{1}{\xi_i}\right)^t + \sum_{i=0}^{\infty}\psi_i\,\varepsilon_{t-i},$

and since all the roots of $\phi(L)$ lie outside the unit circle, the p exponential terms in (35) will vanish for large t, to leave the solution as the Infinite Moving Average Representation from the Wold Decomposition, $y_t = \sum_{i=0}^{\infty}\psi_i\,\varepsilon_{t-i}$. In general, the following three conditions are equivalent for the AR(p) process; each condition implies the other two:

(a) $y_t$ is stationary

(b) $y_t$ has a unique infinite order moving average representation

(c) all the roots of $\phi(L)$ lie outside the unit circle
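Condition (c) is easy to check numerically for any candidate AR(p); the sketch below does so for the AR(2) example used later in the chapter, and the helper name is an arbitrary choice.

```python
import numpy as np

def ar_roots(phis):
    """Roots of phi(L) = 1 - phi_1*L - ... - phi_p*L^p (coefficients highest power first)."""
    return np.roots([-p for p in reversed(phis)] + [1.0])

roots = ar_roots([1/4, 3/8])        # y_t = (1/4)y_{t-1} + (3/8)y_{t-2} + eps_t
print(roots)                        # 4/3 and -2
print(np.all(np.abs(roots) > 1))    # True, so the process is stationary
```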
Autocorrelation Function of the AR(p) Model
First, it is natural to impose the condition that

(36) $E(\varepsilon_t y_{t-k}) = 0 \quad \text{for } k \ge 1,$

since a future realization of the random white noise process $\varepsilon_t$ must be uncorrelated with past realizations of the process $y_t$. However,
(37) $E(\varepsilon_t y_t) = \phi_1 E(\varepsilon_t y_{t-1}) + \cdots + \phi_p E(\varepsilon_t y_{t-p}) + E(\varepsilon_t^2),$

so that $E(\varepsilon_t y_t) = \sigma^2$. On multiplying the process successively by $y_{t-1}, y_{t-2}, \ldots, y_{t-p}$ and on taking expectations, the following p equations are obtained,

$\gamma_1 = \phi_1\gamma_0 + \phi_2\gamma_1 + \cdots + \phi_p\gamma_{p-1},$

$\gamma_2 = \phi_1\gamma_1 + \phi_2\gamma_0 + \cdots + \phi_p\gamma_{p-2},$

$\vdots$

$\gamma_p = \phi_1\gamma_{p-1} + \phi_2\gamma_{p-2} + \cdots + \phi_p\gamma_0,$

which are known as the Yule-Walker Equations. On noting that $\gamma_j = \gamma_{-j}$, the general equation can be expressed as

(38) $\phi(L)\,\gamma_k = 0, \quad k = 1, 2, \ldots$

and on dividing by $\gamma_0$:

(39) $\phi(L)\,\rho_k = 0, \quad k = 1, 2, \ldots$
Since $y_t$ is stationary, all the roots of $\phi(L)$ lie outside the unit circle and $\rho_j$ satisfies the same difference equation as $y_t$. If all the $\xi_j$, which are the roots of $\phi(L)$, are real and distinct, then

(40) $\rho_k = \sum_{i=1}^{p} a_i\left(\frac{1}{\xi_i}\right)^k,$

where the $a_i$ are unknown constants to be determined by solving the first p Yule-Walker equations as boundary conditions. In this case $\rho_k$ is simply the sum of geometrically decaying terms; i.e. since $(1/\xi_i)^k \to 0$ as $k \to \infty$, the autocorrelations decay as the lag increases, so that the stationarity condition in equation (5) is satisfied.
One possibility is that a pair of roots $\xi_i$ and $\bar{\xi}_i$ may be complex conjugates, and will jointly contribute a term of the form,

(41) $c_i\, d^{\,j}\sin(2\pi f j + \varphi)$

to the autocorrelation function $\rho_j$. This term will be a damped harmonic, with $f$ as the frequency and $d$ as the damping factor. One further possibility, which is relatively unlikely, is that two roots are the same. This will contribute a term to $\rho_k$ of the form:

(42) $\left(a_1 + a_2 k\right)\left(\frac{1}{\xi_i}\right)^k.$
The theoretical autocorrelation functions of higher order AR(p) models, with a combination of real and complex conjugate roots in their autoregressive polynomial operators, will clearly be of a complicated form, combining geometric decay with damped harmonics.
Infinite Moving Average Representation of the AR(p) Model
For the stationary AR(p) model $\phi(L)y_t = \varepsilon_t$, there exists the unique infinite order moving average representation,

(43) $y_t = \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j} = \psi(L)\varepsilon_t.$

Since $y_t = \phi(L)^{-1}\varepsilon_t = \psi(L)\varepsilon_t$, it follows from equation (10) that $\phi(L)^{-1} = \psi(L)$, and therefore there is the following inverse relationship between the lag polynomials: $\phi(L)\psi(L) = 1$. The most direct way of finding the implied infinite moving average representation weights, given knowledge of the AR parameters, is to solve recursively by writing:

(44) $\left(1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p\right)\left(1 + \psi_1 L + \psi_2 L^2 + \cdots\right) \equiv 1.$
Then on equating powers of L:

$\psi_1 - \phi_1 = 0$ (coefficient of $L$)

$\psi_2 - \phi_1\psi_1 - \phi_2 = 0$ (coefficient of $L^2$)

$\psi_3 - \phi_1\psi_2 - \phi_2\psi_1 - \phi_3 = 0$ (coefficient of $L^3$)

and in general

$\psi_k - \phi_1\psi_{k-1} - \phi_2\psi_{k-2} - \cdots - \phi_p\psi_{k-p} = 0, \quad k \ge p,$

or

(45) $\phi(L)\,\psi_k = 0, \quad k \ge p,$

so that the infinite moving average representation weights $\psi_k$ satisfy the same difference equation as $y_t$ and $\rho_j$. On solving for $\psi_k$ it is possible to obtain

$\psi_k = \sum_{i=1}^{p} b_i\left(\frac{1}{\xi_i}\right)^k,$

where $b_1, b_2, \ldots, b_p$ are to be determined from initial boundary conditions by directly solving for the first p moving average representation coefficients. This is most easily seen from the following examples.
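As a numerical companion to this recursion, a minimal sketch (the coefficients are those of the AR(2) example below, and the function name is an arbitrary choice):

```python
def ar_psi_weights(phis, n):
    """psi_0 = 1 and psi_k = sum_j phi_j * psi_{k-j}, with psi_m = 0 for m < 0."""
    psi = [1.0]
    for k in range(1, n + 1):
        psi.append(sum(p * psi[k - 1 - j]
                       for j, p in enumerate(phis) if k - 1 - j >= 0))
    return psi

print(ar_psi_weights([1/4, 3/8], 5))
# [1.0, 0.25, 0.4375, 0.203125, ...] -- matching psi_1 = 1/4, psi_2 = 7/16 below
```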
Reanalysis of the AR(1) Process:
There are many tricks in the use of lag polynomials which can simplify the derivation of the
properties of time series models and these can be used interchangeably as is convenient. The
previous method for the derivation of the aspects of the AR(1) process can be simplified as the
following indicates. The first order autoregressive process, or AR(1) process, is

(46) $y_t = \phi y_{t-1} + \varepsilon_t.$
Hence $\phi(L) = (1 - \phi L)$, which has a root of $\xi = 1/\phi$. The condition for stationarity is that the root must lie outside the unit circle, that is $|\xi| > 1$, which implies that $|\phi| < 1$. On multiplying through the model for $y_t$ by $y_{t-k}$, taking expectations, and using the fact that $E(\varepsilon_t y_{t-k}) = 0$ for $k \ge 1$, gives

$\gamma_k = \phi\gamma_{k-1}, \quad k \ge 1.$

Multiplying through the AR(1) by $y_t$ and taking expectations gives

$\gamma_0 = \phi\gamma_1 + \sigma^2.$

But $\gamma_1 = \phi\gamma_0$ also, so that $\gamma_0 = \phi^2\gamma_0 + \sigma^2$ and $\gamma_0 = \sigma^2/(1-\phi^2)$. Hence

$\gamma_k = \frac{\phi^k\sigma^2}{(1-\phi^2)}$

and

(47) $\rho_k = \phi^k.$
The infinite moving average representation for the AR(1) process can be directly found as,

$y_t = (1 - \phi L)^{-1}\varepsilon_t = \left(1 + \phi L + \phi^2 L^2 + \cdots\right)\varepsilon_t,$

hence

(48) $y_t = \sum_{j=0}^{\infty}\phi^j L^j\varepsilon_t = \sum_{j=0}^{\infty}\phi^j\varepsilon_{t-j}.$

Alternatively, since $\phi(L)\psi_k = 0$ for $k \ge p$, then

$(1 - \phi L)\,\psi_k = 0, \quad k \ge 1,$

$\psi_k = \phi\psi_{k-1}, \quad k \ge 1,$

and since $\psi_0 = 1$, it follows that $\psi_k = \phi^k$. Hence for the AR(1) process, the infinite moving average representation weights and the autocorrelations both decline at an exponential rate.
The AR(2) Process
The general AR(2) process is given by,

(49) $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t.$

For stationarity the roots of $\phi(L) = (1 - \phi_1 L - \phi_2 L^2)$ must lie outside the unit circle, which can be shown to imply that,

(50) $\phi_1 + \phi_2 < 1; \quad \phi_2 - \phi_1 < 1 \quad \text{and} \quad |\phi_2| < 1.$
The first two Yule-Walker equations are:

$\rho_1 = \phi_1 + \phi_2\rho_1,$

$\rho_2 = \phi_1\rho_1 + \phi_2,$

which give

$\rho_1 = \frac{\phi_1}{(1-\phi_2)} \quad \text{and} \quad \rho_2 = \phi_2 + \frac{\phi_1^2}{(1-\phi_2)},$

and the general Yule-Walker equation is

(51) $\rho_k = \phi_1\rho_{k-1} + \phi_2\rho_{k-2}, \quad k \ge 2.$
An AR(2) Process with Real Roots
As an illustration of the above theory, consider the AR(2) process

(52) $y_t = \left(\tfrac{1}{4}\right)y_{t-1} + \left(\tfrac{3}{8}\right)y_{t-2} + \varepsilon_t,$

$\phi(L) = 1 - \left(\tfrac{1}{4}\right)L - \left(\tfrac{3}{8}\right)L^2 = \left(1 - \tfrac{3}{4}L\right)\left(1 + \tfrac{1}{2}L\right) = 0,$

which has roots of $\tfrac{4}{3}$ and $-2$. Since both roots lie outside the unit circle, it follows that $y_t$ is stationary. From the first Yule-Walker equation

$\rho_1 = \tfrac{1}{4} + \tfrac{3}{8}\rho_1,$

it then follows that $\rho_1 = \tfrac{2}{5}$ and, in general,

(54) $\rho_k = \tfrac{1}{4}\rho_{k-1} + \tfrac{3}{8}\rho_{k-2}, \quad \text{for } k \ge 2.$

Then,

$\rho_k = A\left(\tfrac{3}{4}\right)^k + B\left(-\tfrac{1}{2}\right)^k,$

where A and B are unknown constants. However, $\rho_0 = 1 = A + B$ and $\rho_1 = \tfrac{2}{5} = \tfrac{3}{4}A - \tfrac{1}{2}B$; so that $A = \tfrac{18}{25}$ and $B = \tfrac{7}{25}$. Hence the full solution for the autocorrelation function is given by,

(55) $\rho_k = \tfrac{18}{25}\left(\tfrac{3}{4}\right)^k + \tfrac{7}{25}\left(-\tfrac{1}{2}\right)^k, \quad k = 0, 1, 2, \ldots$
The autocorrelations can now be calculated either directly from the above formula or from the difference equation for $\rho_k$. To find the form of the infinite moving average representation weights:

$\phi(L)\,\psi_k = 0,$

$\psi_k = \phi_1\psi_{k-1} + \phi_2\psi_{k-2}, \quad k \ge 2.$

By direct substitution it is simple to determine the first few MA coefficients;

$y_t = \tfrac{1}{4}\left(\tfrac{1}{4}y_{t-2} + \tfrac{3}{8}y_{t-3} + \varepsilon_{t-1}\right) + \tfrac{3}{8}y_{t-2} + \varepsilon_t = \varepsilon_t + \tfrac{1}{4}\varepsilon_{t-1} + \tfrac{7}{16}y_{t-2} + \tfrac{3}{32}y_{t-3},$

so that $\psi_1 = \tfrac{1}{4}$ and $\psi_0 = 1$, which gives two initial conditions. Since the impulse response weights follow the same difference equation as the autocorrelation coefficients, it follows that

$\psi_k = A\left(\tfrac{3}{4}\right)^k + B\left(-\tfrac{1}{2}\right)^k,$

and hence $1 = A + B$ and $\tfrac{1}{4} = \tfrac{3}{4}A - \tfrac{1}{2}B$, which implies $A = \tfrac{3}{5}$ and $B = \tfrac{2}{5}$. Hence

(56) $\psi_k = \tfrac{3}{5}\left(\tfrac{3}{4}\right)^k + \tfrac{2}{5}\left(-\tfrac{1}{2}\right)^k, \quad k = 0, 1, 2, \ldots$
An alternative, and in this instance more algebraically involved, approach is to expand $\phi(L)^{-1}$ in a series of ascending powers of the lag operator L, to give

$y_t = \phi(L)^{-1}\varepsilon_t = \left(1 - \tfrac{1}{4}L - \tfrac{3}{8}L^2\right)^{-1}\varepsilon_t = \left(1 - \tfrac{3}{4}L\right)^{-1}\left(1 + \tfrac{1}{2}L\right)^{-1}\varepsilon_t.$

Then by the use of partial fractions,

$y_t = \left[\tfrac{3}{5}\left(1 - \tfrac{3}{4}L\right)^{-1} + \tfrac{2}{5}\left(1 + \tfrac{1}{2}L\right)^{-1}\right]\varepsilon_t = \sum_{j=0}^{\infty}\left[\tfrac{3}{5}\left(\tfrac{3}{4}\right)^j + \tfrac{2}{5}\left(-\tfrac{1}{2}\right)^j\right]\varepsilon_{t-j},$

which is identical to equation (56) as before. The first few values of the autocorrelation function and infinite moving average representation weights are:

Lag        0     1     2     3     4     5     6     7
$\rho_j$   1.00  0.40  0.48  0.27  0.25  0.16  0.13  0.09
$\psi_j$   1.00  0.25  0.44  0.20  0.21  0.13  0.11  0.06
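The tabulated values can be reproduced directly from the closed forms (55) and (56); a brief sketch:

```python
# rho_k = (18/25)(3/4)^k + (7/25)(-1/2)^k; psi_k = (3/5)(3/4)^k + (2/5)(-1/2)^k
for k in range(8):
    rho = (18/25) * (3/4)**k + (7/25) * (-1/2)**k
    psi = (3/5) * (3/4)**k + (2/5) * (-1/2)**k
    print(k, round(rho, 3), round(psi, 3))
# k=0: 1.0, 1.0; k=1: 0.4, 0.25; k=2: 0.475, 0.4375; ... matching the table above
```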
The Variance of an AR(2) Process
The variance can be obtained by solving the first two Yule-Walker equations appended with a similar equation obtained by multiplying through the process by $y_t$ and taking expectations. Then

$\gamma_0 = \phi_1\gamma_1 + \phi_2\gamma_2 + \sigma^2,$

$\gamma_1 = \phi_1\gamma_0 + \phi_2\gamma_1,$

$\gamma_2 = \phi_1\gamma_1 + \phi_2\gamma_0.$

Then from the second Yule-Walker equation,

$\gamma_1 = \frac{\phi_1\gamma_0}{1-\phi_2},$

while $\phi_1\gamma_1 = \gamma_0 - \phi_2\gamma_2 - \sigma^2$, and on substituting for $\gamma_2$ from the third equation gives

$\phi_1\gamma_1 = \gamma_0 - \phi_1\phi_2\gamma_1 - \phi_2^2\gamma_0 - \sigma^2,$

hence

$\gamma_1 = \frac{\left(1-\phi_2^2\right)\gamma_0 - \sigma^2}{\phi_1\left(1+\phi_2\right)}.$

Equating the two expressions for $\gamma_1$ then realizes,

$\gamma_0\left(1+\phi_2\right)\left[\left(1-\phi_2\right)^2 - \phi_1^2\right] = \left(1-\phi_2\right)\sigma^2,$

hence

(59) $\gamma_0 = \frac{\left(1-\phi_2\right)\sigma^2}{\left(1+\phi_2\right)\left[\left(1-\phi_2\right)^2 - \phi_1^2\right]}.$

In the previous example $\phi_1 = \tfrac{1}{4}$, $\phi_2 = \tfrac{3}{8}$ and $\gamma_0 = \left(\tfrac{320}{231}\right)\sigma^2 = 1.3853\,\sigma^2$.
Alternatively, the same result can be derived from the infinite moving average representation,

$\gamma_0 = \sigma^2\sum_{j=0}^{\infty}\psi_j^2 = \sigma^2\sum_{j=0}^{\infty}\left[\tfrac{3}{5}\left(\tfrac{3}{4}\right)^j + \tfrac{2}{5}\left(-\tfrac{1}{2}\right)^j\right]^2$

$= \sigma^2\left[\tfrac{9}{25}\sum_{j=0}^{\infty}\left(\tfrac{9}{16}\right)^j + \tfrac{4}{25}\sum_{j=0}^{\infty}\left(\tfrac{1}{4}\right)^j + \tfrac{12}{25}\sum_{j=0}^{\infty}\left(-\tfrac{3}{8}\right)^j\right]$

$= \sigma^2\left[\tfrac{9}{25}\left(\tfrac{16}{7}\right) + \tfrac{4}{25}\left(\tfrac{4}{3}\right) + \tfrac{12}{25}\left(\tfrac{8}{11}\right)\right] = 1.3853\,\sigma^2.$
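A quick numerical confirmation of $\gamma_0 = (320/231)\sigma^2$, truncating the infinite sum of squared $\psi$ weights (with $\sigma^2 = 1$):

```python
# Truncate gamma_0 = sum psi_j^2; the tail is geometrically small.
g0 = sum(((3/5) * (3/4)**j + (2/5) * (-1/2)**j) ** 2 for j in range(200))
print(round(g0, 4), round(320/231, 4))   # both 1.3853
```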
An AR(2) Process with Complex Roots
A further numerical example is an AR(2) process with complex roots,

(60) $y_t = y_{t-1} - \left(\tfrac{1}{2}\right)y_{t-2} + \varepsilon_t,$

where $\phi(L) = 1 - L + \tfrac{1}{2}L^2$ has roots of $1 \pm i$, which are complex conjugates. Recall that the modulus, or absolute value, of a complex number $a + bi$ is given by $|a + bi| = \sqrt{a^2 + b^2}$; and a modulus greater than one implies a root outside the unit circle. For the process under consideration both roots lie outside the unit circle, so the process is stationary. From the Yule-Walker equations:

(61) $\rho_k = \rho_{k-1} - \left(\tfrac{1}{2}\right)\rho_{k-2}, \quad k \ge 2,$

and

$\rho_1 = 1 - \left(\tfrac{1}{2}\right)\rho_1,$

so that $\rho_1 = \tfrac{2}{3}$ and $\rho_0 = 1$. On using de Moivre's theorem,

(62) $e^{i\theta} = \cos(\theta) + i\sin(\theta),$

the inverses of the two complex roots $(1+i)$ and $(1-i)$ can be expressed as $1/\xi_1 = d\,e^{i\theta}$ and $1/\xi_2 = d\,e^{-i\theta}$, where $d = \left(-\phi_2\right)^{1/2}$ is known as the damping factor and indicates the degree of decay of the harmonic cycle in the autocorrelation function. Also on writing,

$\cos(\theta) = \frac{\phi_1}{2\left(-\phi_2\right)^{1/2}}$
and

$\tan(\omega) = \left[\frac{\left(1+d^2\right)}{\left(1-d^2\right)}\right]\tan(\theta),$

the autocorrelation function can then be expressed more conveniently as,

(63) $\rho_k = \frac{d^{\,k}\sin(k\theta + \omega)}{\sin(\omega)}.$

Hence $\rho_k$ takes the form of a damped harmonic. A detailed proof of the above result is given by Box and Jenkins (1970, pp. 58-63). In the above example $\phi_1 = 1$ and $\phi_2 = -\tfrac{1}{2}$, so that $d = 1/2^{1/2}$, $\cos(\theta) = 1/2^{1/2}$ and hence $\theta = \pi/4$. Also, $\tan(\omega) = 3\tan(\theta) = 3$, so that $\omega = \tan^{-1}(3) \approx .3976\pi$, and

(64) $\rho_k = \frac{\left(\tfrac{1}{2}\right)^{k/2}\sin\left(\tfrac{\pi k}{4} + \tan^{-1}(3)\right)}{\sin\left(\tan^{-1}(3)\right)} = 1.0541\left(\tfrac{1}{2}\right)^{k/2}\sin\left(\tfrac{\pi k}{4} + .3976\pi\right).$

The first few values of the autocorrelation function are tabulated below:

Lag (j)    0     1    2    3     4     5     6     7    8    9    10   11    12
$\rho_j$   1.00  .67  .17  -.17  -.25  -.17  -.04  .04  .06  .04  .01  -.01  -.01
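The damped harmonic (64) and the Yule-Walker recursion (61) can be checked against each other numerically; a brief sketch:

```python
import numpy as np

rho = [1.0, 2/3]                          # rho_0 and rho_1 from the Yule-Walker equations
for k in range(2, 13):
    rho.append(rho[k - 1] - 0.5 * rho[k - 2])

omega = np.arctan(3.0)
closed = [(np.sqrt(10) / 3) * 2 ** (-k / 2) * np.sin(k * np.pi / 4 + omega)
          for k in range(13)]
print(np.round(rho, 2))                   # 1.00  0.67  0.17 -0.17 -0.25 ...
print(np.round(closed, 2))                # identical values
```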
The Autoregressive Moving Average, or ARMA Process
The AR(p) model, when appended with an MA(q) error structure, realizes the ARMA(p,q) model, which is one of the most widely used models to represent a stationary time series. The ARMA(p,q) model is defined as,

(65) $y_t - \phi_1 y_{t-1} - \cdots - \phi_p y_{t-p} = \varepsilon_t - \theta_1\varepsilon_{t-1} - \cdots - \theta_q\varepsilon_{t-q},$

or,

$\phi(L)\, y_t = \theta(L)\,\varepsilon_t.$

It is assumed that all the roots of $\phi(L)$ and $\theta(L)$ lie outside the unit circle, so that the stationarity and invertibility conditions are satisfied. It is also assumed that $\phi(L)$ and $\theta(L)$ do not share any common factors. On writing

$\phi(L) = \prod_{i=1}^{p}\left(1 - \frac{L}{\xi_i}\right)$

and

$\theta(L) = \prod_{i=1}^{q}\left(1 - \frac{L}{\eta_i}\right),$

the full solution for $y_t$ is,

(66) $y_t = \sum_{i=1}^{p} a_i\left(\frac{1}{\xi_i}\right)^t + \frac{\prod_{i=1}^{q}\left(1 - L/\eta_i\right)}{\prod_{i=1}^{p}\left(1 - L/\xi_i\right)}\,\varepsilon_t.$
For a stationary process, with all the roots of $\phi(L)$ lying outside the unit circle, the first term will approach zero as t gets large, from precisely the same arguments as for the pure AR(p) process in (35). The second term can be split into partial fractions, and a binomial expansion in ascending powers of L applied, to give the infinite moving average representation,

$y_t = \psi(L)\,\varepsilon_t,$

where

(67) $\psi(L) = \phi(L)^{-1}\theta(L).$

The ARMA(p,q) model is attractive since it uses the fact that a ratio of lag polynomials can approximate the infinite moving average polynomial $\psi(L)$. This is essentially based on Weierstrass's theorem on the approximation of functions through a ratio of polynomials. In most applications the values of p and q are expected to be relatively small, i.e. either 0, 1 or 2.
The properties of ARMA processes are very similar to those of pure autoregressions. The inclusion of moving average terms will just provide some additional flexibility in accounting for low order autocorrelation structure. To derive the autocorrelation function of ARMA processes we proceed as before and note that

$E(\varepsilon_t y_{t-k}) = 0, \quad \text{for } k \ge 1,$

and

$E(\varepsilon_t y_t) = \sigma^2.$

The general cross covariance function between $y_t$ and $\varepsilon_t$ is defined as

$\omega_k = E(y_t\varepsilon_{t-k}), \quad k = \ldots, -1, 0, 1, \ldots$

Obviously, $\omega_0 = \sigma^2$ and $\omega_k = 0$ for $k = -1, -2, \ldots$
On successively multiplying through the model by $y_t$, $y_{t-1}, \ldots, y_{t-k}$ and on taking expectations,

$E(y_t y_t) = \phi_1 E(y_{t-1}y_t) + \cdots + \phi_p E(y_{t-p}y_t) + E(\varepsilon_t y_t) - \theta_1 E(\varepsilon_{t-1}y_t) - \cdots - \theta_q E(\varepsilon_{t-q}y_t),$

$E(y_t y_{t-1}) = \phi_1 E(y_{t-1}y_{t-1}) + \cdots + \phi_p E(y_{t-p}y_{t-1}) + E(\varepsilon_t y_{t-1}) - \theta_1 E(\varepsilon_{t-1}y_{t-1}) - \cdots - \theta_q E(\varepsilon_{t-q}y_{t-1}),$

and so on. This realizes,

$\gamma_0 = \phi_1\gamma_1 + \phi_2\gamma_2 + \cdots + \phi_p\gamma_p + \sigma^2 - \theta_1\omega_1 - \theta_2\omega_2 - \cdots - \theta_q\omega_q, \qquad k = 0,$

$\gamma_1 = \phi_1\gamma_0 + \phi_2\gamma_1 + \cdots + \phi_p\gamma_{p-1} - \theta_1\sigma^2 - \theta_2\omega_1 - \cdots - \theta_q\omega_{q-1}, \qquad k = 1,$

$\vdots$

$\gamma_q = \phi_1\gamma_{q-1} + \phi_2\gamma_{q-2} + \cdots + \phi_p\gamma_{q-p} - \theta_q\sigma^2, \qquad k = q,$

and in general

$\gamma_k = \phi_1\gamma_{k-1} + \phi_2\gamma_{k-2} + \cdots + \phi_p\gamma_{k-p}, \qquad k \ge q+1,$

so that

(69) $\phi(L)\,\gamma_k = \phi(L)\,\rho_k = 0, \qquad k \ge q+1.$

Hence after q initial lags, the autocorrelation function of the ARMA(p,q) model will behave like that of the AR(p) model. This follows from the fact that the autocorrelation coefficients obey a difference equation that is generated purely from the autoregressive structure. If q < p the whole autocorrelation function will behave like that of the AR(p) model. If $q \ge p$ there will be $q + 1 - p$ initial autocorrelations before the typical AR(p) autocorrelation pattern sets in.
The Wold decomposition of the ARMA(p,q) Process
On dividing both sides of equation (65) by the autoregressive operator,

$y_t = \frac{\theta(L)}{\phi(L)}\,\varepsilon_t = \psi(L)\,\varepsilon_t,$

hence,

$\theta(L) \equiv \phi(L)\,\psi(L),$ or

$\left(1 - \theta_1 L - \cdots - \theta_q L^q\right) \equiv \left(1 - \phi_1 L - \cdots - \phi_p L^p\right)\left(1 + \psi_1 L + \psi_2 L^2 + \cdots\right).$

In order to derive the $\psi_j$ coefficients it is simplest to just equate powers of L:

$-\theta_1 = \psi_1 - \phi_1,$

$-\theta_2 = \psi_2 - \phi_1\psi_1 - \phi_2,$

$-\theta_3 = \psi_3 - \phi_1\psi_2 - \phi_2\psi_1 - \phi_3,$

$\vdots$

$-\theta_q = \psi_q - \phi_1\psi_{q-1} - \cdots - \phi_q \quad \text{if } p > q.$

Hence for $k \ge q+1$ the infinite moving average coefficients obey the same difference equation, $\phi(L)\psi_k = 0$, as the autocorrelation coefficients.
Some Examples in Depth: the ARMA(1,1) Process
$y_t - \phi y_{t-1} = \varepsilon_t - \theta\varepsilon_{t-1},$

and

$\omega_0 = E(\varepsilon_t y_t) = \sigma^2,$

$\omega_1 = E(\varepsilon_{t-1} y_t) = E\left[\varepsilon_{t-1}\left(\phi y_{t-1} + \varepsilon_t - \theta\varepsilon_{t-1}\right)\right] = (\phi - \theta)\sigma^2.$

The Yule-Walker equations are:

$\gamma_0 = \phi\gamma_1 + \sigma^2 - \theta(\phi - \theta)\sigma^2,$

$\gamma_1 = \phi\gamma_0 - \theta\sigma^2,$

$\gamma_k = \phi\gamma_{k-1}, \quad k \ge 2.$
On solving these equations,

$\gamma_0 = \frac{\left(1 + \theta^2 - 2\phi\theta\right)}{\left(1 - \phi^2\right)}\,\sigma^2,$

$\gamma_1 = \frac{\left(1 - \phi\theta\right)\left(\phi - \theta\right)}{\left(1 - \phi^2\right)}\,\sigma^2.$

Hence, $\rho_0 = 1$,

$\rho_1 = \frac{\left(1 - \phi\theta\right)\left(\phi - \theta\right)}{\left(1 + \theta^2 - 2\phi\theta\right)},$

and

$\rho_k = \phi\rho_{k-1}, \quad k \ge 2.$
Note that since q = p = 1, there are q + 1 - p = 1 preliminary autocorrelations before the typical
AR(1) pattern sets in. The infinite moving average representation is then given by
(70) $y_t = (1 - \theta L)(1 - \phi L)^{-1}\varepsilon_t = (1 - \theta L)\left(1 + \phi L + \phi^2 L^2 + \cdots\right)\varepsilon_t = \varepsilon_t + (\phi - \theta)\sum_{i=1}^{\infty}\phi^{i-1}\varepsilon_{t-i}.$

Hence, $\psi_1 = (\phi - \theta)$, $\psi_2 = \phi(\phi - \theta)$, ..., $\psi_k = \phi^{k-1}(\phi - \theta)$. Alternatively, it is possible to use the fact that $\psi_k = \phi\psi_{k-1}$ for $k \ge 2$, and then $\psi_j = \phi^{j-1}(\phi - \theta)$.
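A small sketch comparing the closed form $\psi_k = \phi^{k-1}(\phi - \theta)$ with the recursion $\psi_1 = \phi - \theta$, $\psi_k = \phi\psi_{k-1}$; the parameter values are illustrative assumptions.

```python
phi, theta = 0.8, 0.3                     # illustrative values
psi_closed = [1.0] + [phi**(k - 1) * (phi - theta) for k in range(1, 6)]

psi_rec = [1.0]
for k in range(1, 6):
    psi_rec.append(phi * psi_rec[k - 1] - (theta if k == 1 else 0.0))

print(psi_closed)
print(psi_rec)                            # identical sequences
```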
An ARMA(2,1) Process
(72) $y_t = \left(\tfrac{1}{4}\right)y_{t-1} + \left(\tfrac{3}{8}\right)y_{t-2} + \varepsilon_t - \left(\tfrac{1}{3}\right)\varepsilon_{t-1},$

where $\phi(L) = \left(1 - \tfrac{1}{4}L - \tfrac{3}{8}L^2\right)$ has roots of $\tfrac{4}{3}$ and $-2$, while $\theta(L) = \left[1 - \left(\tfrac{1}{3}\right)L\right]$ has a root of 3; so that the process is stationary and invertible. Then

$E(\varepsilon_t y_t) = \sigma^2,$

$E(\varepsilon_{t-1} y_t) = \omega_1 = \left(\tfrac{1}{4} - \tfrac{1}{3}\right)\sigma^2 = -\left(\tfrac{1}{12}\right)\sigma^2,$

$E(\varepsilon_t y_{t-k}) = 0, \quad k \ge 1.$
Hence,

$\gamma_0 = \left(\tfrac{1}{4}\right)\gamma_1 + \left(\tfrac{3}{8}\right)\gamma_2 + \sigma^2 - \left(\tfrac{1}{3}\right)\left(-\tfrac{1}{12}\right)\sigma^2,$

$\gamma_1 = \left(\tfrac{1}{4}\right)\gamma_0 + \left(\tfrac{3}{8}\right)\gamma_1 - \left(\tfrac{1}{3}\right)\sigma^2,$

$\gamma_2 = \left(\tfrac{1}{4}\right)\gamma_1 + \left(\tfrac{3}{8}\right)\gamma_0,$

and

$\gamma_k = \left(\tfrac{1}{4}\right)\gamma_{k-1} + \left(\tfrac{3}{8}\right)\gamma_{k-2}, \quad k \ge 2.$

From the second equation,

$\left(\tfrac{5}{8}\right)\gamma_1 = \left(\tfrac{1}{4}\right)\gamma_0 - \left(\tfrac{1}{3}\right)\sigma^2,$

hence $\gamma_1 = \left(\tfrac{2}{5}\right)\gamma_0 - \left(\tfrac{8}{15}\right)\sigma^2$ and $\gamma_2 = \left(\tfrac{19}{40}\right)\gamma_0 - \left(\tfrac{2}{15}\right)\sigma^2$. Then,

$\gamma_0 = \left(\tfrac{2432}{2079}\right)\sigma^2 \quad \text{and} \quad \gamma_1 = -\left(\tfrac{136}{2079}\right)\sigma^2.$

Hence $\rho_0 = 1$ and $\rho_1 = -0.0559$, which can be used as initial conditions to obtain

$\rho_k = A\left(\tfrac{3}{4}\right)^k + B\left(-\tfrac{1}{2}\right)^k.$

Then $1 = A + B$ and $-.0559 = \left(\tfrac{3}{4}\right)A - \left(\tfrac{1}{2}\right)B$; hence $A = .3553$ and $B = .6447$, and the autocorrelation coefficients are,

(73) $\rho_k = .3553\left(\tfrac{3}{4}\right)^k + .6447\left(-\tfrac{1}{2}\right)^k, \quad k = 0, 1, 2, \ldots$

Lag k      1       2      3      4      5      6      7      8      9
$\rho_k$   -.0559  .3610  .0693  .1527  .0642  .0733  .0424  .0381  .0254
The infinite moving average representation is given by,

$\left(1 - \tfrac{1}{3}L\right) \equiv \left(1 - \tfrac{1}{4}L - \tfrac{3}{8}L^2\right)\left(1 + \psi_1 L + \psi_2 L^2 + \cdots\right).$

Equating coefficients,

$-\tfrac{1}{3} = -\tfrac{1}{4} + \psi_1,$

$0 = \psi_2 - \tfrac{1}{4}\psi_1 - \tfrac{3}{8},$

$0 = \psi_3 - \tfrac{1}{4}\psi_2 - \tfrac{3}{8}\psi_1,$

$0 = \psi_4 - \tfrac{1}{4}\psi_3 - \tfrac{3}{8}\psi_2,$

so that $\psi_1 = -\tfrac{1}{12}$, $\psi_2 = \tfrac{17}{48}$, $\psi_3 = \tfrac{11}{192}$, $\psi_4 = \tfrac{113}{768}$, $\psi_5 = \tfrac{179}{3072}$, $\psi_6 = \tfrac{857}{12288}$.

Lag k      0       1       2      3      4      5      6
$\psi_k$   1.0000  -.0833  .3542  .0573  .1471  .0583  .0697
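The tabulated $\psi$ weights follow from the recursion $\psi_1 = \phi_1 - \theta_1$ and $\psi_k = \phi_1\psi_{k-1} + \phi_2\psi_{k-2}$ for $k \ge 2$; a brief numerical check:

```python
p1, p2, t1 = 1/4, 3/8, 1/3
psi = [1.0, p1 - t1]                      # psi_0 = 1, psi_1 = -1/12
for k in range(2, 7):
    psi.append(p1 * psi[k - 1] + p2 * psi[k - 2])
print([round(v, 4) for v in psi])
# [1.0, -0.0833, 0.3542, 0.0573, 0.1471, 0.0583, 0.0697]
```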
Predictions from ARMA Processes
It is now convenient to consider the problem of finding the best forecast, i.e., minimum MSE
(mean squared error) prediction of t sy + made at time t. In this case the forecast origin is said to
be at time t and the forecast horizon is s. It is usual to consider the minimum mean square error
predictor and to find the linear combination, 1 1 ,tt j t jjy yφ+ −=
= ∑ such that
2
1 11
t
t j t jj
E y yφ+ + −=
−∑
is minimized. The following predictor will be based on all relevant and available information at
45
time t. The prediction of t sy + at time t is expressed as,
(75) ( ), 1 2| , , ,t s t t s t s t t ty E y E y y y y+ + − −= = …
so that Et represents an expectation conditioned on information that is available at time t. This
conditional expectation then defines the minimum MSE predictor with reference to the
information set 1 2, , ,.....t t ty y y− − Then,
(76) $E_t\,y_{t+s} = y_{t,s} \quad \text{for } s = 1, 2, \ldots$

and

(77) $E_t\,y_{t+s} = y_{t+s} \quad \text{for } s = 0, -1, -2, \ldots$

For the innovation process $\varepsilon_t$ it is true that

(78) $E_t\,\varepsilon_{t+s} = 0 \quad \text{for } s = 1, 2, \ldots$

and

(79) $E_t\,\varepsilon_{t+s} = \varepsilon_{t+s} \quad \text{for } s = 0, -1, -2, \ldots$
where $y_{t,s}$ is a prediction of $y_{t+s}$ made at time t; then $E(y_{t+s} - y_{t,s})^2$ is minimized by $y_{t,s} = E_t\,y_{t+s}$. Forecasts can be made directly from the ARMA model, or alternatively from the infinite moving average or autoregressive representations. From the infinite MA representation,

$y_t = \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j}, \quad \psi_0 = 1,$

then

(80) $y_{t+s} = \varepsilon_{t+s} + \psi_1\varepsilon_{t+s-1} + \cdots + \psi_{s-1}\varepsilon_{t+1} + \psi_s\varepsilon_t + \psi_{s+1}\varepsilon_{t-1} + \cdots,$

hence

(81) $E_t\,y_{t+s} = \psi_s\varepsilon_t + \psi_{s+1}\varepsilon_{t-1} + \cdots,$

so that $y_{t,s} = \sum_{j=0}^{\infty}\psi_{j+s}\varepsilon_{t-j}$. The forecast error is $e_{t,s} = y_{t+s} - y_{t,s}$, and is given by,
$e_{t,s} = y_{t+s} - y_{t,s} = \varepsilon_{t+s} + \psi_1\varepsilon_{t+s-1} + \cdots + \psi_{s-1}\varepsilon_{t+1} = \sum_{j=0}^{s-1}\psi_j\varepsilon_{t+s-j},$

and the MSE of the s step ahead prediction is given by

(83) $MSE(y_{t,s}) = \mathrm{Var}(e_{t,s}) = \sigma^2\sum_{j=0}^{s-1}\psi_j^2,$
which is the simplest method for finding the prediction MSE. However, the actual values of the predictions are most easily made recursively, using the above conditional expectations for the $y_t$ and $\varepsilon_t$ random variables. Recursions can be straightforwardly obtained from the ARMA model formulations and are best illustrated by some simple examples.
Example 1: AR(1):

Consider the standard AR(1) model, $y_t = \phi y_{t-1} + \varepsilon_t$. For one step ahead prediction,

$E_t\,y_{t+1} = E_t\left(\phi y_t + \varepsilon_{t+1}\right),$

and hence $y_{t,1} = \phi y_t$. At time t+2 the model is $y_{t+2} = \phi y_{t+1} + \varepsilon_{t+2}$; and on taking expectations through the equation, conditional on information at time t:

$y_{t,2} = \phi\,y_{t,1} = \phi(\phi y_t) = \phi^2 y_t.$
Similarly, since $y_{t+s} = \phi y_{t+s-1} + \varepsilon_{t+s}$, it follows that

(84) $y_{t,s} = \phi\,y_{t,s-1} = \phi^s y_t.$
Since $\psi_j = \phi^j$ for the AR(1) process, it follows that the MSE associated with the predictor will be,

$MSE(y_{t,s}) = \sigma^2\sum_{j=0}^{s-1}\phi^{2j} = \frac{\sigma^2\left(1 - \phi^{2s}\right)}{\left(1 - \phi^2\right)}.$
It is interesting to note that for very long lead times, as $s \to \infty$, then $y_{t,s} \to 0$, which is the unconditional mean of $y_t$, while the MSE becomes,

$\lim_{s\to\infty} MSE(y_{t,s}) = \frac{\sigma^2}{\left(1 - \phi^2\right)},$

which is the unconditional variance of $y_t$. Since the process is stationary, information at time t is of limited value when predicting a long way into the future. Consequently the variance of a forecast a long way into the future is merely the unconditional variance of the process. This is a general property of long term predictions from all stationary processes; for the stationary and invertible class of ARMA models the effect occurs at an exponential rate.
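The convergence of the AR(1) forecast to the unconditional mean, and of its MSE to the unconditional variance, is easy to tabulate; the numerical values of $\phi$, $\sigma^2$ and $y_t$ below are illustrative assumptions.

```python
phi, sigma2, y_t = 0.9, 1.0, 2.0          # illustrative values
for s in (1, 2, 5, 20, 100):
    forecast = phi**s * y_t               # y_{t,s} = phi^s * y_t
    mse = sigma2 * (1 - phi**(2 * s)) / (1 - phi**2)
    print(s, round(forecast, 4), round(mse, 4))
# the forecast tends to the mean of zero and the MSE to 1/(1 - phi^2) = 5.2632
```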
The ARMA(1,1) Model

The model is,

(85) $y_t = \phi y_{t-1} + \varepsilon_t - \theta\varepsilon_{t-1},$

and since $y_{t+1} = \phi y_t + \varepsilon_{t+1} - \theta\varepsilon_t$, it follows that $y_{t,1} = \phi y_t - \theta\varepsilon_t$. While at time t+2,

$y_{t,2} = \phi\,y_{t,1} + E_t\left(\varepsilon_{t+2} - \theta\varepsilon_{t+1}\right) = \phi\,y_{t,1} = \phi^2 y_t - \phi\theta\varepsilon_t.$

In general, for the ARMA(1,1) model and for $s \ge 2$,

$y_{t,s} = \phi\,y_{t,s-1} + E_t\left(\varepsilon_{t+s} - \theta\varepsilon_{t+s-1}\right) = \phi\,y_{t,s-1} = \phi^{s-1}\left(\phi y_t - \theta\varepsilon_t\right).$

The method is then easily applied to higher order ARMA(p,q) processes.
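As a final sketch, the ARMA(1,1) forecast recursion can be coded in a few lines; all numerical values are illustrative assumptions.

```python
phi, theta = 0.8, 0.3                     # illustrative parameters
y_t, eps_t = 1.5, 0.4                     # last observation and innovation (assumed known)
f = phi * y_t - theta * eps_t             # y_{t,1}
forecasts = [f]
for s in range(2, 6):
    f = phi * f                           # y_{t,s} = phi * y_{t,s-1}
    forecasts.append(f)
print([round(v, 4) for v in forecasts])   # equals phi^(s-1)*(phi*y_t - theta*eps_t)
```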