CHAPTER 3
STATIONARY LINEAR TIME SERIES PROCESSES
© Richard T Baillie, November, 2003
Introduction
This chapter is concerned with the theory of stationary, univariate time series observed at
discrete intervals of time. An understanding of these basic models is necessary to appreciate
more complicated dynamic econometric models that will be developed later. The time domain
population characteristics are derived for various univariate time series processes, including the
autocovariance, the autocorrelation function and the Wold decomposition. The emphasis here is
to provide the basic concepts and techniques, which are essential for all the models to be
developed later. There will inevitably be quite a lot of algebraic methods and tricks that are
worth knowing for the future.
A time series process $y_t$ is assumed to measure a quantitative variable associated with an underlying ordered sequence of points in time.
Hence this theory is designed to deal with economic and financial series such as quarterly GNP,
monthly inflation, weekly money supplies, daily exchange rates, hourly IBM stock prices, etc.
Clearly, the methodology and issues are relevant to any science where the analysis of time series
data is important; and many if not most subjects use such data.
Stationarity
A time series is considered to be generated as a realization of the stochastic process $\{y(t) : t = 0, \pm 1, \pm 2, \ldots\}$. The time series process is said to be Strongly Stationary, or Strictly Stationary, if the joint distribution of the set of random variables $y_{t_1}, y_{t_2}, \ldots, y_{t_s}$ is the same as the joint distribution of the set $y_{t_1+k}, y_{t_2+k}, \ldots, y_{t_s+k}$ for all $s$-tuples $(t_1, t_2, \ldots, t_s)$ and all integers $k$. Hence the joint distribution of the process is independent of time, and only depends on the intervals between the time points and not on the location of the time points relative to the time origin.
If the joint distribution of $y_{t_1}, y_{t_2}, \ldots, y_{t_s}$ is multivariate Normal, then $y_t$ is said to be a Gaussian process and is fully defined by its mean vector and covariance matrix. In this case, if $E(y_t) = \mu$ is constant and if $\mathrm{Cov}(y_t, y_{t-k})$ is only a function of the lag between the observations, i.e. $k$, then the process can be shown to be strictly stationary.
Although of considerable theoretical interest, strict stationarity is of limited practical use, since it is not readily testable from an empirical realization of a time series. For this reason, other ways of describing time series have to be considered.
In particular, a time series process is said to be Weakly Stationary, or Covariance Stationary, if: (i) the mean, $E(y_t)$, is constant; (ii) the variance, $\mathrm{Var}(y_t)$, is constant; and (iii) the autocovariance function $\mathrm{Cov}(y_t, y_{t-k})$ is only a function of the lag $k$ and is independent of time. In this case the autocovariance function is expressed as,

(1) $\gamma_k = \mathrm{Cov}(y_t, y_{t-k}), \quad k = 0, 1, 2, \ldots$

(2) $\gamma_k = E(y_t y_{t-k}) - E(y_t)E(y_{t-k}),$

and turns out to be an extremely important way to characterize the behavior of a theoretical time series process. The empirical counterpart from a sample realization is equally important for analyzing empirical data.
The Autocovariance Function, $\gamma_k$, is equivalent to the usual covariance operator, and measures the degree of association between the process at time $t$ and at $k$ lags in the past. Hence, the autocovariance function measures the time dependence, or internal memory structure, of the process.
Analogously to regular statistical work, the covariance function is more usefully replaced by the correlation coefficient, which is independent of the scale of measurement. Hence the Autocorrelation Function at lag $k$ is defined as,

(3) $\rho_k = \dfrac{\mathrm{Cov}(y_t, y_{t-k})}{\left[\mathrm{Var}(y_t)\,\mathrm{Var}(y_{t-k})\right]^{1/2}}, \qquad k = 0, 1, 2, \ldots$

where $\rho_0 = 1$ and $|\rho_k| \le 1$ for integer $k$. For a weakly stationary process with $\mathrm{Var}(y_t) = \mathrm{Var}(y_{t-k}) = \gamma_0$, then

(4) $\rho_k = \dfrac{\gamma_k}{\gamma_0}, \qquad k = 0, 1, 2, \ldots$

Also, for all real valued processes $y_t$, the autocovariance function is symmetric, so that $\gamma_k = \gamma_{-k}$ and $\rho_k = \rho_{-k}$. A graph of the autocorrelation function $\rho_k$ on the vertical axis against the lag $k$ on the horizontal axis is known as a Correlogram.
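Since the correlogram is used repeatedly below, a minimal sketch of its empirical counterpart may help; the function name, the simulated white noise input, and the lag choice are illustrative assumptions rather than anything from the text.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations r_k = c_k / c_0, where
    c_k = (1/T) sum_{t=k+1}^{T} (y_t - ybar)(y_{t-k} - ybar)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    c0 = np.dot(d, d) / T
    return np.array([np.dot(d[k:], d[:T - k]) / T / c0
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
y = rng.standard_normal(500)        # white noise, so r_k should be near 0 for k >= 1
print(np.round(sample_acf(y, 5), 3))
```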
The concept of weak stationarity implies that for any realization of a stationary time series process, different pieces or sections of the series will have an identical data generating process, and the numerical values of the sections will differ only because of different random innovations. An important corollary of stationarity is that

(5) $\lim_{k \to \infty} \rho_k = 0,$

so that the degree of dependency, or memory, of the time series process goes to zero as the lag between the observations increases. It should be noted that a series of i.i.d. Cauchy random variables is strongly but not weakly stationary; while a series of random variables with mean of zero, variance of unity, but different fourth moments is weakly but not strongly stationary. For most practical purposes the concept of weak stationarity is the definition that will be used.
Ergodicity:
A stationary process is ergodic if there is a tendency to independence as the lag between observations increases. Hence ergodicity implies that as $k \to \infty$, then $\mathrm{Cov}(y_t, y_{t-k}) \to 0$, so there is limited memory. The ergodic property is especially important for establishing consistency of sample estimates of their corresponding population quantities.
3. Linear Time Series Processes
A first building block of stationary time series processes is the important concept of White Noise. The simplest type of discrete time series process is described as white noise and is generally represented by the random variable $\varepsilon_t$. In all the models, the concept of white noise plays a crucial role in the generation of the process. White noise is also known as an innovation, shock or disturbance term of an underlying time series process. There are various definitions, which require different strengths of assumptions.
Weak Definition of White Noise: The variable $\varepsilon_t$ is defined to have a zero mean, a constant variance $\sigma^2$, and to be serially uncorrelated, so that

$E(\varepsilon_t) = 0,$

$\mathrm{Var}(\varepsilon_t) = \sigma^2,$

and

$\gamma_k = \mathrm{Cov}(\varepsilon_t, \varepsilon_{t-k}) = 0 \quad \text{for } k \ne 0.$

It is important to note that the weak assumption of white noise merely requires the $\varepsilon_t$ process to be zero mean, serially uncorrelated, and with constant unconditional variance; i.e., homoskedastic.
Martingale Definition of White Noise: The variable $\varepsilon_t$ is defined to be unpredictable in its conditional mean, so that

(6) $E(\varepsilon_t \mid \Omega_{t-1}) = E_{t-1}\,\varepsilon_t = 0,$

where $\Omega_{t-1}$ is a sigma field, or information set, available at time $t-1$. The martingale assumption implies a zero conditional mean and also that the innovations, or white noise process, is serially uncorrelated.
Strong Definition of White Noise: The distribution of the variable $\varepsilon_t$ is specified, so that

$\varepsilon_t \sim i.i.d.(0, \sigma^2),$

and sometimes the precise nature of the distribution is given, such as $\varepsilon_t \sim NID(0, \sigma^2)$. The white noise process acts as the "forcing term" in more complicated time series models, such as ARMA, to be considered later. White noise is also commonly referred to as a disturbance or error in regression analysis, and as an innovation, or random shock, in macroeconomics.
The Wold Decomposition
The famous Wold Decomposition first appeared in the book by Herman Wold (1954), "A Study in the Analysis of Stationary Time Series". Wold showed that any stationary time series process can be uniquely represented as the sum of two mutually uncorrelated processes $\xi_t$ and $\eta_t$, where $\xi_t$ is a moving average process of infinite order, and $\eta_t$ is a purely deterministic process, such as a sine wave with fixed period. Then,

(7) $y_t = \xi_t + \eta_t,$
where,

(8) $\xi_t = \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j} = \psi(L)\varepsilon_t,$

and $\psi(L) = \sum_{j=0}^{\infty} \psi_j L^j$ with $\psi_0 = 1$, $E(\varepsilon_t) = 0$, $E(\varepsilon_t^2) = \sigma^2$, $E(\varepsilon_t\varepsilon_s) = 0$ for $s \ne t$, and $\sum_{j=0}^{\infty} \psi_j^2 < \infty$, which is known as the squared summability condition.
If $y_t$ is a sequence of random variables such that $\sup_t E|y_t| < \infty$, and if $\sum_{j=-\infty}^{\infty} |\psi_j| < \infty$, then $y_t$ is said to converge absolutely with probability one. If $\sup_t E|y_t|^2 < \infty$, then $y_t$ is said to converge in mean square to the same limit. Brockwell and Davis (1988, pages 83-84) provide more detail on the probability aspects of convergence of the $y_t$ process.
In most econometric work it is usual to neglect the second term of equation (7) on the grounds that purely deterministic components such as fixed period harmonics are very unlikely to occur in economics and finance applications. However, their presence is sometimes regarded as necessary in applications in electrical engineering, oceanography and cardiac rhythm data, where regular cycles are expected to occur. In all the following material, it is assumed that a stationary time series process can be completely represented by the infinite moving average process. The Wold decomposition is therefore in terms of an Infinite Moving Average Representation, whose coefficients, in recent econometric terminology, particularly in macroeconomics, are named the Impulse Response Weights.
If a time series possesses an infinite order moving average representation, from the Wold Decomposition, then it will also have a corresponding Infinite Autoregressive Representation given by

(9) $\pi(L)\, y_t = \varepsilon_t,$

where $\pi(L) = 1 - \sum_{j=1}^{\infty} \pi_j L^j$. The representations $y_t = \psi(L)\varepsilon_t$ and $\pi(L)y_t = \varepsilon_t$ are mutual inverses since

(10) $\pi(L)\,\psi(L) \equiv 1.$
There are several useful algebraic tricks for getting from one representation to another.
Subsequent examples will illustrate some of the more widely used methods. It is important to
note that the Wold decomposition implies the use of either the infinite order moving average, or
the infinite autoregressive representations to describe any stationary time series process.
However, from a practical perspective a lot of time series analysis attempts to replace these
infinite order representations with finite parameter models, which describe some salient features
of the process. The simplest first order processes are:
Autoregressive of order one, denoted as AR(1),

$y_t = \phi y_{t-1} + \varepsilon_t.$

Moving Average of order one, denoted as MA(1),

$y_t = \varepsilon_t - \theta\varepsilon_{t-1},$

and the combined Autoregressive Moving Average, denoted as ARMA(1,1),

$y_t = \phi y_{t-1} + \varepsilon_t - \theta\varepsilon_{t-1}.$
Higher order processes, i.e. involving further lagged terms are discussed in a later section.
Clearly, autoregressive models are closely related to regressions where lagged dependent variables fulfill the role of explanatory variables. Models with moving average terms are like regressions with unobservables as explanatory variables. The most general linear time series process considered in this section is the ARMA(p,q) model,

(11) $y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1\varepsilon_{t-1} - \cdots - \theta_q\varepsilon_{t-q}.$

It turns out that quite low order ARMA models are capable of representing a wide variety of behavior of time series processes. The ARMA(p,q) process can be expressed in lag operator form as,

(12) $\phi(L)\, y_t = \theta(L)\,\varepsilon_t,$

where,

(13) $\phi(L) = (1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p)$ and $\theta(L) = (1 - \theta_1 L - \theta_2 L^2 - \cdots - \theta_q L^q).$
It is also frequently convenient to use infinite order processes, namely the infinite order moving average representation,

(14) $y_t = \mu + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}, \quad \text{with } \psi_0 = 1,$

and the infinite order autoregressive process,

(15) $y_t = \mu + \sum_{j=1}^{\infty} \pi_j y_{t-j} + \varepsilon_t.$

The infinite order moving average process is thus a linear combination of past uncorrelated innovations $\varepsilon_t$, and the weights $\psi_j$ represent the relative importance of previous shocks, or innovations. Both representations are linear in parameters and in terms of past $y_t$'s. Also, both representations depend on an infinite number of parameters, sometimes referred to as the $\psi$ weights and $\pi$ weights. In both the above representations the process is allowed to have a possibly non-zero mean $\mu$. In many cases $\mu$ is omitted on the understanding that the stationary time series process $y_t$ has been expressed in deviation form.
The Finite Moving Average Process: MA(q)
The model is based on the notion that the infinite moving average representation of a stationary time series $y_t$ can be truncated after $q$ lags. The model was originally proposed by Yule in the 1930's, well before the work of Wold in 1954 on the general decomposition of stationary time series into infinite moving averages and harmonic components. The general moving average process of order $q$, which is denoted by MA(q), is

(16) $y_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \cdots + \theta_q\varepsilon_{t-q} = \sum_{j=0}^{q}\theta_j\varepsilon_{t-j} = \theta(L)\varepsilon_t,$

where $\theta(L) = \sum_{j=0}^{q}\theta_j L^j$, $\theta_0 = 1$, the $\theta_j$ are real numbers and $\sum_{j=0}^{q}\theta_j^2 < \infty$. An important special case is the MA(1) process,

(17) $y_t = \varepsilon_t + \theta\varepsilon_{t-1}.$
The variance, or autocovariance at lag zero, is defined as,

$\gamma_0 = E(y_t^2) = E(\varepsilon_t + \theta\varepsilon_{t-1})^2 = E(\varepsilon_t^2 + 2\theta\varepsilon_t\varepsilon_{t-1} + \theta^2\varepsilon_{t-1}^2) = (1 + \theta^2)\sigma^2.$

Similarly,

$\gamma_1 = E(y_t y_{t-1}) = E[(\varepsilon_t + \theta\varepsilon_{t-1})(\varepsilon_{t-1} + \theta\varepsilon_{t-2})] = E(\varepsilon_t\varepsilon_{t-1} + \theta\varepsilon_{t-1}^2 + \theta\varepsilon_t\varepsilon_{t-2} + \theta^2\varepsilon_{t-1}\varepsilon_{t-2}) = \theta\sigma^2.$

Also,

$\gamma_2 = E(y_t y_{t-2}) = E[(\varepsilon_t + \theta\varepsilon_{t-1})(\varepsilon_{t-2} + \theta\varepsilon_{t-3})] = E(\varepsilon_t\varepsilon_{t-2} + \theta\varepsilon_{t-1}\varepsilon_{t-2} + \theta\varepsilon_t\varepsilon_{t-3} + \theta^2\varepsilon_{t-1}\varepsilon_{t-3}) = 0.$

Similarly, $\gamma_k = 0$ for $k > 2$. The autocorrelation function for the MA(1) process is then

(18) $\rho_0 = 1, \quad \rho_1 = \dfrac{\theta}{(1 + \theta^2)}, \quad \text{and} \quad \rho_k = 0 \ \text{for } k \ge 2.$
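As an informal check of (18), the following sketch simulates an MA(1) and compares the sample first lag autocorrelation with $\theta/(1+\theta^2)$; the parameter value, seed and sample size are illustrative assumptions.

```python
import numpy as np

theta, T = 0.5, 100_000             # illustrative parameter and sample size
rng = np.random.default_rng(1)
eps = rng.standard_normal(T + 1)
y = eps[1:] + theta * eps[:-1]      # y_t = eps_t + theta*eps_{t-1}

d = y - y.mean()
r1 = np.dot(d[1:], d[:-1]) / np.dot(d, d)
print(round(r1, 3), theta / (1 + theta**2))   # both close to 0.4
```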
Frequently the autocorrelation function is graphed against its lag $k$; such a graph is called a correlogram. For the MA(1) process it takes the form given in Figure 1. Consider now the general MA(q) process,

(19) $y_t = \sum_{j=0}^{q}\theta_j\varepsilon_{t-j},$

where $\theta_0 = 1$; and since $E(\varepsilon_t) = 0$, it follows that $E(y_t) = 0$. The variance of the MA(q) process is given by
(20) $\gamma_0 = E(y_t^2) = \sigma^2\sum_{j=0}^{q}\theta_j^2.$

More generally,

$\gamma_k = E(y_t y_{t-k}) = E\left[\left(\sum_{i=0}^{q}\theta_i\varepsilon_{t-i}\right)\left(\sum_{j=0}^{q}\theta_j\varepsilon_{t-k-j}\right)\right]$

$= E[(\theta_0\varepsilon_t + \theta_1\varepsilon_{t-1} + \cdots + \theta_k\varepsilon_{t-k} + \cdots + \theta_q\varepsilon_{t-q}) \times (\theta_0\varepsilon_{t-k} + \theta_1\varepsilon_{t-k-1} + \cdots + \theta_q\varepsilon_{t-k-q})]$

$= \sigma^2\left(\theta_0\theta_k + \theta_1\theta_{k+1} + \cdots + \theta_{q-k}\theta_q\right),$

so that

$\gamma_k = \sigma^2\sum_{i=0}^{q-k}\theta_i\theta_{i+k}.$

Then

(21) $\rho_k = \dfrac{\sum_{i=0}^{q-k}\theta_i\theta_{i+k}}{\sum_{i=0}^{q}\theta_i^2}, \quad k = 0, 1, \ldots, q,$

$\rho_k = 0, \quad k \ge q+1.$
The above illustrates an important property of the finite order MA(q) process; namely that its "memory" only lasts for q periods or lags. Hence the autocorrelation function is zero beyond lag q, so that the process is uncorrelated with its behavior q+1 or more periods ago. For example, in the MA(2) process,
(22) $y_t = \varepsilon_t + \left(\tfrac{1}{4}\right)\varepsilon_{t-1} - \left(\tfrac{1}{8}\right)\varepsilon_{t-2},$

it follows that: $\theta_0 = 1$, $\theta_1 = \tfrac{1}{4}$, and $\theta_2 = -\tfrac{1}{8}$. Then from the more general results,

$\gamma_0 = \left(1 + \tfrac{1}{16} + \tfrac{1}{64}\right)\sigma^2 = \left(\tfrac{69}{64}\right)\sigma^2,$

$\gamma_1 = \sum_{i=0}^{1}\theta_i\theta_{i+1}\,\sigma^2 = (\theta_0\theta_1 + \theta_1\theta_2)\sigma^2 = \left(\tfrac{1}{4} - \tfrac{1}{32}\right)\sigma^2 = \left(\tfrac{7}{32}\right)\sigma^2,$

$\gamma_2 = \theta_2\sigma^2 = -\tfrac{1}{8}\sigma^2,$

$\gamma_k = 0, \quad \text{for } k \ge 3.$

Hence, $\rho_1 = \left(\tfrac{7}{32}\right)\big/\left(\tfrac{69}{64}\right) = \tfrac{14}{69}$, $\rho_2 = -\tfrac{8}{69}$, and $\rho_3 = \rho_4 = \cdots = 0.$
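The MA(2) results above can be verified by simulation; this is only a sketch with an assumed seed and sample size.

```python
import numpy as np

t1, t2, T = 1/4, -1/8, 200_000
rng = np.random.default_rng(2)
e = rng.standard_normal(T + 2)
y = e[2:] + t1 * e[1:-1] + t2 * e[:-2]   # y_t = eps_t + (1/4)eps_{t-1} - (1/8)eps_{t-2}

d = y - y.mean()
c0 = np.dot(d, d)
for k in (1, 2, 3):
    print(k, round(np.dot(d[k:], d[:-k]) / c0, 3))
# approximately 14/69 = 0.203, -8/69 = -0.116, and 0
```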
Invertibility of MA Processes
The general finite order MA(q) moving average process given by (19) will always be stationary provided all the coefficients are finite and $\sum\theta_j^2 < \infty$. However, the issue of invertibility arises with moving average processes and is concerned with the identifiability of the MA process. The moving average process (19) is said to be invertible if all the roots of $\theta(L)$ lie outside the unit circle. In general, $2^q$ different MA(q) processes will be observationally equivalent in the sense that they will possess identical autocorrelation structures. For example, the MA(1) processes,

$y_t = \varepsilon_t + \theta\varepsilon_{t-1} = (1 + \theta L)\varepsilon_t$

and

$y_t = \varepsilon_t + \frac{1}{\theta}\,\varepsilon_{t-1} = \left(1 + \frac{1}{\theta}L\right)\varepsilon_t,$

both have the same autocorrelation structure of $\rho_1 = \theta/(1+\theta^2)$ and $\rho_j = 0$ for $j \ge 2$. Since
$\rho_1\theta^2 - \theta + \rho_1 = 0,$

it is clear that two values of $\theta$ can be obtained which are consistent with this equation; however only one will correspond to a $\theta(L)$ with its root outside the unit circle, and hence to the invertible process. If $\rho_1 = .4$, then

$\theta^2 - 2.5\theta + 1 = 0$

$(\theta - .5)(\theta - 2) = 0,$

so that $\theta = .5$ is the invertible solution, while $\theta = 2$ is the noninvertible solution. Both these MA(1) processes have the same autocovariance structure.
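The two observationally equivalent values of $\theta$ can also be recovered numerically by solving the quadratic $\rho_1\theta^2 - \theta + \rho_1 = 0$; a brief sketch, using np.roots simply as a convenient solver:

```python
import numpy as np

rho1 = 0.4
roots = np.roots([rho1, -1.0, rho1])     # rho1*theta^2 - theta + rho1 = 0
print(roots)                             # 2.0 and 0.5
print([t for t in roots if abs(t) < 1])  # invertible solution: theta = 0.5
```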
More generally, for the ARMA(p,q) model $\phi(L)y_t = \theta(L)\varepsilon_t$, the infinite autoregressive representation $\pi(L)y_t = \varepsilon_t$ has to converge in some sense for invertibility. The usual requirement is that $\sum_{j=0}^{\infty}|\pi_j| < \infty$, and a sufficient condition is for all the roots of $\theta(L)$ to lie outside the unit circle. This restriction is somewhat arbitrary but is very convenient for avoiding model multiplicity. The infinite order moving average representation $y_t = \psi(L)\varepsilon_t$ for stationary $y_t$ then gives rise to the infinite autoregressive representation from the relationship $\pi(L) = \psi(L)^{-1}$, so that $\lim_{j\to\infty}\pi_j = 0$. Hence there is the sensible requirement that the weight on observations a long time in the past declines as the lag length increases.
The Autoregressive Process of Order 1: AR(1)

Probably the most widely used model in time series and dynamic econometric work is the first order autoregression, which is very useful for describing many of the issues in dynamic models. The process, denoted by AR(1), is

(25) $y_t = \phi y_{t-1} + \varepsilon_t.$
By successive substitution, the process can be expressed as

$y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2 y_{t-2},$

and then

$y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \phi^3 y_{t-3},$

$\vdots$

$y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \cdots + \phi^k\varepsilon_{t-k} + \phi^{k+1} y_{t-k-1}.$
For stationarity it is necessary that $E(y_t^2)$ is constant. Now

$E\left(y_t - \sum_{j=0}^{k}\phi^j\varepsilon_{t-j}\right)^2 = \phi^{2(k+1)}\, E\left(y_{t-k-1}^2\right) \to 0$

as $k \to \infty$, in which case the process $y_t$ is said to be convergent in the Mean Square (MS) sense. A sufficient condition for this is that $|\phi| < 1$. Hence $\sum_{j=0}^{\infty}\phi^j\varepsilon_{t-j}$ is mean square convergent and $y_t = \sum_{j=0}^{\infty}\phi^j\varepsilon_{t-j}$ is valid in the mean square sense and with probability one. It is interesting to examine the properties of the process when $|\phi| > 1$ and consequently lies in the non-stationary region. In probability terms, the process does not converge in $L^2$. However, the AR(1) process can still be written as
$y_t = \frac{1}{\phi}\,y_{t+1} - \frac{1}{\phi}\,\varepsilon_{t+1}$

$= -\frac{1}{\phi}\,\varepsilon_{t+1} - \frac{1}{\phi^2}\,\varepsilon_{t+2} - \cdots - \frac{1}{\phi^{k+1}}\,\varepsilon_{t+k+1} + \frac{1}{\phi^{k+1}}\,y_{t+k+1}.$
However, this process does not appear to be sensible, since there does not seem any physical way $y_t$ can be influenced by future random innovations. Hence $\phi$ is restricted to lie in the interval $|\phi| < 1$. From the representation,

$y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \cdots + \phi^k\varepsilon_{t-k} + \phi^{k+1} y_{t-k-1},$

the process can be expressed as,

(26) $y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \cdots + \phi^{t-1}\varepsilon_1 + \phi^t y_0.$

The initial value can be assumed fixed at $y_0 = 0$, or the process can be assumed to have been generated for a very large number of periods, so that the effect of the initial observation is negligible. This shows that the infinite moving average representation, or impulse response weights, are given by,

(27) $y_t = \sum_{j=0}^{\infty}\phi^j\varepsilon_{t-j},$
so that the impact of a shock $j$ periods ago has coefficient $\phi^j$, and for a stationary process, $-1 < \phi < 1$, the impact of the shock is seen to die away at an exponential rate the further back in time is examined. It can also be seen that $E(y_t) = 0$, while the variance of $y_t$ can be found as,

$E(y_t^2) = E\left[\left(\varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \cdots + \phi^k\varepsilon_{t-k} + \cdots\right)^2\right] = \left(1 + \phi^2 + \phi^4 + \cdots + \phi^{2k} + \cdots\right)\sigma^2 = \frac{\sigma^2}{(1-\phi^2)}.$
The autocovariance function can be similarly found as,

$\gamma_k = E(y_t y_{t-k}) = E\{[\varepsilon_t + \phi\varepsilon_{t-1} + \cdots + \phi^k\varepsilon_{t-k} + \phi^{k+1}\varepsilon_{t-k-1} + \cdots] \times [\varepsilon_{t-k} + \phi\varepsilon_{t-k-1} + \phi^2\varepsilon_{t-k-2} + \cdots]\}$

$= \phi^k\left(1 + \phi^2 + \phi^4 + \cdots\right)\sigma^2 = \frac{\phi^k\sigma^2}{(1-\phi^2)}.$
The autocorrelation function is then

(28) $\rho_k = \frac{\gamma_k}{\gamma_0} = \phi^k,$

so that all the main properties of the process, i.e. the infinite order moving average representation weights, $\psi_k$, and the autocorrelations, decay at an exponential (geometric) rate.
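A short simulation can illustrate the geometric decay $\rho_k = \phi^k$; the value $\phi = 0.9$, the seed and the sample size are illustrative assumptions.

```python
import numpy as np

phi, T = 0.9, 100_000
rng = np.random.default_rng(3)
e = rng.standard_normal(T)
y = np.empty(T)
y[0] = e[0]
for t in range(1, T):
    y[t] = phi * y[t - 1] + e[t]    # y_t = phi*y_{t-1} + eps_t

d = y - y.mean()
c0 = np.dot(d, d)
for k in range(1, 6):
    # sample autocorrelation versus theoretical phi^k
    print(k, round(np.dot(d[k:], d[:-k]) / c0, 3), round(phi**k, 3))
```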
At this stage it is intuitively interesting to examine the case of $\phi = 1$; the so called unit root, or random walk, model for $y_t$. Clearly the impulse response weights in equation (27) will not decay, and all past innovations or shocks receive the same weight as the most recent one. It should also be obvious that the variance of the process is undefined when $\phi = 1$, which indicates the need for different types of investigation of this process. Figures 1 through 3 show sample realizations of 2,000 observations generated by AR(1) models with $\phi = 0.3$, $\phi = 0.6$ and $\phi = 0.9$. Clearly, as the value of the autoregressive parameter increases, the series takes on the appearance of having less random structure, and distinct patterns become evident, with one high observation likely to be followed by a further high observation. Similarly, a small observation is more likely to be followed by a further small observation. Figure 4 shows a similar realization from the AR(1) process with $\phi = 1$. This process is now a unit root process, $y_t = y_{t-1} + \varepsilon_t$, and clearly has quite different behavior to the stationary AR(1) model with $-1 < \phi < 1$. The series realization is marked by slow drifts in the mean of the series through different levels and is visually entirely different to the stationary AR model. The unit root process is non-stationary and will be discussed in more detail later.
The AR(p) Process
The autoregressive process of order p is denoted by AR(p) and represents the current value of the process as a linear combination of the last p lagged values of the process. The AR(p) process can be regarded as a p'th order linear difference equation with the addition of a stochastic disturbance, $\varepsilon_t$,

(29) $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t,$

or,

(30) $\phi(L)\, y_t = \varepsilon_t,$

where $\phi(L) = (1 - \phi_1 L - \cdots - \phi_p L^p)$ and $\varepsilon_t$ is white noise. When analyzing and using the AR(p) model, it is very important to know the following:
model, it is very important to know the following:
(i) under what conditions will $y_t$ be stationary?

(ii) what is the autocorrelation function of $y_t$?

(iii) will $y_t$ have a unique moving average representation?
All of the above issues can be answered by considering the solution of the difference equation $\phi(L)y_t = 0$. On taking the auxiliary equation $\phi(m) = 0$ and assuming that $\phi(m)$ has p distinct roots $\xi_1, \xi_2, \ldots, \xi_p$, then

(31) $\phi(m) = \prod_{i=1}^{p}\left(1 - \frac{m}{\xi_i}\right),$

so that the general solution for $y_t$ is given by

(32) $y_t = \sum_{i=1}^{p} c_i\left(\frac{1}{\xi_i}\right)^t,$
where the $c_i$ are unknown constants to be determined from initial boundary conditions. For $y_t$ to be stationary, any solution for $y_t$ must be stable and independent of $t$. Thus it is necessary that $|1/\xi_j| < 1$, and hence $|\xi_j| > 1$, for all $j$. With this condition it then follows that $\lim_{t\to\infty}(1/\xi_j)^t = 0$, which ensures that $y_t$ is stationary. The condition is generally expressed by saying that all the roots of $\phi(L)$ must lie outside the unit circle. The Particular Solution for $y_t$ is given by

(33) $y_t = \phi(L)^{-1}\varepsilon_t = \prod_{i=1}^{p}\left(1 - \frac{L}{\xi_i}\right)^{-1}\varepsilon_t.$
On expanding as partial fractions,

$y_t = \phi(L)^{-1}\varepsilon_t = \sum_{i=1}^{p} a_i\left(1 - \frac{L}{\xi_i}\right)^{-1}\varepsilon_t,$

and since each $|\xi_i| > 1$, it follows that each binomial expansion of $\left(1 - L/\xi_i\right)^{-1}$ will be valid and will converge. On collecting terms,

(34) $y_t = \phi(L)^{-1}\varepsilon_t = \sum_{i=1}^{p} a_i\sum_{j=0}^{\infty}\left(\frac{L}{\xi_i}\right)^j\varepsilon_t = \sum_{i=0}^{\infty}\psi_i L^i\,\varepsilon_t = \sum_{i=0}^{\infty}\psi_i\,\varepsilon_{t-i}.$
The full solution for $y_t$ is then obtained by adding the particular solution to the general solution to obtain,

(35) $y_t = \sum_{i=1}^{p} c_i\left(\frac{1}{\xi_i}\right)^t + \sum_{i=0}^{\infty}\psi_i\,\varepsilon_{t-i},$

and since all the roots of $\phi(L)$ lie outside the unit circle, the p exponential terms in (35) will vanish for large t, to leave the solution as the Infinite Moving Average Representation from the Wold Decomposition, $y_t = \sum_{i=0}^{\infty}\psi_i\,\varepsilon_{t-i}$. In general, the following three conditions are equivalent for the AR(p) process; each condition implies the other two:

(a) $y_t$ is stationary

(b) $y_t$ has a unique infinite order moving average representation

(c) all the roots of $\phi(L)$ lie outside the unit circle
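Condition (c) is easy to check numerically for any candidate AR(p); the sketch below does so for the AR(2) example used later in the chapter, and the helper name is an arbitrary choice.

```python
import numpy as np

def ar_roots(phis):
    """Roots of phi(L) = 1 - phi_1*L - ... - phi_p*L^p (coefficients highest power first)."""
    return np.roots([-p for p in reversed(phis)] + [1.0])

roots = ar_roots([1/4, 3/8])        # y_t = (1/4)y_{t-1} + (3/8)y_{t-2} + eps_t
print(roots)                        # 4/3 and -2
print(np.all(np.abs(roots) > 1))    # True, so the process is stationary
```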
Autocorrelation Function of the AR(p) Model
First, it is natural to impose the condition that

(36) $E(\varepsilon_t y_{t-k}) = 0 \quad \text{for } k \ge 1,$

since a future realization of the random white noise process $\varepsilon_t$ must be uncorrelated with past realizations of the process $y_t$. However,
(37) $E(\varepsilon_t y_t) = \phi_1 E(\varepsilon_t y_{t-1}) + \cdots + \phi_p E(\varepsilon_t y_{t-p}) + E(\varepsilon_t^2),$

so that $E(\varepsilon_t y_t) = \sigma^2$. On multiplying the process successively by $y_{t-1}, y_{t-2}, \ldots, y_{t-p}$ and on taking expectations, the following p equations are obtained,

$\gamma_1 = \phi_1\gamma_0 + \phi_2\gamma_1 + \cdots + \phi_p\gamma_{p-1},$

$\gamma_2 = \phi_1\gamma_1 + \phi_2\gamma_0 + \cdots + \phi_p\gamma_{p-2},$

$\vdots$

$\gamma_p = \phi_1\gamma_{p-1} + \phi_2\gamma_{p-2} + \cdots + \phi_p\gamma_0,$

which are known as the Yule-Walker Equations. On noting that $\gamma_j = \gamma_{-j}$, the general equation can be expressed as

(38) $\phi(L)\,\gamma_k = 0, \quad k = 1, 2, \ldots$

and on dividing by $\gamma_0$:

(39) $\phi(L)\,\rho_k = 0, \quad k = 1, 2, \ldots$
Since $y_t$ is stationary, all the roots of $\phi(L)$ lie outside the unit circle and $\rho_j$ satisfies the same difference equation as $y_t$. If all the $\xi_j$, which are the roots of $\phi(L)$, are real and distinct, then

(40) $\rho_k = \sum_{i=1}^{p} a_i\left(\frac{1}{\xi_i}\right)^k,$

where the $a_i$ are unknown constants to be determined by solving the first p Yule-Walker equations as boundary conditions. In this case $\rho_k$ is simply the sum of geometrically decaying terms; i.e. since $(1/\xi_i)^k \to 0$ as $k \to \infty$, the autocorrelations decay as the lag increases, so that the stationarity condition in equation (5) is satisfied.
One possibility is that a pair of roots $\xi_i$ and $\bar{\xi}_i$ may be complex conjugates, and will jointly contribute a term of the form,

(41) $c_i\, d^{\,j}\sin(2\pi f j + \varphi)$

to the autocorrelation function $\rho_j$. This term will be a damped harmonic, with $f$ as the frequency and $d$ as the damping factor. One further possibility, which is relatively unlikely, is that two roots are the same. This will contribute a term to $\rho_k$ of the form:

(42) $\left(a_1 + a_2 k\right)\left(\frac{1}{\xi_i}\right)^k.$
The theoretical autocorrelation functions of higher order AR(p) models, with a combination of real and complex conjugate roots in their autoregressive polynomial operators, will clearly be of a complicated form, combining geometric decay with damped harmonics.
Infinite Moving Average Representation of the AR(p) Model
For the stationary AR(p) model $\phi(L)y_t = \varepsilon_t$, there exists the unique infinite order moving average representation,

(43) $y_t = \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j} = \psi(L)\varepsilon_t.$

Since $y_t = \phi(L)^{-1}\varepsilon_t = \psi(L)\varepsilon_t$, it follows from equation (10) that $\phi(L)^{-1} = \psi(L)$, and therefore there is the following inverse relationship between the lag polynomials: $\phi(L)\psi(L) = 1$. The most direct way of finding the implied infinite moving average representation weights, given knowledge of the AR parameters, is to solve recursively by writing:

(44) $\left(1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p\right)\left(1 + \psi_1 L + \psi_2 L^2 + \cdots\right) \equiv 1.$
Then on equating powers of L:

$\psi_1 - \phi_1 = 0$ (coefficient of $L$)

$\psi_2 - \phi_1\psi_1 - \phi_2 = 0$ (coefficient of $L^2$)

$\psi_3 - \phi_1\psi_2 - \phi_2\psi_1 - \phi_3 = 0$ (coefficient of $L^3$)

and in general

$\psi_k - \phi_1\psi_{k-1} - \phi_2\psi_{k-2} - \cdots - \phi_p\psi_{k-p} = 0, \quad k \ge p,$

or

(45) $\phi(L)\,\psi_k = 0, \quad k \ge p,$

so that the infinite moving average representation weights $\psi_k$ satisfy the same difference equation as $y_t$ and $\rho_j$. On solving for $\psi_k$ it is possible to obtain

$\psi_k = \sum_{i=1}^{p} b_i\left(\frac{1}{\xi_i}\right)^k,$

where $b_1, b_2, \ldots, b_p$ are to be determined from initial boundary conditions by directly solving for the first p moving average representation coefficients. This is most easily seen from the following examples.
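As a numerical companion to this recursion, a minimal sketch (the coefficients are those of the AR(2) example below, and the function name is an arbitrary choice):

```python
def ar_psi_weights(phis, n):
    """psi_0 = 1 and psi_k = sum_j phi_j * psi_{k-j}, with psi_m = 0 for m < 0."""
    psi = [1.0]
    for k in range(1, n + 1):
        psi.append(sum(p * psi[k - 1 - j]
                       for j, p in enumerate(phis) if k - 1 - j >= 0))
    return psi

print(ar_psi_weights([1/4, 3/8], 5))
# [1.0, 0.25, 0.4375, 0.203125, ...] -- matching psi_1 = 1/4, psi_2 = 7/16 below
```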
Reanalysis of the AR(1) Process:
There are many tricks in the use of lag polynomials which can simplify the derivation of the
properties of time series models and these can be used interchangeably as is convenient. The
previous method for the derivation of the aspects of the AR(1) process can be simplified as the
following indicates. The first order autoregressive process, or AR(1) process, is

(46) $y_t = \phi y_{t-1} + \varepsilon_t.$
Hence $\phi(L) = (1 - \phi L)$, which has a root of $\xi = 1/\phi$. The condition for stationarity is that the root must lie outside the unit circle, that is $|\xi| > 1$, which implies that $|\phi| < 1$. On multiplying through the model for $y_t$ by $y_{t-k}$, taking expectations, and using the fact that $E(\varepsilon_t y_{t-k}) = 0$ for $k \ge 1$, gives

$\gamma_k = \phi\gamma_{k-1}, \quad k \ge 1.$

Multiplying through the AR(1) by $y_t$ and taking expectations gives

$\gamma_0 = \phi\gamma_1 + \sigma^2.$

But $\gamma_1 = \phi\gamma_0$ also, so that $\gamma_0 = \phi^2\gamma_0 + \sigma^2$ and $\gamma_0 = \sigma^2/(1-\phi^2)$. Hence

$\gamma_k = \frac{\phi^k\sigma^2}{(1-\phi^2)}$

and

(47) $\rho_k = \phi^k.$
The infinite moving average representation for the AR(1) process can be directly found as,

$y_t = (1 - \phi L)^{-1}\varepsilon_t = \left(1 + \phi L + \phi^2 L^2 + \cdots\right)\varepsilon_t,$

hence

(48) $y_t = \sum_{j=0}^{\infty}\phi^j L^j\varepsilon_t = \sum_{j=0}^{\infty}\phi^j\varepsilon_{t-j}.$

Alternatively, since $\phi(L)\psi_k = 0$ for $k \ge p$, then

$(1 - \phi L)\,\psi_k = 0, \quad k \ge 1,$

$\psi_k = \phi\psi_{k-1}, \quad k \ge 1,$

and since $\psi_0 = 1$, it follows that $\psi_k = \phi^k$. Hence for the AR(1) process, the infinite moving average representation weights and the autocorrelations both decline at an exponential rate.
The AR(2) Process
The general AR(2) process is given by,

(49) $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t.$

For stationarity the roots of $\phi(L) = (1 - \phi_1 L - \phi_2 L^2)$ must lie outside the unit circle, which can be shown to imply that,

(50) $\phi_1 + \phi_2 < 1; \quad \phi_2 - \phi_1 < 1 \quad \text{and} \quad |\phi_2| < 1.$
The first two Yule-Walker equations are:

$\rho_1 = \phi_1 + \phi_2\rho_1,$

$\rho_2 = \phi_1\rho_1 + \phi_2,$

which give

$\rho_1 = \frac{\phi_1}{(1-\phi_2)} \quad \text{and} \quad \rho_2 = \phi_2 + \frac{\phi_1^2}{(1-\phi_2)},$

and the general Yule-Walker equation is

(51) $\rho_k = \phi_1\rho_{k-1} + \phi_2\rho_{k-2}, \quad k \ge 2.$
An AR(2) Process with Real Roots
As an illustration of the above theory, consider the AR(2) process

(52) $y_t = \left(\tfrac{1}{4}\right)y_{t-1} + \left(\tfrac{3}{8}\right)y_{t-2} + \varepsilon_t,$

$\phi(L) = 1 - \left(\tfrac{1}{4}\right)L - \left(\tfrac{3}{8}\right)L^2 = \left(1 - \tfrac{3}{4}L\right)\left(1 + \tfrac{1}{2}L\right) = 0,$

which has roots of $\tfrac{4}{3}$ and $-2$. Since both roots lie outside the unit circle, it follows that $y_t$ is stationary. From the first Yule-Walker equation

$\rho_1 = \tfrac{1}{4} + \tfrac{3}{8}\rho_1,$

it then follows that $\rho_1 = \tfrac{2}{5}$ and, in general,

(54) $\rho_k = \tfrac{1}{4}\rho_{k-1} + \tfrac{3}{8}\rho_{k-2}, \quad \text{for } k \ge 2.$

Then,

$\rho_k = A\left(\tfrac{3}{4}\right)^k + B\left(-\tfrac{1}{2}\right)^k,$

where A and B are unknown constants. However, $\rho_0 = 1 = A + B$ and $\rho_1 = \tfrac{2}{5} = \tfrac{3}{4}A - \tfrac{1}{2}B$; so that $A = \tfrac{18}{25}$ and $B = \tfrac{7}{25}$. Hence the full solution for the autocorrelation function is given by,

(55) $\rho_k = \tfrac{18}{25}\left(\tfrac{3}{4}\right)^k + \tfrac{7}{25}\left(-\tfrac{1}{2}\right)^k, \quad k = 0, 1, 2, \ldots$
The autocorrelations can now be calculated either directly from the above formula or from the difference equation for $\rho_k$. To find the form of the infinite moving average representation weights:

$\phi(L)\,\psi_k = 0,$

$\psi_k = \phi_1\psi_{k-1} + \phi_2\psi_{k-2}, \quad k \ge 2.$

By direct substitution it is simple to determine the first few MA coefficients;

$y_t = \tfrac{1}{4}\left(\tfrac{1}{4}y_{t-2} + \tfrac{3}{8}y_{t-3} + \varepsilon_{t-1}\right) + \tfrac{3}{8}y_{t-2} + \varepsilon_t = \varepsilon_t + \tfrac{1}{4}\varepsilon_{t-1} + \tfrac{7}{16}y_{t-2} + \tfrac{3}{32}y_{t-3},$

so that $\psi_1 = \tfrac{1}{4}$ and $\psi_0 = 1$, which gives two initial conditions. Since the impulse response weights follow the same difference equation as the autocorrelation coefficients, it follows that

$\psi_k = A\left(\tfrac{3}{4}\right)^k + B\left(-\tfrac{1}{2}\right)^k,$

and hence $1 = A + B$ and $\tfrac{1}{4} = \tfrac{3}{4}A - \tfrac{1}{2}B$, which implies $A = \tfrac{3}{5}$ and $B = \tfrac{2}{5}$. Hence

(56) $\psi_k = \tfrac{3}{5}\left(\tfrac{3}{4}\right)^k + \tfrac{2}{5}\left(-\tfrac{1}{2}\right)^k, \quad k = 0, 1, 2, \ldots$
An alternative, and in this instance more algebraically involved, approach is to expand $\phi(L)^{-1}$ in a series of ascending powers of the lag operator L, to give

$y_t = \phi(L)^{-1}\varepsilon_t = \left(1 - \tfrac{1}{4}L - \tfrac{3}{8}L^2\right)^{-1}\varepsilon_t = \left(1 - \tfrac{3}{4}L\right)^{-1}\left(1 + \tfrac{1}{2}L\right)^{-1}\varepsilon_t.$

Then by the use of partial fractions,

$y_t = \left[\tfrac{3}{5}\left(1 - \tfrac{3}{4}L\right)^{-1} + \tfrac{2}{5}\left(1 + \tfrac{1}{2}L\right)^{-1}\right]\varepsilon_t = \sum_{j=0}^{\infty}\left[\tfrac{3}{5}\left(\tfrac{3}{4}\right)^j + \tfrac{2}{5}\left(-\tfrac{1}{2}\right)^j\right]\varepsilon_{t-j},$

which is identical to equation (56) as before. The first few values of the autocorrelation function and infinite moving average representation weights are:

Lag        0     1     2     3     4     5     6     7
$\rho_j$   1.00  0.40  0.48  0.27  0.25  0.16  0.13  0.09
$\psi_j$   1.00  0.25  0.44  0.20  0.21  0.13  0.11  0.06
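The tabulated values can be reproduced directly from the closed forms (55) and (56); a brief sketch:

```python
# rho_k = (18/25)(3/4)^k + (7/25)(-1/2)^k; psi_k = (3/5)(3/4)^k + (2/5)(-1/2)^k
for k in range(8):
    rho = (18/25) * (3/4)**k + (7/25) * (-1/2)**k
    psi = (3/5) * (3/4)**k + (2/5) * (-1/2)**k
    print(k, round(rho, 3), round(psi, 3))
# k=0: 1.0, 1.0; k=1: 0.4, 0.25; k=2: 0.475, 0.4375; ... matching the table above
```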
The Variance of an AR(2) Process
The variance can be obtained by solving the first two Yule-Walker equations appended with a similar equation obtained by multiplying through the process by $y_t$ and taking expectations. Then

$\gamma_0 = \phi_1\gamma_1 + \phi_2\gamma_2 + \sigma^2,$

$\gamma_1 = \phi_1\gamma_0 + \phi_2\gamma_1,$

$\gamma_2 = \phi_1\gamma_1 + \phi_2\gamma_0.$

Then from the second Yule-Walker equation,

$\gamma_1 = \frac{\phi_1\gamma_0}{1-\phi_2},$

while $\phi_1\gamma_1 = \gamma_0 - \phi_2\gamma_2 - \sigma^2$, and on substituting for $\gamma_2$ from the third equation gives

$\phi_1\gamma_1 = \gamma_0 - \phi_1\phi_2\gamma_1 - \phi_2^2\gamma_0 - \sigma^2,$

hence

$\gamma_1 = \frac{\left(1-\phi_2^2\right)\gamma_0 - \sigma^2}{\phi_1\left(1+\phi_2\right)}.$

Equating the two expressions for $\gamma_1$ then realizes,

$\gamma_0\left(1+\phi_2\right)\left[\left(1-\phi_2\right)^2 - \phi_1^2\right] = \left(1-\phi_2\right)\sigma^2,$

hence

(59) $\gamma_0 = \frac{\left(1-\phi_2\right)\sigma^2}{\left(1+\phi_2\right)\left[\left(1-\phi_2\right)^2 - \phi_1^2\right]}.$

In the previous example $\phi_1 = \tfrac{1}{4}$, $\phi_2 = \tfrac{3}{8}$ and $\gamma_0 = \left(\tfrac{320}{231}\right)\sigma^2 = 1.3853\,\sigma^2$.
Alternatively, the same result can be derived from the infinite moving average representation,

$\gamma_0 = \sigma^2\sum_{j=0}^{\infty}\psi_j^2 = \sigma^2\sum_{j=0}^{\infty}\left[\tfrac{3}{5}\left(\tfrac{3}{4}\right)^j + \tfrac{2}{5}\left(-\tfrac{1}{2}\right)^j\right]^2$

$= \sigma^2\left[\tfrac{9}{25}\sum_{j=0}^{\infty}\left(\tfrac{9}{16}\right)^j + \tfrac{4}{25}\sum_{j=0}^{\infty}\left(\tfrac{1}{4}\right)^j + \tfrac{12}{25}\sum_{j=0}^{\infty}\left(-\tfrac{3}{8}\right)^j\right]$

$= \sigma^2\left[\tfrac{9}{25}\left(\tfrac{16}{7}\right) + \tfrac{4}{25}\left(\tfrac{4}{3}\right) + \tfrac{12}{25}\left(\tfrac{8}{11}\right)\right] = 1.3853\,\sigma^2.$
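A quick numerical confirmation of $\gamma_0 = (320/231)\sigma^2$, truncating the infinite sum of squared $\psi$ weights (with $\sigma^2 = 1$):

```python
# Truncate gamma_0 = sum psi_j^2; the tail is geometrically small.
g0 = sum(((3/5) * (3/4)**j + (2/5) * (-1/2)**j) ** 2 for j in range(200))
print(round(g0, 4), round(320/231, 4))   # both 1.3853
```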
An AR(2) Process with Complex Roots
A further numerical example is an AR(2) process with complex roots,

(60) $y_t = y_{t-1} - \left(\tfrac{1}{2}\right)y_{t-2} + \varepsilon_t,$

where $\phi(L) = 1 - L + \tfrac{1}{2}L^2$ has roots of $1 \pm i$, which are complex conjugates. Recall that the modulus, or absolute value, of a complex number $a + bi$ is given by $|a + bi| = \sqrt{a^2 + b^2}$; and a modulus greater than one implies a root outside the unit circle. For the process under consideration both roots lie outside the unit circle, so the process is stationary. From the Yule-Walker equations:

(61) $\rho_k = \rho_{k-1} - \left(\tfrac{1}{2}\right)\rho_{k-2}, \quad k \ge 2,$

and

$\rho_1 = 1 - \left(\tfrac{1}{2}\right)\rho_1,$

so that $\rho_1 = \tfrac{2}{3}$ and $\rho_0 = 1$. On using de Moivre's theorem,

(62) $e^{i\theta} = \cos(\theta) + i\sin(\theta),$

the inverses of the two complex roots $(1+i)$ and $(1-i)$ can be expressed as $1/\xi_1 = d\,e^{i\theta}$ and $1/\xi_2 = d\,e^{-i\theta}$, where $d = \left(-\phi_2\right)^{1/2}$ is known as the damping factor and indicates the degree of decay of the harmonic cycle in the autocorrelation function. Also on writing,

$\cos(\theta) = \frac{\phi_1}{2\left(-\phi_2\right)^{1/2}}$
and

$\tan(\omega) = \left[\frac{\left(1+d^2\right)}{\left(1-d^2\right)}\right]\tan(\theta),$

the autocorrelation function can then be expressed more conveniently as,

(63) $\rho_k = \frac{d^{\,k}\sin(k\theta + \omega)}{\sin(\omega)}.$

Hence $\rho_k$ takes the form of a damped harmonic. A detailed proof of the above result is given by Box and Jenkins (1970, pp. 58-63). In the above example $\phi_1 = 1$ and $\phi_2 = -\tfrac{1}{2}$, so that $d = 1/2^{1/2}$, $\cos(\theta) = 1/2^{1/2}$ and hence $\theta = \pi/4$. Also, $\tan(\omega) = 3\tan(\theta) = 3$, so that $\omega = \tan^{-1}(3) \approx .3976\pi$, and

(64) $\rho_k = \frac{\left(\tfrac{1}{2}\right)^{k/2}\sin\left(\tfrac{\pi k}{4} + \tan^{-1}(3)\right)}{\sin\left(\tan^{-1}(3)\right)} = 1.0541\left(\tfrac{1}{2}\right)^{k/2}\sin\left(\tfrac{\pi k}{4} + .3976\pi\right).$

The first few values of the autocorrelation function are tabulated below:

Lag (j)    0     1    2    3     4     5     6     7    8    9    10   11    12
$\rho_j$   1.00  .67  .17  -.17  -.25  -.17  -.04  .04  .06  .04  .01  -.01  -.01
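The damped harmonic (64) and the Yule-Walker recursion (61) can be checked against each other numerically; a brief sketch:

```python
import numpy as np

rho = [1.0, 2/3]                          # rho_0 and rho_1 from the Yule-Walker equations
for k in range(2, 13):
    rho.append(rho[k - 1] - 0.5 * rho[k - 2])

omega = np.arctan(3.0)
closed = [(np.sqrt(10) / 3) * 2 ** (-k / 2) * np.sin(k * np.pi / 4 + omega)
          for k in range(13)]
print(np.round(rho, 2))                   # 1.00  0.67  0.17 -0.17 -0.25 ...
print(np.round(closed, 2))                # identical values
```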
The Autoregressive Moving Average, or ARMA Process
The AR(p) model, when appended with an MA(q) error structure, realizes the ARMA(p,q) model, which is one of the most widely used models to represent a stationary time series. The ARMA(p,q) model is defined as,

(65) $y_t - \phi_1 y_{t-1} - \cdots - \phi_p y_{t-p} = \varepsilon_t - \theta_1\varepsilon_{t-1} - \cdots - \theta_q\varepsilon_{t-q},$

or,

$\phi(L)\, y_t = \theta(L)\,\varepsilon_t.$

It is assumed that all the roots of $\phi(L)$ and $\theta(L)$ lie outside the unit circle, so that the stationarity and invertibility conditions are satisfied. It is also assumed that $\phi(L)$ and $\theta(L)$ do not share any common factors. On writing

$\phi(L) = \prod_{i=1}^{p}\left(1 - \frac{L}{\xi_i}\right)$

and

$\theta(L) = \prod_{i=1}^{q}\left(1 - \frac{L}{\eta_i}\right),$

the full solution for $y_t$ is,

(66) $y_t = \sum_{i=1}^{p} a_i\left(\frac{1}{\xi_i}\right)^t + \frac{\prod_{i=1}^{q}\left(1 - L/\eta_i\right)}{\prod_{i=1}^{p}\left(1 - L/\xi_i\right)}\,\varepsilon_t.$
For a stationary process, with all the roots of $\phi(L)$ lying outside the unit circle, the first term will approach zero as t gets large, from precisely the same arguments as for the pure AR(p) process in (35). The second term can be split into partial fractions, and a binomial expansion in ascending powers of L applied, to give the infinite moving average representation,

$y_t = \psi(L)\,\varepsilon_t,$

where

(67) $\psi(L) = \phi(L)^{-1}\theta(L).$

The ARMA(p,q) model is attractive since it uses the fact that a ratio of lag polynomials can approximate the infinite moving average polynomial $\psi(L)$. This is essentially based on Weierstrass's theorem on the approximation of functions through a ratio of polynomials. In most applications the values of p and q are expected to be relatively small, i.e. either 0, 1 or 2.
The properties of ARMA processes are very similar to those of pure autoregressions. The inclusion of moving average terms will just provide some additional flexibility in accounting for low order autocorrelation structure. To derive the autocorrelation function of ARMA processes we proceed as before and note that

$E(\varepsilon_t y_{t-k}) = 0, \quad \text{for } k \ge 1,$

and

$E(\varepsilon_t y_t) = \sigma^2.$

The general cross covariance function between $y_t$ and $\varepsilon_t$ is defined as

$\omega_k = E(y_t\varepsilon_{t-k}), \quad k = \ldots, -1, 0, 1, \ldots$

Obviously, $\omega_0 = \sigma^2$ and $\omega_k = 0$ for $k = -1, -2, \ldots$
On successively multiplying through the model by $y_t$, $y_{t-1}, \ldots, y_{t-k}$ and on taking expectations,

$E(y_t y_t) = \phi_1 E(y_{t-1}y_t) + \cdots + \phi_p E(y_{t-p}y_t) + E(\varepsilon_t y_t) - \theta_1 E(\varepsilon_{t-1}y_t) - \cdots - \theta_q E(\varepsilon_{t-q}y_t),$

$E(y_t y_{t-1}) = \phi_1 E(y_{t-1}y_{t-1}) + \cdots + \phi_p E(y_{t-p}y_{t-1}) + E(\varepsilon_t y_{t-1}) - \theta_1 E(\varepsilon_{t-1}y_{t-1}) - \cdots - \theta_q E(\varepsilon_{t-q}y_{t-1}),$

and so on. This realizes,

$\gamma_0 = \phi_1\gamma_1 + \phi_2\gamma_2 + \cdots + \phi_p\gamma_p + \sigma^2 - \theta_1\omega_1 - \theta_2\omega_2 - \cdots - \theta_q\omega_q, \qquad k = 0,$

$\gamma_1 = \phi_1\gamma_0 + \phi_2\gamma_1 + \cdots + \phi_p\gamma_{p-1} - \theta_1\sigma^2 - \theta_2\omega_1 - \cdots - \theta_q\omega_{q-1}, \qquad k = 1,$

$\vdots$

$\gamma_q = \phi_1\gamma_{q-1} + \phi_2\gamma_{q-2} + \cdots + \phi_p\gamma_{q-p} - \theta_q\sigma^2, \qquad k = q,$

and in general

$\gamma_k = \phi_1\gamma_{k-1} + \phi_2\gamma_{k-2} + \cdots + \phi_p\gamma_{k-p}, \qquad k \ge q+1,$

so that

(69) $\phi(L)\,\gamma_k = \phi(L)\,\rho_k = 0, \qquad k \ge q+1.$

Hence after q initial lags, the autocorrelation function of the ARMA(p,q) model will behave like that of the AR(p) model. This follows from the fact that the autocorrelation coefficients obey a difference equation that is generated purely from the autoregressive structure. If q < p the whole autocorrelation function will behave like that of the AR(p) model. If $q \ge p$ there will be $q + 1 - p$ initial autocorrelations before the typical AR(p) autocorrelation pattern sets in.
The Wold decomposition of the ARMA(p,q) Process
On dividing both sides of equation (65) by the autoregressive operator,

$y_t = \frac{\theta(L)}{\phi(L)}\,\varepsilon_t = \psi(L)\,\varepsilon_t,$

hence,

$\theta(L) \equiv \phi(L)\,\psi(L),$ or

$\left(1 - \theta_1 L - \cdots - \theta_q L^q\right) \equiv \left(1 - \phi_1 L - \cdots - \phi_p L^p\right)\left(1 + \psi_1 L + \psi_2 L^2 + \cdots\right).$

In order to derive the $\psi_j$ coefficients it is simplest to just equate powers of L:

$-\theta_1 = \psi_1 - \phi_1,$

$-\theta_2 = \psi_2 - \phi_1\psi_1 - \phi_2,$

$-\theta_3 = \psi_3 - \phi_1\psi_2 - \phi_2\psi_1 - \phi_3,$

$\vdots$

$-\theta_q = \psi_q - \phi_1\psi_{q-1} - \cdots - \phi_q \quad \text{if } p > q.$

Hence for $k \ge q+1$ the infinite moving average coefficients obey the same difference equation, $\phi(L)\psi_k = 0$, as the autocorrelation coefficients.
Some Examples in Depth: the ARMA(1,1) Process
$y_t - \phi y_{t-1} = \varepsilon_t - \theta\varepsilon_{t-1},$

and

$\omega_0 = E(\varepsilon_t y_t) = \sigma^2,$

$\omega_1 = E(\varepsilon_{t-1} y_t) = E\left[\varepsilon_{t-1}\left(\phi y_{t-1} + \varepsilon_t - \theta\varepsilon_{t-1}\right)\right] = (\phi - \theta)\sigma^2.$

The Yule-Walker equations are:

$\gamma_0 = \phi\gamma_1 + \sigma^2 - \theta(\phi - \theta)\sigma^2,$

$\gamma_1 = \phi\gamma_0 - \theta\sigma^2,$

$\gamma_k = \phi\gamma_{k-1}, \quad k \ge 2.$
On solving these equations,

$\gamma_0 = \frac{\left(1 + \theta^2 - 2\phi\theta\right)}{\left(1 - \phi^2\right)}\,\sigma^2,$

$\gamma_1 = \frac{\left(1 - \phi\theta\right)\left(\phi - \theta\right)}{\left(1 - \phi^2\right)}\,\sigma^2.$

Hence, $\rho_0 = 1$,

$\rho_1 = \frac{\left(1 - \phi\theta\right)\left(\phi - \theta\right)}{\left(1 + \theta^2 - 2\phi\theta\right)},$

and

$\rho_k = \phi\rho_{k-1}, \quad k \ge 2.$
Note that since q = p = 1, there are q + 1 - p = 1 preliminary autocorrelations before the typical
AR(1) pattern sets in. The infinite moving average representation is then given by
(70) $y_t = (1 - \theta L)(1 - \phi L)^{-1}\varepsilon_t = (1 - \theta L)\left(1 + \phi L + \phi^2 L^2 + \cdots\right)\varepsilon_t = \varepsilon_t + (\phi - \theta)\sum_{i=1}^{\infty}\phi^{i-1}\varepsilon_{t-i}.$

Hence, $\psi_1 = (\phi - \theta)$, $\psi_2 = \phi(\phi - \theta)$, ..., $\psi_k = \phi^{k-1}(\phi - \theta)$. Alternatively, it is possible to use the fact that $\psi_k = \phi\psi_{k-1}$ for $k \ge 2$, and then $\psi_j = \phi^{j-1}(\phi - \theta)$.
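A small sketch comparing the closed form $\psi_k = \phi^{k-1}(\phi - \theta)$ with the recursion $\psi_1 = \phi - \theta$, $\psi_k = \phi\psi_{k-1}$; the parameter values are illustrative assumptions.

```python
phi, theta = 0.8, 0.3                     # illustrative values
psi_closed = [1.0] + [phi**(k - 1) * (phi - theta) for k in range(1, 6)]

psi_rec = [1.0]
for k in range(1, 6):
    psi_rec.append(phi * psi_rec[k - 1] - (theta if k == 1 else 0.0))

print(psi_closed)
print(psi_rec)                            # identical sequences
```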
An ARMA(2,1) Process
(72) $y_t = \left(\tfrac{1}{4}\right)y_{t-1} + \left(\tfrac{3}{8}\right)y_{t-2} + \varepsilon_t - \left(\tfrac{1}{3}\right)\varepsilon_{t-1},$

where $\phi(L) = \left(1 - \tfrac{1}{4}L - \tfrac{3}{8}L^2\right)$ has roots of $\tfrac{4}{3}$ and $-2$, while $\theta(L) = \left[1 - \left(\tfrac{1}{3}\right)L\right]$ has a root of 3; so that the process is stationary and invertible. Then

$E(\varepsilon_t y_t) = \sigma^2,$

$E(\varepsilon_{t-1} y_t) = \omega_1 = \left(\tfrac{1}{4} - \tfrac{1}{3}\right)\sigma^2 = -\left(\tfrac{1}{12}\right)\sigma^2,$

$E(\varepsilon_t y_{t-k}) = 0, \quad k \ge 1.$
Hence,

$\gamma_0 = \left(\tfrac{1}{4}\right)\gamma_1 + \left(\tfrac{3}{8}\right)\gamma_2 + \sigma^2 - \left(\tfrac{1}{3}\right)\left(-\tfrac{1}{12}\right)\sigma^2,$

$\gamma_1 = \left(\tfrac{1}{4}\right)\gamma_0 + \left(\tfrac{3}{8}\right)\gamma_1 - \left(\tfrac{1}{3}\right)\sigma^2,$

$\gamma_2 = \left(\tfrac{1}{4}\right)\gamma_1 + \left(\tfrac{3}{8}\right)\gamma_0,$

and

$\gamma_k = \left(\tfrac{1}{4}\right)\gamma_{k-1} + \left(\tfrac{3}{8}\right)\gamma_{k-2}, \quad k \ge 2.$

From the second equation,

$\left(\tfrac{5}{8}\right)\gamma_1 = \left(\tfrac{1}{4}\right)\gamma_0 - \left(\tfrac{1}{3}\right)\sigma^2,$

hence $\gamma_1 = \left(\tfrac{2}{5}\right)\gamma_0 - \left(\tfrac{8}{15}\right)\sigma^2$ and $\gamma_2 = \left(\tfrac{19}{40}\right)\gamma_0 - \left(\tfrac{2}{15}\right)\sigma^2$. Then,

$\gamma_0 = \left(\tfrac{2432}{2079}\right)\sigma^2 \quad \text{and} \quad \gamma_1 = -\left(\tfrac{136}{2079}\right)\sigma^2.$

Hence $\rho_0 = 1$ and $\rho_1 = -0.0559$, which can be used as initial conditions to obtain

$\rho_k = A\left(\tfrac{3}{4}\right)^k + B\left(-\tfrac{1}{2}\right)^k.$

Then $1 = A + B$ and $-.0559 = \left(\tfrac{3}{4}\right)A - \left(\tfrac{1}{2}\right)B$; hence $A = .3553$ and $B = .6447$, and the autocorrelation coefficients are,

(73) $\rho_k = .3553\left(\tfrac{3}{4}\right)^k + .6447\left(-\tfrac{1}{2}\right)^k, \quad k = 0, 1, 2, \ldots$

Lag k      1       2      3      4      5      6      7      8      9
$\rho_k$   -.0559  .3610  .0693  .1527  .0642  .0733  .0424  .0381  .0254
The infinite moving average representation is given by,

$\left(1 - \tfrac{1}{3}L\right) \equiv \left(1 - \tfrac{1}{4}L - \tfrac{3}{8}L^2\right)\left(1 + \psi_1 L + \psi_2 L^2 + \cdots\right).$

Equating coefficients,

$-\tfrac{1}{3} = -\tfrac{1}{4} + \psi_1,$

$0 = \psi_2 - \tfrac{1}{4}\psi_1 - \tfrac{3}{8},$

$0 = \psi_3 - \tfrac{1}{4}\psi_2 - \tfrac{3}{8}\psi_1,$

$0 = \psi_4 - \tfrac{1}{4}\psi_3 - \tfrac{3}{8}\psi_2,$

so that $\psi_1 = -\tfrac{1}{12}$, $\psi_2 = \tfrac{17}{48}$, $\psi_3 = \tfrac{11}{192}$, $\psi_4 = \tfrac{113}{768}$, $\psi_5 = \tfrac{179}{3072}$, $\psi_6 = \tfrac{857}{12288}$.

Lag k      0       1       2      3      4      5      6
$\psi_k$   1.0000  -.0833  .3542  .0573  .1471  .0583  .0697
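The tabulated $\psi$ weights follow from the recursion $\psi_1 = \phi_1 - \theta_1$ and $\psi_k = \phi_1\psi_{k-1} + \phi_2\psi_{k-2}$ for $k \ge 2$; a brief numerical check:

```python
p1, p2, t1 = 1/4, 3/8, 1/3
psi = [1.0, p1 - t1]                      # psi_0 = 1, psi_1 = -1/12
for k in range(2, 7):
    psi.append(p1 * psi[k - 1] + p2 * psi[k - 2])
print([round(v, 4) for v in psi])
# [1.0, -0.0833, 0.3542, 0.0573, 0.1471, 0.0583, 0.0697]
```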
Predictions from ARMA Processes
It is now convenient to consider the problem of finding the best forecast, i.e., minimum MSE
(mean squared error) prediction of t sy + made at time t. In this case the forecast origin is said to
be at time t and the forecast horizon is s. It is usual to consider the minimum mean square error
predictor and to find the linear combination, 1 1 ,tt j t jjy yφ+ −=
= ∑ such that
2
1 11
t
t j t jj
E y yφ+ + −=
−∑
is minimized. The following predictor will be based on all relevant and available information at
45
time t. The prediction of t sy + at time t is expressed as,
(75) ( ), 1 2| , , ,t s t t s t s t t ty E y E y y y y+ + − −= = …
so that Et represents an expectation conditioned on information that is available at time t. This
conditional expectation then defines the minimum MSE predictor with reference to the
information set 1 2, , ,.....t t ty y y− − Then,
(76) $E_t\,y_{t+s} = y_{t,s} \quad \text{for } s = 1, 2, \ldots$

and

(77) $E_t\,y_{t+s} = y_{t+s} \quad \text{for } s = 0, -1, -2, \ldots$

For the innovation process $\varepsilon_t$ it is true that

(78) $E_t\,\varepsilon_{t+s} = 0 \quad \text{for } s = 1, 2, \ldots$

and

(79) $E_t\,\varepsilon_{t+s} = \varepsilon_{t+s} \quad \text{for } s = 0, -1, -2, \ldots$
where $y_{t,s}$ is a prediction of $y_{t+s}$ made at time t; then $E(y_{t+s} - y_{t,s})^2$ is minimized by $y_{t,s} = E_t\,y_{t+s}$. Forecasts can be made directly from the ARMA model, or alternatively from the infinite moving average or autoregressive representations. From the infinite MA representation,

$y_t = \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j}, \quad \psi_0 = 1,$

then

(80) $y_{t+s} = \varepsilon_{t+s} + \psi_1\varepsilon_{t+s-1} + \cdots + \psi_{s-1}\varepsilon_{t+1} + \psi_s\varepsilon_t + \psi_{s+1}\varepsilon_{t-1} + \cdots,$

hence

(81) $E_t\,y_{t+s} = \psi_s\varepsilon_t + \psi_{s+1}\varepsilon_{t-1} + \cdots,$

so that $y_{t,s} = \sum_{j=0}^{\infty}\psi_{j+s}\varepsilon_{t-j}$. The forecast error is $e_{t,s} = y_{t+s} - y_{t,s}$, and is given by,
$e_{t,s} = y_{t+s} - y_{t,s} = \varepsilon_{t+s} + \psi_1\varepsilon_{t+s-1} + \cdots + \psi_{s-1}\varepsilon_{t+1} = \sum_{j=0}^{s-1}\psi_j\varepsilon_{t+s-j},$

and the MSE of the s step ahead prediction is given by

(83) $MSE(y_{t,s}) = \mathrm{Var}(e_{t,s}) = \sigma^2\sum_{j=0}^{s-1}\psi_j^2,$
which is the simplest method for finding the prediction MSE. However, the actual values of the predictions are most easily made recursively, using the above conditional expectations for the $y_t$ and $\varepsilon_t$ random variables. Recursions can be straightforwardly obtained from the ARMA model formulations and are best illustrated by some simple examples.
Example 1: AR(1):

Consider the standard AR(1) model, $y_t = \phi y_{t-1} + \varepsilon_t$. For one step ahead prediction,

$E_t\,y_{t+1} = E_t\left(\phi y_t + \varepsilon_{t+1}\right),$

and hence $y_{t,1} = \phi y_t$. At time t+2 the model is $y_{t+2} = \phi y_{t+1} + \varepsilon_{t+2}$; and on taking expectations through the equation, conditional on information at time t:

$y_{t,2} = \phi\,y_{t,1} = \phi(\phi y_t) = \phi^2 y_t.$
Similarly, since $y_{t+s} = \phi y_{t+s-1} + \varepsilon_{t+s}$, it follows that

(84) $y_{t,s} = \phi\,y_{t,s-1} = \phi^s y_t.$
Since $\psi_j = \phi^j$ for the AR(1) process, it follows that the MSE associated with the predictor will be,

$MSE(y_{t,s}) = \sigma^2\sum_{j=0}^{s-1}\phi^{2j} = \frac{\sigma^2\left(1 - \phi^{2s}\right)}{\left(1 - \phi^2\right)}.$
It is interesting to note that for very long lead times, as $s \to \infty$, then $y_{t,s} \to 0$, which is the unconditional mean of $y_t$, while the MSE becomes,

$\lim_{s\to\infty} MSE(y_{t,s}) = \frac{\sigma^2}{\left(1 - \phi^2\right)},$

which is the unconditional variance of $y_t$. Since the process is stationary, information at time t is of limited value when predicting a long way into the future. Consequently the variance of a forecast a long way into the future is merely the unconditional variance of the process. This is a general property of long term predictions from all stationary processes; for the stationary and invertible class of ARMA models the effect occurs at an exponential rate.
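The convergence of the AR(1) forecast to the unconditional mean, and of its MSE to the unconditional variance, is easy to tabulate; the numerical values of $\phi$, $\sigma^2$ and $y_t$ below are illustrative assumptions.

```python
phi, sigma2, y_t = 0.9, 1.0, 2.0          # illustrative values
for s in (1, 2, 5, 20, 100):
    forecast = phi**s * y_t               # y_{t,s} = phi^s * y_t
    mse = sigma2 * (1 - phi**(2 * s)) / (1 - phi**2)
    print(s, round(forecast, 4), round(mse, 4))
# the forecast tends to the mean of zero and the MSE to 1/(1 - phi^2) = 5.2632
```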
The ARMA(1,1) Model

The model is,

(85) $y_t = \phi y_{t-1} + \varepsilon_t - \theta\varepsilon_{t-1},$

and since $y_{t+1} = \phi y_t + \varepsilon_{t+1} - \theta\varepsilon_t$, it follows that $y_{t,1} = \phi y_t - \theta\varepsilon_t$. While at time t+2,

$y_{t,2} = \phi\,y_{t,1} + E_t\left(\varepsilon_{t+2} - \theta\varepsilon_{t+1}\right) = \phi\,y_{t,1} = \phi^2 y_t - \phi\theta\varepsilon_t.$

In general, for the ARMA(1,1) model and for $s \ge 2$,

$y_{t,s} = \phi\,y_{t,s-1} + E_t\left(\varepsilon_{t+s} - \theta\varepsilon_{t+s-1}\right) = \phi\,y_{t,s-1} = \phi^{s-1}\left(\phi y_t - \theta\varepsilon_t\right).$

The method is then easily applied to higher order ARMA(p,q) processes.
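As a final sketch, the ARMA(1,1) forecast recursion can be coded in a few lines; all numerical values are illustrative assumptions.

```python
phi, theta = 0.8, 0.3                     # illustrative parameters
y_t, eps_t = 1.5, 0.4                     # last observation and innovation (assumed known)
f = phi * y_t - theta * eps_t             # y_{t,1}
forecasts = [f]
for s in range(2, 6):
    f = phi * f                           # y_{t,s} = phi * y_{t,s-1}
    forecasts.append(f)
print([round(v, 4) for v in forecasts])   # equals phi^(s-1)*(phi*y_t - theta*eps_t)
```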