Source: math.unice.fr/~frapetti/CorsoP/Chapitre_4_IMEA_1.pdf

Lecture 4: Estimation of ARIMA models

Florian Pelgrin

University of Lausanne, École des HEC; Department of mathematics (IMEA-Nice)

Sept. 2011 - Dec. 2011

Florian Pelgrin (HEC) Univariate time series Sept. 2011 - Dec. 2011 1 / 50


Road map

1 Introduction: Overview; Framework

2 Sample moments

3 (Nonlinear) Least squares method: Least squares estimation; Nonlinear least squares estimation; Discussion

4 (Generalized) Method of moments: Method of moments and Yule-Walker estimation; Generalized method of moments

5 Maximum likelihood estimation: Overview; Estimation


Introduction Overview

1. Introduction
1.1. Overview

Provide an overview of some estimation methods for linear time series models:

Sample moments estimation;

(Non)linear least squares methods;

Maximum likelihood estimation;

(Generalized) method of moments.

Other methods are available (e.g., Bayesian estimation or Kalman filtering through state-space models)!


Introduction Overview

Broadly speaking, these methods consist in estimating the parameters of interest (autoregressive coefficients, moving average coefficients, and the variance of the innovation process) as follows:

$$\underset{\delta}{\text{Optimize}}\ Q_T(\delta) \quad \text{s.t. (non)linear equality and inequality constraints},$$

where $Q_T$ is the objective function and $\delta$ is the vector of parameters of interest.

Remark: The (non)linear equality and inequality constraints may represent the fundamentalness conditions.


Introduction Overview

Examples

(Non)linear least squares methods aim at minimizing the sum of squared residuals;

Maximum likelihood estimation aims at maximizing the (log-)likelihood function;

The generalized method of moments aims at minimizing the distance between the empirical moments and zero (using a weighting matrix).


Introduction Overview

These methods differ in terms of:

Assumptions;

"Nature" of the model (e.g., MA versus AR);

Finite- and large-sample properties;

Etc.

High-quality software programs (EViews, SAS, S-Plus, Stata, etc.) are available! However, it is important to know the estimation options (default procedure, optimization algorithm, choice of initial conditions) and to keep in mind that these estimation techniques do not perform equally well and do depend on the nature of the model...


Introduction Framework

1.2. Framework

Suppose that $(X_t)$ is correctly specified as an ARIMA(p,d,q) model

$$\Phi(L)\Delta^d X_t = \Theta(L)\varepsilon_t$$

where $(\varepsilon_t)$ is a weak white noise $(0, \sigma_\varepsilon^2)$ and

$$\Phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p \quad\text{with } \phi_p \neq 0$$
$$\Theta(L) = 1 + \theta_1 L + \cdots + \theta_q L^q \quad\text{with } \theta_q \neq 0.$$

Assume that:
1 The model order (p, d, and q) is known;
2 The data have zero mean;
3 The series is weakly stationary (e.g., after d-th differencing).


Sample moments

2. Sample moments

Let $(X_1, \cdots, X_T)$ represent a sample of size $T$ from a covariance-stationary process with

$$E[X_t] = m_X \quad\text{for all } t$$
$$\text{Cov}[X_t, X_{t-h}] = \gamma_X(h) \quad\text{for all } t.$$


Sample moments

If nothing is known about the distribution of the time series, (non-parametric) estimation can be provided by:

Mean

$$\hat m_X \equiv \bar X_T = \frac{1}{T}\sum_{t=1}^{T} X_t$$

Autocovariance

$$\hat\gamma_X(h) = \frac{1}{T}\sum_{t=1}^{T-|h|}\left(X_t - \bar X_T\right)\left(X_{t+|h|} - \bar X_T\right)$$

or

$$\hat\gamma_X(h) = \frac{1}{T-|h|}\sum_{t=1}^{T-|h|}\left(X_t - \bar X_T\right)\left(X_{t+|h|} - \bar X_T\right).$$

Autocorrelation

$$\hat\rho_X(h) = \frac{\hat\gamma_X(h)}{\hat\gamma_X(0)}.$$
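These estimators are straightforward to compute; a minimal numpy sketch (not from the slides, and the function names are mine):

```python
import numpy as np

def sample_mean(x):
    """Sample mean: (1/T) * sum_t X_t."""
    return float(np.mean(x))

def sample_autocov(x, h, biased=True):
    """Sample autocovariance gamma_hat_X(h); divisor T (biased) or T - |h|."""
    x = np.asarray(x, dtype=float)
    T, h = len(x), abs(h)
    xbar = x.mean()
    s = np.sum((x[:T - h] - xbar) * (x[h:] - xbar))
    return s / T if biased else s / (T - h)

def sample_autocorr(x, h):
    """Sample autocorrelation rho_hat_X(h) = gamma_hat(h) / gamma_hat(0)."""
    return sample_autocov(x, h) / sample_autocov(x, 0)

# Gaussian white noise: rho_hat(h) should be close to 0 for h != 0
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
```

The `biased` (divisor $T$) version is the usual default since it guarantees a positive semi-definite autocovariance sequence.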


(Nonlinear) Least squares method Least squares estimation

3. (Nonlinear) Least squares method
3.1. Least squares estimation

Consider the linear regression model ($t = 1,\cdots,T$):

$$y_t = x_t' b_0 + \varepsilon_t$$

where $(y_t, x_t')$ are i.i.d. (H1), the regressors are stochastic (H2), $x_t'$ is the $t$-th row of the $X$ matrix, and the following assumptions hold:

1 Exogeneity (H3): $E[\varepsilon_t \mid x_t] = 0$ for all $t$;
2 Spherical error terms (H4): $V[\varepsilon_t \mid x_t] = \sigma_0^2$ for all $t$, and $\text{Cov}[\varepsilon_t, \varepsilon_{t'} \mid x_t, x_{t'}] = 0$ for $t \neq t'$;
3 Rank condition (H5): $\text{rank}\left[E(x_t x_t')\right] = k$.


(Nonlinear) Least squares method Least squares estimation

Definition

The ordinary least squares estimator of $b_0$, $\hat b_{ols}$, defined from the following minimization program

$$\hat b_{ols} = \arg\min_{b_0} \sum_{t=1}^{T} \varepsilon_t^2 = \arg\min_{b_0} \sum_{t=1}^{T} \left(y_t - x_t' b_0\right)^2,$$

is given by

$$\hat b_{ols} = \left(\sum_{t=1}^{T} x_t x_t'\right)^{-1}\left(\sum_{t=1}^{T} x_t y_t\right).$$


(Nonlinear) Least squares method Least squares estimation

Proposition

Under H1-H5, the ordinary least squares estimator of $b_0$ is weakly consistent,

$$\hat b_{ols} \xrightarrow[T\to\infty]{p} b_0,$$

and is asymptotically normally distributed,

$$\sqrt{T}\left(\hat b_{ols} - b_0\right) \xrightarrow{d} N\left(0_{k\times 1},\ \sigma_0^2\, E\left[x_t x_t'\right]^{-1}\right).$$


(Nonlinear) Least squares method Least squares estimation

Example : AR(1) estimation

Let $(X_t)$ be a covariance-stationary process defined by the fundamental representation ($|\phi| < 1$):

$$X_t = \phi X_{t-1} + \varepsilon_t$$

where $(\varepsilon_t)$ is the innovation process of $(X_t)$.

The ordinary least squares estimator of $\phi$ is defined to be:

$$\hat\phi_{ols} = \left(\sum_{t=2}^{T} x_{t-1}^2\right)^{-1}\left(\sum_{t=2}^{T} x_{t-1} x_t\right).$$

Moreover,

$$\sqrt{T}\left(\hat\phi_{ols} - \phi\right) \xrightarrow{d} N\left(0,\ 1-\phi^2\right).$$
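A quick simulation sketch of this estimator (assumptions: numpy, Gaussian innovations; `ar1_ols` is an illustrative helper, not from the slides):

```python
import numpy as np

def ar1_ols(x):
    """phi_hat = (sum_{t=2}^T x_{t-1}^2)^{-1} * (sum_{t=2}^T x_{t-1} x_t)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x[:-1] * x[1:]) / np.sum(x[:-1] ** 2))

# simulate an AR(1) with phi = 0.6
rng = np.random.default_rng(1)
phi_true, T = 0.6, 5_000
x = np.zeros(T)
eps = rng.standard_normal(T)
for t in range(1, T):
    x[t] = phi_true * x[t - 1] + eps[t]

phi_hat = ar1_ols(x)
# asymptotic standard error implied by sqrt(T)(phi_hat - phi) -> N(0, 1 - phi^2)
se_hat = float(np.sqrt((1 - phi_hat ** 2) / T))
```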


(Nonlinear) Least squares method Nonlinear least squares estimation

3.2. Nonlinear least squares estimation

Definition

Consider the nonlinear regression model defined by ($t = 1,\cdots,T$):

$$y_t = h(x_t'; b_0) + \varepsilon_t$$

where $h$ is a nonlinear function (w.r.t. $b_0$). The nonlinear least squares estimator of $b_0$, $\hat b_{nls}$, satisfies the following minimization program:

$$\hat b_{nls} = \arg\min_{b_0} \sum_{t=1}^{T} \varepsilon_t^2 = \arg\min_{b_0} \sum_{t=1}^{T} \left(y_t - h(x_t'; b_0)\right)^2.$$


(Nonlinear) Least squares method Nonlinear least squares estimation

Example : MA(1) estimation

Let $(X_t)$ be a covariance-stationary process defined by the fundamental representation ($|\theta| < 1$):

$$X_t = \varepsilon_t + \theta\varepsilon_{t-1}$$

where $(\varepsilon_t)$ is the innovation process of $(X_t)$.

The ordinary least squares estimator of $\theta$, which is defined by

$$\hat\theta_{ols} = \arg\min_{\theta} \sum_{t=2}^{T} \left(x_t - \theta\varepsilon_{t-1}\right)^2,$$

is not feasible (since $\varepsilon_{t-1}$ is not observable!).


(Nonlinear) Least squares method Nonlinear least squares estimation

Example : MA(1) estimation (cont’d)

Conditionally on $\varepsilon_0$,

$$x_1 = \varepsilon_1 + \theta\varepsilon_0 \;\Leftrightarrow\; \varepsilon_1 = x_1 - \theta\varepsilon_0$$
$$x_2 = \varepsilon_2 + \theta\varepsilon_1 \;\Leftrightarrow\; \varepsilon_2 = x_2 - \theta\varepsilon_1 = x_2 - \theta(x_1 - \theta\varepsilon_0)$$
$$\vdots$$
$$x_{t-1} = \varepsilon_{t-1} + \theta\varepsilon_{t-2} \;\Leftrightarrow\; \varepsilon_{t-1} = \sum_{k=0}^{t-2}(-\theta)^k x_{t-1-k} + (-\theta)^{t-1}\varepsilon_0.$$

The (conditional) nonlinear least squares estimator of $\theta$ is defined by (assuming that $\varepsilon_0$ is negligible)

$$\hat\theta_{nls} = \arg\min_{\theta} \sum_{t=2}^{T} \left(x_t - \theta\sum_{k=0}^{t-2}(-\theta)^k x_{t-1-k}\right)^2$$

and

$$\sqrt{T}\left(\hat\theta_{nls} - \theta\right) \xrightarrow{d} N\left(0,\ 1-\theta^2\right).$$
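The conditional sum of squares can be evaluated with the equivalent recursion $\hat\varepsilon_t = x_t - \theta\hat\varepsilon_{t-1}$, $\hat\varepsilon_0 = 0$; a grid-search sketch of the resulting estimator (helper names are mine, not from the slides):

```python
import numpy as np

def ma1_conditional_ssr(theta, x):
    """Conditional SSR with eps_0 = 0: eps_t = x_t - theta * eps_{t-1}."""
    e, ssr = 0.0, 0.0
    for xt in x:
        e = xt - theta * e
        ssr += e * e
    return ssr

def ma1_cnls(x, grid=None):
    """Grid-search minimizer of the conditional SSR over the invertible region."""
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 397)
    ssrs = [ma1_conditional_ssr(th, x) for th in grid]
    return float(grid[int(np.argmin(ssrs))])

# simulate an MA(1) with theta = 0.5
rng = np.random.default_rng(2)
theta_true, T = 0.5, 4_000
eps = rng.standard_normal(T + 1)
x = eps[1:] + theta_true * eps[:-1]
theta_hat = ma1_cnls(x)
```

In practice one would use a proper numerical optimizer instead of a grid; the grid keeps the sketch dependency-free.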


(Nonlinear) Least squares method Nonlinear least squares estimation

Example : Least squares estimation using backcasting procedure

Consider the fundamental representation of an MA(q):

$$X_t = u_t, \qquad u_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \cdots + \theta_q\varepsilon_{t-q}.$$

Given some initial values $\delta^{(0)} = \left(\theta_1^{(0)},\cdots,\theta_q^{(0)}\right)'$:

Compute the unconditional residuals $\hat u_t = X_t$!

Backcast values of $\varepsilon_t$ for $t = -(q-1),\cdots,T$ using the backward recursion

$$\hat\varepsilon_t = \hat u_t - \theta_1^{(0)}\hat\varepsilon_{t+1} - \cdots - \theta_q^{(0)}\hat\varepsilon_{t+q}$$

where the $q$ values of $\varepsilon_t$ beyond the estimation sample are set to zero, $\hat\varepsilon_{T+1} = \cdots = \hat\varepsilon_{T+q} = 0$, and $\hat u_t = 0$ for $t < 1$.


(Nonlinear) Least squares method Nonlinear least squares estimation

Example : Least squares estimation using backcasting procedure(cont’d)

Estimate the values of $\varepsilon_t$ using a forward recursion

$$\hat\varepsilon_t = \hat u_t - \theta_1^{(0)}\hat\varepsilon_{t-1} - \cdots - \theta_q^{(0)}\hat\varepsilon_{t-q}.$$

Minimize the sum of squared residuals using the fitted values $\hat\varepsilon_t$,

$$\sum_{t=q+1}^{T}\left(X_t - \hat\varepsilon_t - \theta_1\hat\varepsilon_{t-1} - \cdots - \theta_q\hat\varepsilon_{t-q}\right)^2,$$

and find $\delta^{(1)} = \left(\theta_1^{(1)},\cdots,\theta_q^{(1)}\right)'$.

Repeat the backcast step, forward recursion, and minimization procedures until the estimates of the moving average part converge.

Remark: The backcast step can be turned off ($\hat\varepsilon_{-(q-1)} = \cdots = \hat\varepsilon_0 = 0$), in which case only the forward recursion is used.
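For q = 1 the loop above can be sketched as follows. This is a simplified variant, not the exact slide procedure: the minimization step is replaced by the closed-form regression of $x_t$ on the fitted lagged residuals, and all names are illustrative:

```python
import numpy as np

def ma1_backcast_fit(x, theta0=0.0, n_iter=100, tol=1e-8):
    """Iterate backcast -> forward recursion -> LS update for an MA(1)."""
    x = np.asarray(x, dtype=float)
    theta = theta0
    for _ in range(n_iter):
        u = x                          # unconditional residuals u_t = X_t
        # backward recursion (eps_{T+1} = 0, u_t = 0 for t < 1) to backcast eps_0
        eb = 0.0
        for ut in u[::-1]:
            eb = ut - theta * eb       # after the loop, eb = backcast eps_1
        eps0 = -theta * eb             # eps_0 = u_0 - theta * eps_1 with u_0 = 0
        # forward recursion for the fitted residuals
        eps = np.empty(len(u))
        prev = eps0
        for t in range(len(u)):
            eps[t] = u[t] - theta * prev
            prev = eps[t]
        # LS update: regress x_t on the fitted lagged residual
        new_theta = float(np.clip(np.sum(eps[:-1] * x[1:]) / np.sum(eps[:-1] ** 2),
                                  -0.99, 0.99))
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return theta

rng = np.random.default_rng(7)
theta_true, T = 0.5, 4_000
eps_sim = rng.standard_normal(T + 1)
x = eps_sim[1:] + theta_true * eps_sim[:-1]
theta_hat = ma1_backcast_fit(x)
```

Starting from $\theta^{(0)} = 0$ the first update is just $\hat\rho_X(1)$, and subsequent iterations refine it toward the least squares estimate.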


(Nonlinear) Least squares method Discussion

3.3. Discussion

(Linear and nonlinear) least squares estimation is often used in practice. These methods lead to simple algorithms (e.g., the backcasting procedure for ARMA models) and can often be used to initialize more "sophisticated" estimation methods.

These methods are opposed to exact methods (see maximum likelihood estimation).

Nonlinear estimation requires numerical optimization algorithms; issues include:

Initial conditions;

Number of iterations and stopping criteria;

Local versus global convergence;

Fundamental representation and constraints.


(Generalized) Method of moments Methods of moments and Yule-Walker estimation

4. (Generalized) Method of moments
4.1. Method of moments and Yule-Walker estimation

Definition

Suppose there is a set of $k$ conditions

$$S_T - g(\delta) = 0_{k\times 1}$$

where $S_T \in \mathbb{R}^k$ denotes a vector of theoretical moments, $\delta \in \mathbb{R}^k$ is a vector of parameters, and $g : \mathbb{R}^k \to \mathbb{R}^k$ defines a (bijective) mapping between $S_T$ and $\delta$. The method of moments estimator of $\delta$, $\hat\delta_{mm}$, is defined to be the value of $\delta$ such that

$$\hat S_T - g\left(\hat\delta_{mm}\right) = 0_{k\times 1}$$

where $\hat S_T$ is the estimate (empirical counterpart) of $S_T$.


(Generalized) Method of moments Methods of moments and Yule-Walker estimation

Example 1 : MA(1) estimation

Let $(X_t)$ be a covariance-stationary process defined by the fundamental representation ($|\theta| < 1$):

$$X_t = \varepsilon_t + \theta\varepsilon_{t-1}$$

where $(\varepsilon_t)$ is the innovation process of $(X_t)$.

One has

$$S_T - g(\delta) = \begin{pmatrix} \gamma_X(0) - (1+\theta^2)\sigma_\varepsilon^2 \\ \gamma_X(1) - \theta\sigma_\varepsilon^2 \end{pmatrix} = 0_{2\times 1}$$

where $\delta = (\theta, \sigma_\varepsilon^2)'$ and $S_T = (\gamma_X(0), \gamma_X(1))' = \left(E[X_t^2],\ E[X_t X_{t-1}]\right)'$.


(Generalized) Method of moments Methods of moments and Yule-Walker estimation

The method of moments estimator of $\delta$, $\hat\delta_{mm}$, solves

$$\hat S_T - g(\hat\delta_{mm}) = \begin{pmatrix} \hat\gamma_X(0) - (1+\hat\theta_{mm}^2)\hat\sigma_{\varepsilon,mm}^2 \\ \hat\gamma_X(1) - \hat\theta_{mm}\hat\sigma_{\varepsilon,mm}^2 \end{pmatrix} = 0_{2\times 1}.$$

Therefore (using the constraint $|\theta| < 1$)

$$\hat\theta_{mm} = \frac{1 - \sqrt{1 - 4\hat\rho_X^2(1)}}{2\hat\rho_X(1)}$$

and

$$\hat\sigma_{\varepsilon,mm}^2 = \frac{2\hat\gamma_X(0)\hat\rho_X^2(1)}{1 - \sqrt{1 - 4\hat\rho_X^2(1)}} = \frac{\hat\gamma_X(0)}{1 + \hat\theta_{mm}^2}.$$
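A sketch of these closed-form estimates, recovering the variance through the equivalent expression $\hat\sigma^2 = \hat\gamma_X(0)/(1+\hat\theta^2)$ (helper names are mine):

```python
import numpy as np

def ma1_method_of_moments(x):
    """MA(1) method of moments: invertible root from rho_hat(1)."""
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    T = len(x)
    g0 = np.sum((x - xbar) ** 2) / T
    g1 = np.sum((x[:-1] - xbar) * (x[1:] - xbar)) / T
    rho1 = g1 / g0
    if abs(rho1) >= 0.5:
        raise ValueError("no real solution: an MA(1) requires |rho(1)| < 1/2")
    theta = (1.0 - np.sqrt(1.0 - 4.0 * rho1 ** 2)) / (2.0 * rho1)
    sigma2 = g0 / (1.0 + theta ** 2)
    return float(theta), float(sigma2)

rng = np.random.default_rng(3)
theta_true, T = 0.5, 10_000
e = rng.standard_normal(T + 1)
x = e[1:] + theta_true * e[:-1]
theta_hat, sigma2_hat = ma1_method_of_moments(x)
```

The `ValueError` guard reflects the identification constraint: an MA(1) satisfies $|\rho_X(1)| \le 1/2$, so larger sample autocorrelations admit no real solution.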


(Generalized) Method of moments Methods of moments and Yule-Walker estimation

Example 2: Yule-Walker estimation

Let $(X_t)$ be a causal AR(p). The Yule-Walker equations write

$$E\left[X_{t-k}\left(X_t - \sum_{j=1}^{p}\phi_j X_{t-j}\right)\right] = 0 \quad\text{for all } k = 0,\cdots,p$$

or

$$S_T - g(\delta) = 0_{(p+1)\times 1} \;\Leftrightarrow\; \begin{cases} \gamma_X(0) - \phi'\gamma_p = \sigma_\varepsilon^2 \\ \gamma_p - \Gamma_p\phi = 0_{p\times 1} \end{cases}$$

where $\gamma_p = (\gamma_X(1),\cdots,\gamma_X(p))' = \left(E[X_tX_{t-1}],\cdots,E[X_tX_{t-p}]\right)'$, $S_T = (\gamma_X(0), \gamma_p')'$, $\phi = (\phi_1,\cdots,\phi_p)'$, and $\delta = (\phi', \sigma_\varepsilon^2)'$.


(Generalized) Method of moments Methods of moments and Yule-Walker estimation

The method of moments estimator of $(\phi', \sigma_\varepsilon^2)'$ solves

$$\hat\phi_{mm} = \hat\Gamma_p^{-1}\hat\gamma_p$$

and

$$\hat\sigma_{\varepsilon,mm}^2 = \hat\gamma_X(0) - \hat\phi_{mm}'\hat\gamma_p.$$
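A minimal numpy sketch of Yule-Walker estimation, building $\hat\Gamma_p$ as a Toeplitz matrix of sample autocovariances (an illustrative helper, not from the slides):

```python
import numpy as np

def yule_walker(x, p):
    """Solve Gamma_p phi = gamma_p; sigma2 = gamma(0) - phi' gamma_p."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    T = len(x)
    gamma = np.array([np.sum(x[:T - h] * x[h:]) / T for h in range(p + 1)])
    # Toeplitz matrix of autocovariances: Gamma[i, j] = gamma(|i - j|)
    Gamma = np.array([[gamma[abs(i - j)] for j in range(p)] for i in range(p)])
    phi = np.linalg.solve(Gamma, gamma[1:])
    sigma2 = float(gamma[0] - phi @ gamma[1:])
    return phi, sigma2

# simulate a causal AR(2)
rng = np.random.default_rng(4)
phi_true = np.array([0.5, -0.3])
T = 8_000
x = np.zeros(T)
eps = rng.standard_normal(T)
for t in range(2, T):
    x[t] = phi_true[0] * x[t - 1] + phi_true[1] * x[t - 2] + eps[t]

phi_hat, sigma2_hat = yule_walker(x, 2)
```

Using the biased (divisor $T$) autocovariances guarantees that $\hat\Gamma_p$ is positive definite, so the linear system is always solvable.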

Question: Difference(s) with the ordinary least squares estimation of an AR(p)?


(Generalized) Method of moments Methods of moments and Yule-Walker estimation

Discussion

It is necessary to use theoretical moments that can be well estimated from the data (given the sample size).

Method of moments estimation depends on the mapping $g$, and especially on the "smoothness" of its inverse $g^{-1}$ (one needs to avoid an inverse map with a "large derivative").

Summarizing the data through the sample moments incurs a loss of information; the main exception is when the sample moments are sufficient (in a statistical sense).

Instead of reducing a sample of size $T$ to a "sample" of empirical moments of size $k$, one may reduce the sample to more "empirical moments" than there are parameters... the so-called generalized method of moments.


(Generalized) Method of moments Generalized method of moments

4.2. Generalized method of moments

Intuition:

Consider the linear regression model in an i.i.d. context,

$$y_t = x_t' b_0 + \varepsilon_t.$$

The exogeneity assumption $E[\varepsilon_t \mid x_t] = 0$ implies the $k$ moment conditions (for $t = 1,\cdots,T$)

$$E\left[x_t(y_t - x_t'b_0)\right] = 0_{k\times 1}.$$

Using the empirical counterpart (analogy principle),

$$\frac{1}{T}\sum_{t=1}^{T} x_t(y_t - x_t'b_0) = 0_{k\times 1}.$$

This is a system of $k$ equations with $k$ unknowns (just-identified):

$$\hat b_{ols} = \left(\sum_{t=1}^{T} x_t x_t'\right)^{-1}\left(\sum_{t=1}^{T} x_t y_t\right).$$


(Generalized) Method of moments Generalized method of moments

More generally, suppose that there exists a set of instruments $z_t \in \mathbb{R}^p$ (with $p \geq k$), the instruments being (i) uncorrelated with the error terms and (ii) correlated with the explanatory variables, such that

$$E\left[z_t(y_t - x_t'b_0)\right] = 0_{p\times 1} \;\Leftrightarrow\; E\left[h(z_t; \delta_0)\right] = 0_{p\times 1}.$$

Using the empirical counterpart (analogy principle),

$$\frac{1}{T}\sum_{t=1}^{T} z_t(y_t - x_t'b_0) = 0_{p\times 1} \;\Leftrightarrow\; \frac{1}{T}\sum_{t=1}^{T} h(z_t; \delta_0) = 0_{p\times 1}.$$

This system is overidentified for $p > k$. How can we find an estimate of $b_0$?


(Generalized) Method of moments Generalized method of moments

The answer is provided by the generalized method of moments, i.e., an estimation method based on estimating equations that imposes the nullity of the expectation of a vector function of the observations and the parameters of interest.

The basic idea is to choose a value for $\delta$ such that the empirical counterpart is as close as possible to zero.

"As close as possible to zero" means that one minimizes the distance between $\frac{1}{T}\sum_{t=1}^{T} z_t(y_t - x_t'b_0)$ and $0_{p\times 1}$ using a weighting matrix (say, $W_T$).

Two issues:
1 Choice of instruments;
2 Choice of the (optimal) weighting matrix.


(Generalized) Method of moments Generalized method of moments

Definition

Consider the set of moment conditions

$$E\left[h(z_t, \delta_0)\right] = 0_{p\times 1}$$

where $\delta_0 \in \mathbb{R}^k$, $p \geq k$, and $h$ is a $p$-dimensional (non)linear function of the parameters of interest that depends on the (i.i.d.) observations. The generalized method of moments estimator of $\delta_0$ associated with $W_T$ (a symmetric positive definite matrix) is a solution to the problem

$$\min_{\delta}\left[\frac{1}{T}\sum_{t=1}^{T} h(z_t;\delta)\right]' W_T \left[\frac{1}{T}\sum_{t=1}^{T} h(z_t;\delta)\right].$$


(Generalized) Method of moments Generalized method of moments

Definition

The generalized method of moments estimator of $\delta_0$ solves the first-order conditions

$$\left[\frac{1}{T}\sum_{t=1}^{T}\frac{\partial h(z_t;\hat\delta_{gmm})}{\partial\delta}\right]' W_T \left[\frac{1}{T}\sum_{t=1}^{T} h(z_t;\hat\delta_{gmm})\right] = 0_{k\times 1}.$$

This estimator is consistent and asymptotically normally distributed.

Remark: The first-order conditions correspond to $k$ linear combinations of the moment conditions.


(Generalized) Method of moments Generalized method of moments

This estimator depends on $W_T$. There exists a best GMM estimator, i.e., an optimal weighting matrix.

The optimal weighting matrix is given by (i.i.d. context)

$$W_T^* = V^{-1}\left[h(z_t,\delta_0)\right] \equiv E\left[h(z_t,\delta_0)h(z_t,\delta_0)'\right]^{-1}$$

and the sample analog is

$$\left[\frac{1}{T}\sum_{t=1}^{T} h(z_t,\delta_0)h(z_t,\delta_0)'\right]^{-1}.$$

It does depend on $\delta_0$!

How can we proceed in practice?

Two-step GMM or iterated GMM methods;
One-step methods (continuously updating estimator, exponential tilting estimator, empirical likelihood estimator, etc.).


(Generalized) Method of moments Generalized method of moments

Example : AR(1) estimation

Let $(X_t)$ be a covariance-stationary process defined by the fundamental representation ($|\phi| < 1$):

$$X_t = \phi X_{t-1} + \varepsilon_t.$$

Since $(\varepsilon_t)$ is the innovation process of $(X_t)$, one has the following moment conditions:

$$E[x_{t-k}\varepsilon_t] = 0 \quad\text{for } k > 0.$$

⇒ Past values $x_{t-1},\cdots,x_{t-h}$ can be used as instruments.

Let $z_t$ denote the set of instruments:

$$z_t = (x_{t-1},\cdots,x_{t-h})'.$$


(Generalized) Method of moments Generalized method of moments

The moment conditions write

$$E\left[h(z_t,\delta_0)\right] = 0_{h\times 1}$$

where

$$h(z_t,\delta_0) = z_t(x_t - \phi_1 x_{t-1}).$$

The empirical counterpart is defined to be

$$\frac{1}{T}\sum_{t=h+1}^{T}\begin{pmatrix} x_{t-1} \\ \vdots \\ x_{t-h} \end{pmatrix}(x_t - \phi_1 x_{t-1}) = 0_{h\times 1}.$$


(Generalized) Method of moments Generalized method of moments

The 2S-GMM estimator of $\phi$ is a solution to the problem

$$\left[\sum_{t=h+1}^{T}\frac{\partial h(z_t;\hat\delta_{gmm})}{\partial\delta}\right]' \hat\Omega_T^{-1} \left[\sum_{t=h+1}^{T} h(z_t;\hat\delta_{gmm})\right] = 0$$

where $\hat\Omega_T^{-1}$ is a first-step consistent estimate of the inverse of the variance-covariance matrix of the moment conditions; it accounts for the dependence of the moment conditions.

Remark: In a first step, one solves

$$\left[\sum_{t=h+1}^{T}\frac{\partial h(z_t;\hat\delta_{1,gmm})}{\partial\delta}\right]' I_T \left[\sum_{t=h+1}^{T} h(z_t;\hat\delta_{1,gmm})\right] = 0$$

and then computes $\hat\Omega_T^{-1}(\hat\delta_{1,gmm}) \equiv \hat\Omega_T^{-1}$.
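Because the moment conditions are linear in $\phi$, the sample moment vector is $\bar m(\phi) = \bar a - \phi\,\bar c$, and each GMM step has the closed form $\hat\phi = (\bar c'W\bar a)/(\bar c'W\bar c)$. A two-step sketch (i.i.d.-style weighting with no HAC correction; all names are mine):

```python
import numpy as np

def ar1_gmm(x, h=2):
    """Two-step GMM for an AR(1) with instruments z_t = (x_{t-1}, ..., x_{t-h})'."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    Z = np.column_stack([x[h - k:T - k] for k in range(1, h + 1)])  # rows z_t
    y = x[h:]              # x_t
    xlag = x[h - 1:T - 1]  # x_{t-1}
    n = len(y)
    a = Z.T @ y / n        # (1/T) sum z_t x_t
    c = Z.T @ xlag / n     # (1/T) sum z_t x_{t-1}
    # step 1: identity weighting
    phi1 = float(c @ a / (c @ c))
    # step 2: weighting from first-step residuals
    u = y - phi1 * xlag
    m = Z * u[:, None]                       # h(z_t; phi1) for each t
    W = np.linalg.inv(m.T @ m / n)           # inverse moment covariance
    phi2 = float((c @ W @ a) / (c @ W @ c))
    return phi2

rng = np.random.default_rng(5)
phi_true, T = 0.6, 5_000
x = np.zeros(T)
eps = rng.standard_normal(T)
for t in range(1, T):
    x[t] = phi_true * x[t - 1] + eps[t]

phi_gmm = ar1_gmm(x)
```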


(Generalized) Method of moments Generalized method of moments

Discussion

Choice of instruments (optimal instruments?) and of the weighting matrix;

Bias-efficiency trade-off;

Numerical procedures can be cumbersome!

Other methods can be written as a GMM estimation.


Maximum likelihood estimation Overview

5. Maximum likelihood estimation
5.1. Overview

Consider the linear regression model ($t = 1,\cdots,T$):

$$y_t = x_t'b_0 + \varepsilon_t$$

where $(y_t, x_t')$ are i.i.d., the regressors are stochastic, $x_t'$ is the $t$-th row of the $X$ matrix, and

$$\varepsilon_t \mid x_t \sim \text{i.i.d. } N(0, \sigma_\varepsilon^2).$$

It follows that

$$y_t \mid x_t \sim \text{i.i.d. } N(x_t'b_0, \sigma_\varepsilon^2).$$


Maximum likelihood estimation Overview

One observes a sample of (conditionally) i.i.d. observations $(y_t, x_t')$. These observations are realizations of random variables with a known distribution but unknown parameters.

The basic idea is to maximize the probability of observing these realizations (given a known parametric distribution with unknown parameters).

The probability of observing these realizations is provided by the joint density (or pdf) of the observations, evaluated at the observations.

This joint density is the likelihood function, with the exception that it is viewed as a function of the unknown parameters (and not of the observations!).


Maximum likelihood estimation Estimation

5.2. Estimation

Definition

The exact likelihood function of the linear regression model is defined to be

$$L(y, x; \delta_0) = \prod_{t=1}^{T} L_t(y_t, x_t; \delta_0)$$

where $L_t$ is the (exact) likelihood function for observation $t$,

$$L_t(y_t, x_t; \delta_0) = f_{Y_t|X_t}(y_t \mid x_t; \delta_0) \times f_{X_t}(x_t),$$

and $f_{Y_t|X_t}$ (respectively, $f_{X_t}$) is the (conditional) probability density function of $Y_t \mid X_t$ (respectively, of $X_t$). Accordingly, the exact log-likelihood function is defined to be

$$\ell(y, x; \delta_0) = \sum_{t=1}^{T}\ell_t(y_t, x_t; \delta_0) \equiv \sum_{t=1}^{T}\log L_t(y_t, x_t; \delta_0).$$


Maximum likelihood estimation Estimation

The exact likelihood function is given by

$$L(y, x; \delta_0) = \prod_{t=1}^{T} f_{Y_t|X_t}(y_t \mid x_t; \delta_0) \prod_{t=1}^{T} f_{X_t}(x_t).$$

The second right-hand-side term is often neglected (especially if $f_X$ does not depend on $\delta_0$) and one considers the conditional likelihood function

$$L(y \mid x; \delta_0) = \prod_{t=1}^{T} f_{Y_t|X_t}(y_t \mid x_t; \delta_0) = \prod_{t=1}^{T}\left(2\pi\sigma_\varepsilon^2\right)^{-1/2}\exp\left(-\frac{1}{2\sigma_\varepsilon^2}\left(y_t - x_t'b_0\right)^2\right) = \left(2\pi\sigma_\varepsilon^2\right)^{-T/2}\exp\left(-\frac{1}{2\sigma_\varepsilon^2}\sum_{t=1}^{T}\left(y_t - x_t'b_0\right)^2\right),$$

and the conditional log-likelihood function is given by

$$\ell(y \mid x; \delta_0) = -\frac{T}{2}\log(\sigma_\varepsilon^2) - \frac{T}{2}\log(2\pi) - \frac{1}{2\sigma_\varepsilon^2}\sum_{t=1}^{T}\left(y_t - x_t'b_0\right)^2.$$
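As a check, this conditional log-likelihood is maximized exactly at the OLS coefficients together with $\hat\sigma^2 = \frac{1}{T}\sum_t(y_t - x_t'\hat b)^2$; a small numpy sketch on simulated data (names are mine):

```python
import numpy as np

def cond_loglik(b, sigma2, y, X):
    """Conditional Gaussian log-likelihood of the linear regression model."""
    resid = y - X @ b
    T = len(y)
    return float(-T / 2 * np.log(sigma2) - T / 2 * np.log(2 * np.pi)
                 - resid @ resid / (2 * sigma2))

rng = np.random.default_rng(8)
T = 500
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
b_true = np.array([1.0, 2.0])
y = X @ b_true + rng.standard_normal(T)

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # argmax of the log-likelihood in b
s2_ml = float(np.mean((y - X @ b_ols) ** 2))   # ML variance estimate (divisor T)
```

By construction, `cond_loglik(b_ols, s2_ml, y, X)` is at least as large as the log-likelihood evaluated at any other $(b, \sigma^2)$, including the true parameters.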


Maximum likelihood estimation Estimation

Definition

The maximum likelihood estimator of $\delta_0$ in the linear regression model, which is the solution of the problem

$$\max_{\delta} L(y \mid x; \delta) \;\Leftrightarrow\; \max_{\delta}\ell(y \mid x; \delta),$$

is asymptotically unbiased, efficient, and normally distributed:

$$\sqrt{T}\left(\hat\delta_{ml} - \delta_0\right) \xrightarrow{d} N\left(0_{k\times 1},\ I_F^{-1}\right)$$

where $I_F$ is the Fisher information matrix.


Maximum likelihood estimation Estimation

Example: AR(1) estimation

Consider the following AR(1) representation ($|\phi| < 1$):

$$X_t = \phi X_{t-1} + \varepsilon_t$$

where $\varepsilon_t \mid X_{t-1} \sim \text{i.i.d. } N(0, \sigma_\varepsilon^2)$. Therefore

$$X_t \mid X_{t-1} = x_{t-1} \sim N\left(\phi x_{t-1},\ \sigma_\varepsilon^2\right), \qquad X_1 \sim N\left(0,\ \frac{\sigma_\varepsilon^2}{1-\phi^2}\right).$$

Accordingly,

$$f_{X_t|X_{t-1}}(x_t \mid x_{t-1}; \delta_0) = \left(2\pi\sigma_\varepsilon^2\right)^{-1/2}\exp\left(-\frac{1}{2\sigma_\varepsilon^2}(x_t - \phi x_{t-1})^2\right)$$

$$f_{X_1}(x_1; \delta_0) = \left(2\pi\frac{\sigma_\varepsilon^2}{1-\phi^2}\right)^{-1/2}\exp\left(-\frac{1-\phi^2}{2\sigma_\varepsilon^2}x_1^2\right).$$


Maximum likelihood estimation Estimation

Exact (log-)likelihood function

The exact likelihood function is given by

$$L(x; \delta_0) = L_1(x_1; \delta_0) \times \prod_{\tau=2}^{T} L_\tau(x_\tau \mid x_{\tau-1}; \delta_0) = f_{X_1}(x_1; \delta_0) \times \prod_{\tau=2}^{T} f_{X_\tau|X_{\tau-1}}(x_\tau \mid x_{\tau-1}; \delta_0)$$

$$= \left(2\pi\frac{\sigma_\varepsilon^2}{1-\phi^2}\right)^{-1/2}\exp\left(-\frac{1-\phi^2}{2\sigma_\varepsilon^2}x_1^2\right) \times \left(2\pi\sigma_\varepsilon^2\right)^{-\frac{T-1}{2}}\exp\left(-\frac{1}{2\sigma_\varepsilon^2}\sum_{t=2}^{T}(x_t - \phi x_{t-1})^2\right).$$

The exact log-likelihood is

$$\ell(x; \delta_0) = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\log\left(\frac{\sigma_\varepsilon^2}{1-\phi^2}\right) - \frac{1-\phi^2}{2\sigma_\varepsilon^2}x_1^2 - \frac{T-1}{2}\log(\sigma_\varepsilon^2) - \frac{1}{2\sigma_\varepsilon^2}\sum_{t=2}^{T}(x_t - \phi x_{t-1})^2.$$


Maximum likelihood estimation Estimation

Conditional (log-)likelihood function

The conditional likelihood function is given by

$$L(x \mid x_{-1}; \delta_0) = \prod_{\tau=2}^{T} L_\tau(x_\tau \mid x_{\tau-1}; \delta_0) = \prod_{\tau=2}^{T} f_{X_\tau|X_{\tau-1}}(x_\tau \mid x_{\tau-1}; \delta_0) = \left(2\pi\sigma_\varepsilon^2\right)^{-\frac{T-1}{2}}\exp\left(-\frac{1}{2\sigma_\varepsilon^2}\sum_{t=2}^{T}(x_t - \phi x_{t-1})^2\right).$$

The conditional log-likelihood is

$$\ell(x \mid x_{-1}; \delta_0) = -\frac{T-1}{2}\log(2\pi) - \frac{T-1}{2}\log(\sigma_\varepsilon^2) - \frac{1}{2\sigma_\varepsilon^2}\sum_{t=2}^{T}(x_t - \phi x_{t-1})^2.$$
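Both likelihoods can be maximized directly. In the exact case $\sigma^2$ can be concentrated out, $\hat\sigma^2(\phi) = \left[(1-\phi^2)x_1^2 + \sum_{t=2}^{T}(x_t - \phi x_{t-1})^2\right]/T$, leaving a one-dimensional search over $\phi$. A sketch (grid search for simplicity; helper names are mine):

```python
import numpy as np

def ar1_exact_loglik(phi, x):
    """Exact Gaussian log-likelihood of an AR(1), with sigma^2 concentrated out."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    ssr = (1 - phi ** 2) * x[0] ** 2 + np.sum((x[1:] - phi * x[:-1]) ** 2)
    sigma2 = ssr / T                     # first-order condition in sigma^2
    return (-T / 2 * np.log(2 * np.pi) + 0.5 * np.log(1 - phi ** 2)
            - T / 2 * np.log(sigma2) - T / 2)

def ar1_eml(x, grid=None):
    """Exact MLE of phi by grid search over (-1, 1)."""
    if grid is None:
        grid = np.linspace(-0.995, 0.995, 399)
    ll = [ar1_exact_loglik(p, x) for p in grid]
    return float(grid[int(np.argmax(ll))])

def ar1_cml(x):
    """Conditional MLE of phi, which coincides with OLS."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x[:-1] * x[1:]) / np.sum(x[:-1] ** 2))

rng = np.random.default_rng(6)
phi_true, T = 0.6, 3_000
x = np.zeros(T)
eps = rng.standard_normal(T)
x[0] = eps[0] / np.sqrt(1 - phi_true ** 2)   # draw x_1 from the stationary law
for t in range(1, T):
    x[t] = phi_true * x[t - 1] + eps[t]

phi_eml, phi_cml = ar1_eml(x), ar1_cml(x)
```

The two estimates differ only through the treatment of $x_1$, so their gap shrinks quickly as $T$ grows.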


Maximum likelihood estimation Estimation

The two methods lead to different maximum likelihood estimates:

$$\hat\delta_{cml} \neq \hat\delta_{eml}.$$

The conditional maximum likelihood estimator of $\phi$ is also the ordinary least squares estimator of $\phi$:

$$\hat\phi_{cml} = \hat\phi_{ols}.$$

This also holds for any AR(p).

The Yule-Walker, conditional maximum likelihood, and (ordinary) least squares estimators are asymptotically equivalent.


Maximum likelihood estimation Estimation

Discussion

Generally the maximum likelihood estimates are built as if the observations were Gaussian, although it is not necessary that $(X_t)$ be Gaussian when doing the estimation.

If the model is misspecified, the only difference (to some extent...) is that the estimates may be less efficient. In this case, one can use pseudo- or quasi-maximum likelihood estimation.

In the case of linear models, there may not be much gain in using exact maximum likelihood over the conditional likelihood method (or the least squares method).

Maximum likelihood estimation often requires nonlinear numerical optimization procedures.

Estimation of the Fisher information matrix can be cumbersome (especially in finite samples).


Appendix

Appendix: ML estimation of an MA(q)

Step 1: Writing $\varepsilon = (\varepsilon_1,\cdots,\varepsilon_T)'$

One has

$$\varepsilon_{1-q} = \varepsilon_{1-q}, \quad\ldots,\quad \varepsilon_0 = \varepsilon_0,$$
$$\varepsilon_1 = x_1 - \theta_1\varepsilon_0 - \cdots - \theta_q\varepsilon_{1-q},$$
$$\varepsilon_2 = x_2 - \theta_1\varepsilon_1 - \cdots - \theta_q\varepsilon_{2-q},$$
$$\vdots$$
$$\varepsilon_T = x_T - \theta_1\varepsilon_{T-1} - \cdots - \theta_q\varepsilon_{T-q}.$$


Appendix

Therefore

$$\varepsilon = NX + Z\varepsilon_* = \begin{pmatrix} Z & N \end{pmatrix}\begin{pmatrix} \varepsilon_* \\ X \end{pmatrix}$$

with $X = (x_1,\cdots,x_T)'$, $\varepsilon_* = (\varepsilon_{1-q}, \varepsilon_{2-q},\cdots,\varepsilon_{-1},\varepsilon_0)'$, and

$$N = \begin{pmatrix} 0_{q\times T} \\ A_1(\theta) \end{pmatrix}, \qquad Z = \begin{pmatrix} I_q \\ A_2(\theta) \end{pmatrix}$$

where $A_1(\theta)$ is a lower triangular $(T\times T)$ matrix, $A_2(\theta)$ is a $T\times q$ matrix, and $I_q$ is the identity matrix of order $q$.


Appendix

Step 2: Determination of the joint density of $\varepsilon$

The joint probability density function of $\varepsilon$ is defined to be

$$\left(2\pi\sigma_\varepsilon^2\right)^{-\frac{T+q}{2}}\exp\left[-\frac{1}{2\sigma_\varepsilon^2}\left(NX + Z\varepsilon_*\right)'\left(NX + Z\varepsilon_*\right)\right].$$

Step 3: Determination of the marginal density of $X$

The marginal density of $X$ is defined to be

$$\left(2\pi\sigma_\varepsilon^2\right)^{-\frac{T}{2}}\left(\det X'X\right)^{-\frac{1}{2}}\exp\left[-\frac{1}{2\sigma_\varepsilon^2}\left(NX + Z\hat\varepsilon_*\right)'\left(NX + Z\hat\varepsilon_*\right)\right]$$

with $\hat\varepsilon_* = -(Z'Z)^{-1}Z'NX$.


Appendix

Step 4: Determination of the log-likelihood function

The log-likelihood function is defined to be

$$\ell(x;\delta) = -\frac{T}{2}\log(2\pi) - \frac{T}{2}\log(\sigma_\varepsilon^2) - \frac{1}{2}\log\left(\det X'X\right) - \frac{S(\theta)}{2\sigma_\varepsilon^2}$$

where $\delta = (\theta', \sigma_\varepsilon^2)'$ and

$$S(\theta) = \left(NX + Z\hat\varepsilon_*\right)'\left(NX + Z\hat\varepsilon_*\right).$$

Step 5: Determination of $\hat\sigma_{\varepsilon,ml}^2$

Using the first-order condition w.r.t. $\sigma_\varepsilon^2$, one has

$$\hat\sigma_{\varepsilon,ml}^2 = \frac{S(\hat\theta_{ml})}{T}.$$


Appendix

Step 6: Determination of the concentrated log-likelihood function

The concentrated log-likelihood function is defined to be

$$\ell^*(x;\theta) = -T\log\left[S(\theta)\right] - \log\left[\det X'X\right].$$

Step 7: Determination of $\hat\theta_{ml}$

$\hat\theta_{ml}$ solves

$$\min_{\theta}\; T\log\left[S(\theta)\right] + \log\left[\det X'X\right].$$

Remarks:

1 Least squares methods only focus on the first right-hand-side term of the log-likelihood function; exact methods account for both right-hand-side terms.

2 This method generalizes to ARMA(p,q) models.
