ARTICLE IN PRESS
doi:10.1016/j.spl.2007.05.031
Statistics & Probability Letters 78 (2008) 257–264
www.elsevier.com/locate/stapro
Missing data in time series: A note on the equivalence of the dummy variable and the skipping approaches
Tommaso Proietti
Dipartimento S.E.F. e ME.Q, Via Columbia 2, 00133 Rome, Italy
Received 13 March 2006; received in revised form 22 February 2007; accepted 23 May 2007
Available online 16 June 2007
Abstract
This note shows the equivalence of the dummy variable approach and the skipping approach for the treatment of
missing observations in state space models. The equivalence holds when the coefficient of the dummy variable is considered
as a diffuse rather than a fixed effect. The equivalence concerns both likelihood inference and smoothed inferences.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Kalman filter; Smoothing; Influence; Cross-validation
1. Introduction
A well-known result is that estimating a missing observation by skipping the Kalman filter (KF) updating step is equivalent to introducing a dummy variable (additive outlier) in the measurement equation and filling the missing value arbitrarily. This result appears, in different frameworks, in a number of papers: Sargan and Drettakis (1974), Bruce and Martin (1989), Ljung (1993). A detailed discussion can be found in Fuller (1996, Section 8.7). However, if the additive outlier is treated as a fixed effect, with zero covariance matrix, the likelihood is defined differently and a correction has to be computed in the second case; see Gomez et al. (1999). The correction factor is related to the determinantal term of the likelihood and depends in a simple fashion on quantities computed under the model for the complete observations, requiring a single run of the KF and smoothing filter.
To our knowledge, a proof of the equivalence of the skipping approach and the dummy approach for the definition of the likelihood and for smoothing is not available. This note aims at bridging the gap, providing a simple proof that when the additive outlier is treated as diffuse, with arbitrarily large covariance matrix, the correction to the likelihood takes place automatically. This is convenient, as no extra programming effort is necessary once a programme handling diffuse initial conditions and regression effects has been implemented.
The equivalence is also carried forward to smoothed inferences, concerning the estimation of the states and the disturbances. The derivation of analytical expressions for the influence of an observation on these quantities, made in De Jong (1996), is greatly simplified in the dummy variable setup, as they depend in a simple fashion on the output of the KF and smoother run on intervention variables.
The plan of the paper is the following: Section 2 introduces the dummy variable approach for stationary state space models with no regression effects, under fixed and diffuse conditions, and derives the prediction error decomposition form of the likelihood under the latter. In Section 3 we present the alternative strategy for handling missing observations, known as the skipping approach, and prove that the likelihood for this model is equivalent to the dummy variable one. In Section 4 the equivalence is extended to smoothed estimates of the states and the disturbances, and measures of influence of an observation are given, which depend in a simple way on the output of the KF and smoothing filter run on the intervention variable.
2. The Dummy variable approach
Let $y_t$ denote a vector stationary time series with $N$ elements; the state space model is

$$y_t = Z_t \alpha_t + G_t \varepsilon_t, \quad t = 1, 2, \ldots, T, \tag{1}$$

$$\alpha_{t+1} = T_t \alpha_t + H_t \varepsilon_t, \quad t = 1, 2, \ldots, T, \tag{2}$$

with $\alpha_1 \sim N(a_1, \sigma^2 P_1)$, where $a_1$ and $\sigma^2 P_1$ denote the unconditional mean and covariance matrix of $\alpha_t$, and $\varepsilon_t \sim NID(0, \sigma^2 I)$. The system matrices, $Z_t, G_t, T_t, H_t$, are functionally related to a vector of hyperparameters, $\theta$.
The Kalman filter (KF) is a well-known recursive algorithm for computing the minimum mean square estimator of $\alpha_t$ and its mean square error (MSE) matrix conditional on $Y_{t-1} = \{y_1, y_2, \ldots, y_{t-1}\}$. Defining

$$a_t = E(\alpha_t \mid Y_{t-1}), \quad MSE(a_t) = \sigma^2 P_t = E[(\alpha_t - a_t)(\alpha_t - a_t)' \mid Y_{t-1}],$$

the filter consists of the following recursions:

$$\begin{aligned}
\nu_t &= y_t - Z_t a_t, & F_t &= Z_t P_t Z_t' + G_t G_t', \\
q_t &= q_{t-1} + \nu_t' F_t^{-1} \nu_t, & K_t &= (T_t P_t Z_t' + H_t G_t') F_t^{-1}, \\
a_{t+1} &= T_t a_t + K_t \nu_t, & P_{t+1} &= T_t P_t L_t' + H_t J_t'
\end{aligned} \tag{3}$$

with $L_t = T_t - K_t Z_t$ and $J_t = H_t - K_t G_t$; $\nu_t = y_t - E(y_t \mid Y_{t-1})$ are the filter innovations, with MSE matrix $\sigma^2 F_t$. The filter is started off with $a_1 = 0$, $P_1 = H_0 H_0'$ and $q_0 = 0$. The log-likelihood for the model is, apart from a constant term,

$$\mathcal{L}(y_1, \ldots, y_T; \theta) = -\frac{1}{2} \left[ NT \ln \sigma^2 + \sum_{t=1}^{T} \ln |F_t| + \sigma^{-2} q_T \right], \tag{4}$$

where $q_T = \sum_{t=1}^{T} \nu_t' F_t^{-1} \nu_t$.
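The recursions (3) and the likelihood (4) can be sketched in a few lines of Python. This is only an illustrative sketch for a time-invariant model: the function names `kalman_filter` and `loglik`, the AR(1)-plus-noise example and its parameter values are assumptions made here, not part of the paper. The last lines compare the result with the Gaussian log-density computed directly from the covariance matrix of the observations.

```python
import numpy as np

def kalman_filter(y, Z, G, T, H, P1):
    """Recursions (3) with a_1 = 0; returns per-period output and q_T."""
    a, P, q = np.zeros(T.shape[0]), P1.copy(), 0.0
    out = {"nu": [], "F": [], "K": [], "a": [], "P": []}
    for y_t in y:
        nu = y_t - Z @ a                       # innovation
        F = Z @ P @ Z.T + G @ G.T              # scaled innovation MSE
        F_inv = np.linalg.inv(F)
        K = (T @ P @ Z.T + H @ G.T) @ F_inv    # gain
        q += float(nu @ F_inv @ nu)
        for key, val in zip(out, (nu, F, K, a, P)):
            out[key].append(val)
        L, J = T - K @ Z, H - K @ G
        a, P = T @ a + K @ nu, T @ P @ L.T + H @ J.T
    return out, q

def loglik(y, Z, G, T, H, P1, sigma2=1.0):
    """Log-likelihood (4), constant term omitted."""
    out, q = kalman_filter(y, Z, G, T, H, P1)
    n_obs, N = y.shape
    logdet = sum(np.linalg.slogdet(F)[1] for F in out["F"])
    return -0.5 * (N * n_obs * np.log(sigma2) + logdet + q / sigma2)

# Hypothetical example: AR(1) signal plus noise, eps_t = (noise, shock)'.
phi, g, h = 0.8, 0.5, 1.0
Z, G = np.array([[1.0]]), np.array([[g, 0.0]])
T, H = np.array([[phi]]), np.array([[0.0, h]])
P1 = np.array([[h**2 / (1 - phi**2)]])          # stationary state variance
rng = np.random.default_rng(0)
y = rng.standard_normal((30, 1))                # any data will do for the check
ll_kf = loglik(y, Z, G, T, H, P1)

# Brute-force check: Gaussian log-density from the explicit covariance matrix.
idx = np.arange(len(y))
Omega = P1[0, 0] * phi ** np.abs(idx[:, None] - idx[None, :]) + g**2 * np.eye(len(y))
ll_direct = -0.5 * (np.linalg.slogdet(Omega)[1] + y[:, 0] @ np.linalg.solve(Omega, y[:, 0]))
```

The two values agree because the innovations are a triangular transformation of the observations, which is the prediction error decomposition underlying (4).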
Suppose that an intervention is included at $t = i$ so that the measurement equation becomes

$$y_t = Z_t \alpha_t + I_t(i) \delta + G_t \varepsilon_t, \tag{5}$$

where $I_t(i)$ is an indicator variable taking value 1 for $t = i$ and 0 elsewhere. For its statistical treatment, the KF (3) at $t = i$ is augmented by the following recursions:

$$\begin{aligned}
V_t^+ &= I_t(i) I - Z_t A_t^+, \\
A_{t+1}^+ &= T_t A_t^+ + K_t V_t^+ = K_i I_t(i) + L_t A_t^+, \\
S_t^+ &= S_{t-1}^+ + V_t^{+\prime} F_t^{-1} V_t^+, \\
s_t^+ &= s_{t-1}^+ + V_t^{+\prime} F_t^{-1} \nu_t,
\end{aligned} \tag{6}$$

for $t = i, \ldots, T$ with starting conditions $A_i^+ = 0$, $S_{i-1}^+ = 0$ and $s_{i-1}^+ = 0$. This amounts to applying the KF to the intervention signature $I_t(i) I$.
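As an illustration of the augmented recursions (6), the following Python sketch runs them alongside the KF for a scalar model ($N = 1$) and forms $s_T^+ / S_T^+$, which is the generalised least squares (GLS) estimate of $\delta$. The AR(1)-plus-noise model, the parameter values and all variable names are assumptions chosen for the example; the final lines recompute the estimate by brute force from the covariance matrix of the observations.

```python
import numpy as np

# Scalar illustration (N = 1): AR(1) signal plus noise, sigma^2 = 1.
phi, g, h = 0.8, 0.5, 1.0          # hypothetical parameter values
n_obs, i = 30, 12                  # sample size and intervention time (1-based)
rng = np.random.default_rng(1)
y = rng.standard_normal(n_obs)

a, P = 0.0, h**2 / (1 - phi**2)    # a_1 = 0, P_1 = stationary variance
A, S, s = 0.0, 0.0, 0.0            # A_i^+ = 0, S_{i-1}^+ = 0, s_{i-1}^+ = 0
for t in range(1, n_obs + 1):
    nu = y[t - 1] - a
    F = P + g**2
    K = phi * P / F                # H G' = 0 in this model
    if t >= i:                     # augmented recursions (6)
        V = (1.0 if t == i else 0.0) - A
        S += V * V / F
        s += V * nu / F
        A = phi * A + K * V
    a, P = phi * a + K * nu, phi * P * (phi - K) + h**2
delta_hat = s / S

# Brute-force GLS estimate of delta with regressor x = indicator of time i.
idx = np.arange(n_obs)
Omega = h**2 / (1 - phi**2) * phi ** np.abs(idx[:, None] - idx[None, :]) + g**2 * np.eye(n_obs)
x = (idx == i - 1).astype(float)
delta_gls = (x @ np.linalg.solve(Omega, y)) / (x @ np.linalg.solve(Omega, x))
```

The agreement reflects the fact that the same linear filter is applied to the observations and to the indicator variable.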
When $\delta$ is treated as a fixed effect, the log-likelihood can be written as (Rosenberg, 1973)

$$-\frac{1}{2} \left[ NT \ln \sigma^2 + \sum_{t=1}^{T} \ln |F_t| + \sigma^{-2} \left( q_T - 2 s_T^{+\prime} \delta + \delta' S_T^+ \delta \right) \right].$$

The MLE of $\delta$ is thus $\hat{\delta} = (S_T^+)^{-1} s_T^+$ and the concentrated likelihood is

$$-\frac{1}{2} \left[ NT \ln \sigma^2 + \sum_{t=1}^{T} \ln |F_t| + \sigma^{-2} \left( q_T - s_T^{+\prime} (S_T^+)^{-1} s_T^+ \right) \right].$$
There is, however, a conceptual difficulty with the fixed effects model, as was clearly pointed out by Bell (1989, p. 408), in that "the use of an indicator variable lets the mean at a given point be anything while still assuming that the observation is normal with the same variance and covariances as other observations, whereas omitting observations makes no assumption at all about it".
In the sequel, $\delta$ is treated as a diffuse effect, that is, $[\mathrm{Cov}(\delta)]^{-1}$ converges to zero in the Euclidean norm (see De Jong, 1991), e.g. $\delta \sim N(0, \kappa I)$, $\kappa \to \infty$; this is equivalent to making no assumption on the covariance of the $i$th observation.
De Jong (1991) has shown that $\delta$ can be concentrated out of the likelihood function, so that $\hat{\delta} = (S_T^+)^{-1} s_T^+$ and $MSE(\hat{\delta}) = \sigma^2 (S_T^+)^{-1}$. The diffuse log-likelihood function is

$$\mathcal{L}_{DV}(y_1, \ldots, y_T; \theta) = -\frac{1}{2} \left[ N(T-1) \ln \sigma^2 + \sum_{t=1}^{T} \ln |F_t| + \ln |S_T^+| + \sigma^{-2} \left( q_T - s_T^{+\prime} (S_T^+)^{-1} s_T^+ \right) \right]. \tag{7}$$

This function is the likelihood for a rank $T - 1$ transformation of the observations, which makes the data invariant to $\delta$.
The following theorem is a restatement of Theorem 2 in De Jong and Penzer (1998).
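Theorem 1. The estimate of $\delta$ and the diffuse LF can be written as

$$\hat{\delta} = M_i^{-1} u_i, \quad MSE(\hat{\delta}) = \sigma^2 M_i^{-1}, \tag{8}$$

$$\mathcal{L}_{DV} = -\frac{1}{2} \left[ N(T-1) \ln \sigma^2 + \sum_{t=1}^{T} \ln |F_t| + \ln |M_i| + \sigma^{-2} (q_T - u_i' M_i^{-1} u_i) \right], \tag{9}$$

where $u_i$ and $M_i$ are the output at $t = i$ of the smoothing filter:

$$\begin{aligned}
u_t &= F_t^{-1} \nu_t - K_t' r_t, & M_t &= F_t^{-1} + K_t' N_t K_t, \\
r_{t-1} &= Z_t' F_t^{-1} \nu_t + L_t' r_t, & N_{t-1} &= Z_t' F_t^{-1} Z_t + L_t' N_t L_t
\end{aligned} \tag{10}$$

started with $r_T = 0$ and $N_T = 0$.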
Proof. We begin by noting that $V_i^+ = I$ and $V_t^+ = -Z_t L_{t,i+1} K_i$ for $t = i+1, \ldots, T$, with $L_{t,i+1} = L_{t-1} \cdots L_{i+1}$ and $L_{i+1,i+1} = I$. Hence

$$\begin{aligned}
s_T^+ &= \sum_{t=i}^{T} V_t^{+\prime} F_t^{-1} \nu_t \\
&= F_i^{-1} \nu_i - K_i' \sum_{t=i+1}^{T} L_{t,i+1}' Z_t' F_t^{-1} \nu_t \\
&= F_i^{-1} \nu_i - K_i' r_i \\
&= u_i,
\end{aligned}$$

$$\begin{aligned}
S_T^+ &= \sum_{t=i}^{T} V_t^{+\prime} F_t^{-1} V_t^+ \\
&= F_i^{-1} + K_i' \left( \sum_{t=i+1}^{T} L_{t,i+1}' Z_t' F_t^{-1} Z_t L_{t,i+1} \right) K_i \\
&= F_i^{-1} + K_i' N_i K_i \\
&= M_i.
\end{aligned}$$

Replacing into the expressions for $\hat{\delta}$ and (7) yields the result. □
Using a different argument, De Jong and Penzer (1998) show that

$$y_t - E(y_t \mid y_1, \ldots, y_{t-1}, y_{t+1}, \ldots, y_T) = M_t^{-1} u_t.$$
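The smoothing filter (10) and the leave-one-out interpretation of $M_t^{-1} u_t$ can be checked numerically. The sketch below is an illustration under assumed settings (scalar AR(1)-plus-noise model, $\sigma^2 = 1$, hypothetical parameter values and variable names): it runs the KF forward, the recursions (10) backward, and compares $M_t^{-1} u_t$ with the residual of $y_t$ from its conditional mean given all the other observations, computed from the inverse covariance matrix.

```python
import numpy as np

phi, g, h = 0.8, 0.5, 1.0            # hypothetical parameter values
n_obs = 30
rng = np.random.default_rng(2)
y = rng.standard_normal(n_obs)

# Forward pass: store nu_t, F_t, K_t from the KF (3), scalar case.
a, P = 0.0, h**2 / (1 - phi**2)
nu, F, K = np.empty(n_obs), np.empty(n_obs), np.empty(n_obs)
for t in range(n_obs):
    nu[t], F[t] = y[t] - a, P + g**2
    K[t] = phi * P / F[t]
    a, P = phi * a + K[t] * nu[t], phi * P * (phi - K[t]) + h**2

# Backward pass: smoothing filter (10), started with r_T = 0, N_T = 0.
r, N = 0.0, 0.0
u, M = np.empty(n_obs), np.empty(n_obs)
for t in reversed(range(n_obs)):
    L = phi - K[t]
    u[t] = nu[t] / F[t] - K[t] * r
    M[t] = 1.0 / F[t] + K[t] * N * K[t]
    r = nu[t] / F[t] + L * r          # r_{t-1}, since Z = 1
    N = 1.0 / F[t] + L * N * L        # N_{t-1}
loo_resid = u / M                     # M_t^{-1} u_t for every t

# Brute force: y_t - E(y_t | all other observations) from the covariance matrix.
idx = np.arange(n_obs)
Omega = h**2 / (1 - phi**2) * phi ** np.abs(idx[:, None] - idx[None, :]) + g**2 * np.eye(n_obs)
Omega_inv = np.linalg.inv(Omega)
loo_direct = (Omega_inv @ y) / np.diag(Omega_inv)
```

In this example $u_t$ coincides with the $t$th element of $\Omega^{-1} y$ and $M_t$ with the $t$th diagonal element of $\Omega^{-1}$, where $\Omega$ is the covariance matrix of the stacked observations.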
The next theorem provides an alternative expression for the likelihood function, based on the one-step-ahead prediction error decomposition. This will prove useful in the comparison with that arising from the skipping approach.
Theorem 2. For model (5), let $\bar{F}_t = MSE(\bar{\nu}_t \mid Y_{t-1})$ and $\bar{\nu}_t = E(\nu_t \mid Y_{t-1})$, where $\bar{F}_t = F_t + V_t^+ (S_{t-1}^+)^{-1} V_t^{+\prime}$ and $\bar{\nu}_t = \nu_t - V_t^+ (S_{t-1}^+)^{-1} s_{t-1}^+$. Then,

$$\mathcal{L}_{DV} = -\frac{1}{2} \left[ N(T-1) \ln \sigma^2 + \sum_{t=1}^{i-1} \ln |F_t| + \sum_{t=i+1}^{T} \ln |\bar{F}_t| + \sigma^{-2} \left( q_{i-1} + \sum_{t=i+1}^{T} \bar{\nu}_t' \bar{F}_t^{-1} \bar{\nu}_t \right) \right]. \tag{11}$$
Proof. To show that the determinantal part of the LF is as stated, we provide the following recursion for $|S_T^+|$:

$$\begin{aligned}
|S_T^+| &= |S_{T-1}^+ + V_T^{+\prime} F_T^{-1} V_T^+| \\
&= |S_{T-1}^+| \, |I + (S_{T-1}^+)^{-1} V_T^{+\prime} F_T^{-1} V_T^+| \\
&= |S_{T-1}^+| \, |I + F_T^{-1} V_T^+ (S_{T-1}^+)^{-1} V_T^{+\prime}| \\
&= |S_{T-1}^+| \, |F_T|^{-1} |\bar{F}_T|.
\end{aligned}$$

Iterating this result for $t = T-1, T-2, \ldots, i+1$ and recalling that $S_i^+ = F_i^{-1}$ produces

$$\ln |S_T^+| = \sum_{t=i+1}^{T} \ln |\bar{F}_t| - \sum_{t=i}^{T} \ln |F_t|.$$
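The one-step determinant recursion used above can be verified numerically on arbitrary matrices. The following Python sketch is only a check of the matrix identity; the dimension and the randomly generated positive definite matrices are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 3                                        # hypothetical dimension of y_t and delta
B = rng.standard_normal((N, N))
F = B @ B.T + np.eye(N)                      # plays the role of F_T (positive definite)
C = rng.standard_normal((N, N))
S_prev = C @ C.T + np.eye(N)                 # plays the role of S_{T-1}^+
V = rng.standard_normal((N, N))              # plays the role of V_T^+

S_new = S_prev + V.T @ np.linalg.solve(F, V)     # S_T^+
F_bar = F + V @ np.linalg.solve(S_prev, V.T)     # F_T + V_T^+ (S_{T-1}^+)^{-1} V_T^+'
lhs = np.linalg.det(S_new)
rhs = np.linalg.det(S_prev) * np.linalg.det(F_bar) / np.linalg.det(F)
```

Here `lhs` and `rhs` agree up to rounding error, as implied by the identity $|I + AB| = |I + BA|$.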
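Moreover,

$$q_T - s_T^{+\prime} (S_T^+)^{-1} s_T^+ = q_{T-1} - s_{T-1}^{+\prime} (S_{T-1}^+)^{-1} s_{T-1}^+ + \bar{\nu}_T' \bar{F}_T^{-1} \bar{\nu}_T,$$

which, applied recursively, yields

$$q_T - s_T^{+\prime} (S_T^+)^{-1} s_T^+ = q_i - s_i^{+\prime} (S_i^+)^{-1} s_i^+ + \sum_{t=i+1}^{T} \bar{\nu}_t' \bar{F}_t^{-1} \bar{\nu}_t.$$

Now, as $q_i = q_{i-1} + \nu_i' F_i^{-1} \nu_i$ and $s_i^{+\prime} (S_i^+)^{-1} s_i^+ = \nu_i' F_i^{-1} \nu_i$, result (11) follows directly. □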
3. The Skipping approach
When the $i$th observation is missing, the KF is forced to skip the updating step at time $t = i$, so that

$$a_{i+1}^{(m)} = T_i a_i, \quad P_{i+1}^{(m)} = T_i P_i T_i' + H_i H_i', \tag{12}$$

and $q_i^{(m)} = q_{i-1}$.
From time $i + 1$ on, the KF (3) is run with $\nu_t, a_t, q_t, K_t, P_{t+1}, L_t, J_t$ replaced by $\nu_t^{(m)}, a_t^{(m)}, q_t^{(m)}, K_t^{(m)}, P_{t+1}^{(m)}, L_t^{(m)}, J_t^{(m)}$. See Harvey et al. (1998).
The log-likelihood function $\mathcal{L}^{(m)} = \mathcal{L}(y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_T; \theta)$ is

$$\mathcal{L}^{(m)} = -\frac{1}{2} \left[ N(T-1) \ln \sigma^2 + \sum_{t=1}^{i-1} \ln |F_t| + \sum_{t=i+1}^{T} \ln |F_t^{(m)}| + \sigma^{-2} \left( \sum_{t=1}^{i-1} \nu_t' F_t^{-1} \nu_t + \sum_{t=i+1}^{T} \nu_t^{(m)\prime} (F_t^{(m)})^{-1} \nu_t^{(m)} \right) \right]. \tag{13}$$
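Theorem 3. $\mathcal{L}^{(m)} = \mathcal{L}_{DV}$.
Proof. The KF resulting from the skipping approach is related to the full sample KF (3) by the following equations:

$$\begin{aligned}
\nu_t^{(m)} &= \nu_t - V_t^+ (S_{t-1}^+)^{-1} s_{t-1}^+, & F_t^{(m)} &= F_t + V_t^+ (S_{t-1}^+)^{-1} V_t^{+\prime}, \\
K_t^{(m)} &= K_t - A_{t+1}^+ (S_t^+)^{-1} V_t^{+\prime} F_t^{-1}, \\
a_{t+1}^{(m)} &= a_{t+1} - A_{t+1}^+ (S_t^+)^{-1} s_t^+, & P_{t+1}^{(m)} &= P_{t+1} + A_{t+1}^+ (S_t^+)^{-1} A_{t+1}^{+\prime},
\end{aligned} \tag{14}$$

where $A_t^+ = L_{t,i+1} K_i$. These relations hold for $t = i + 1$: from (12),

$$a_{i+1}^{(m)} = a_{i+1} - K_i \nu_i = a_{i+1} - A_{i+1}^+ (S_i^+)^{-1} s_i^+$$

and

$$P_{i+1}^{(m)} = P_{i+1} + K_i F_i K_i' = P_{i+1} + A_{i+1}^+ (S_i^+)^{-1} A_{i+1}^{+\prime}.$$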
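Hence, since $V_{i+1}^+ = -Z_{i+1} A_{i+1}^+$,

$$\nu_{i+1}^{(m)} = y_{i+1} - Z_{i+1} a_{i+1}^{(m)} = \nu_{i+1} + Z_{i+1} A_{i+1}^+ (S_i^+)^{-1} s_i^+ = \nu_{i+1} - V_{i+1}^+ (S_i^+)^{-1} s_i^+$$

and

$$F_{i+1}^{(m)} = Z_{i+1} P_{i+1}^{(m)} Z_{i+1}' + G_{i+1} G_{i+1}' = F_{i+1} + V_{i+1}^+ (S_i^+)^{-1} V_{i+1}^{+\prime}.$$

The formula for the gain matrix is obtained by noticing that

$$(F_{i+1}^{(m)})^{-1} = F_{i+1}^{-1} - F_{i+1}^{-1} V_{i+1}^+ (S_{i+1}^+)^{-1} V_{i+1}^{+\prime} F_{i+1}^{-1},$$

whence

$$\begin{aligned}
K_{i+1}^{(m)} &= (T_{i+1} P_{i+1}^{(m)} Z_{i+1}' + H_{i+1} G_{i+1}') (F_{i+1}^{(m)})^{-1} \\
&= (K_{i+1} F_{i+1} - T_{i+1} K_i F_i V_{i+1}^{+\prime}) (F_{i+1}^{(m)})^{-1} \\
&= K_{i+1} - A_{i+2}^+ (S_{i+1}^+)^{-1} V_{i+1}^{+\prime} F_{i+1}^{-1}.
\end{aligned}$$

In conclusion, $\nu_t^{(m)} = \bar{\nu}_t$ and $F_t^{(m)} = \bar{F}_t$. □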
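The equivalence stated in Theorem 3 can be illustrated numerically. The Python sketch below assumes a scalar AR(1)-plus-noise model with $\sigma^2 = 1$; the parameter values and the function names `loglik_skip` and `loglik_dummy` are hypothetical. It evaluates the skipping likelihood (13) and the diffuse dummy variable likelihood (7), the latter with two different arbitrary values filled in for the missing observation.

```python
import numpy as np

phi, g, h = 0.8, 0.5, 1.0          # hypothetical AR(1)-plus-noise model, sigma^2 = 1
n_obs, i = 30, 12                  # y_i is missing (1-based index)
rng = np.random.default_rng(4)
y_obs = rng.standard_normal(n_obs)
P1 = h**2 / (1 - phi**2)

def loglik_skip(y):
    """Skipping approach: likelihood (13); no update at t = i, see (12)."""
    a, P, ll = 0.0, P1, 0.0
    for t in range(1, n_obs + 1):
        if t == i:
            a, P = phi * a, phi * P * phi + h**2
            continue
        nu, F = y[t - 1] - a, P + g**2
        K = phi * P / F
        ll += np.log(F) + nu**2 / F
        a, P = phi * a + K * nu, phi * P * (phi - K) + h**2
    return -0.5 * ll

def loglik_dummy(y):
    """Dummy variable approach: diffuse likelihood (7) via the augmented KF (6)."""
    a, P, ll = 0.0, P1, 0.0
    A = S = s = 0.0
    for t in range(1, n_obs + 1):
        nu, F = y[t - 1] - a, P + g**2
        K = phi * P / F
        ll += np.log(F) + nu**2 / F
        if t >= i:
            V = (1.0 if t == i else 0.0) - A
            S, s = S + V * V / F, s + V * nu / F
            A = phi * A + K * V
        a, P = phi * a + K * nu, phi * P * (phi - K) + h**2
    return -0.5 * (ll + np.log(S) - s**2 / S)

y_fill_a, y_fill_b = y_obs.copy(), y_obs.copy()
y_fill_a[i - 1], y_fill_b[i - 1] = 0.0, 1e3    # two arbitrary fill-in values
ll_skip = loglik_skip(y_obs)
ll_dv_a, ll_dv_b = loglik_dummy(y_fill_a), loglik_dummy(y_fill_b)
```

The three values coincide up to rounding error, so the diffuse likelihood does not depend on the value used to fill the gap.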
When $\delta$ is treated as a fixed effect, the correction that has to be applied to the determinantal part of the likelihood is $-0.5 \ln |M_i|$, which is available from a run of the smoothing filter (10).
4. Influence and deletion diagnostics
In this section we use the previous results in a different perspective. Assuming that the full sample is available, we aim at computing measures of influence on smoothed inferences. De Jong (1989) proved that the smoothed estimate of the state at $t$, $\tilde{\alpha}_t = E(\alpha_t \mid Y_T)$, and its MSE matrix, $\sigma^2 \tilde{P}_t = E[(\alpha_t - \tilde{\alpha}_t)(\alpha_t - \tilde{\alpha}_t)' \mid Y_T]$, are

$$\tilde{\alpha}_t = a_t + P_t r_{t-1}, \quad \tilde{P}_t = P_t - P_t N_{t-1} P_t,$$

where $r_{t-1}$ and $N_{t-1}$ are given in the second line of (10).
Now, let $\tilde{\alpha}_t^{(m)} = E(\alpha_t \mid Y_T^{(i)})$, where $Y_T^{(i)} = (y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_T)$ is the information at $T$ excluding $y_i$. Moreover, let $\sigma^2 \tilde{P}_t^{(m)} = E[(\alpha_t - \tilde{\alpha}_t^{(m)})(\alpha_t - \tilde{\alpha}_t^{(m)})' \mid Y_T^{(i)}]$.
Theorem 4.

$$\tilde{\alpha}_t = \tilde{\alpha}_t^{(m)} + (A_t^+ + P_t R_{t-1}^+) M_i^{-1} u_i,$$

$$\tilde{P}_t = \tilde{P}_t^{(m)} - (A_t^+ + P_t R_{t-1}^+) M_i^{-1} (A_t^+ + P_t R_{t-1}^+)',$$

where

$$R_{t-1}^+ = Z_t' F_t^{-1} V_t^+ + L_t' R_t^+, \tag{15}$$

with $A_t^+ = 0$ for $t < i + 1$. Also, for $t < i$, $V_t^+ = 0$ and $R_{t-1}^+ = L_t' R_t^+ = L_{i,t}' R_{i-1}^+$.
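Proof. The orthogonal set $\{\nu_1, \ldots, \nu_T\}$ is a linear transformation of the set

$$\left\{ \nu_1, \ldots, \nu_{i-1}, \nu_{i+1}^{(m)}, \ldots, \nu_T^{(m)}, u_i \right\}.$$

This set is orthogonal too, since $u_i$ and $\nu_t^{(m)}$ depend only on $\{\nu_j, j \geq i\}$, and

$$\mathrm{Cov}(\nu_t^{(m)}, u_i) = \mathrm{Cov}\left( \nu_t - V_t^+ (S_{t-1}^+)^{-1} s_{t-1}^+, \; \sum_{j=i}^{T} V_j^{+\prime} F_j^{-1} \nu_j \right) = 0, \quad \forall t > i.$$

Thus, applying a standard result on linear projection with uncorrelated components:

$$\begin{aligned}
\tilde{\alpha}_t &= \tilde{\alpha}_t^{(m)} + \sigma^{-2} \mathrm{Cov}(\alpha_t - a_t^{(m)}, u_i) M_i^{-1} u_i \\
&= \tilde{\alpha}_t^{(m)} + \sigma^{-2} \mathrm{Cov}(\alpha_t - a_t + A_t^+ (S_{t-1}^+)^{-1} s_{t-1}^+, u_i) M_i^{-1} u_i \\
&= \tilde{\alpha}_t^{(m)} + \left( P_t \sum_{j=i}^{T} L_{t,j}' Z_j' F_j^{-1} V_j^+ + A_t^+ \right) M_i^{-1} u_i.
\end{aligned}$$

The formula for the MSE matrix is derived similarly from

$$MSE(\alpha_t \mid Y_T) = MSE(\alpha_t \mid Y_T^{(i)}) - \mathrm{Cov}(\alpha_t, u_i) M_i^{-1} \mathrm{Cov}(\alpha_t, u_i)'. \quad □$$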
Hence,

$$\tilde{\alpha}_t - \tilde{\alpha}_t^{(m)} = (A_t^+ + P_t R_{t-1}^+) M_i^{-1} u_i \tag{16}$$

provides the measure of influence of the $i$th observation on the state estimate at time $t$. The quantities on the right-hand side are readily available from the augmented KF for model (5); $\sigma^{-2} \mathrm{Cov}(\alpha_t, u_i) = A_t^+ + P_t R_{t-1}^+$ is the leverage of $y_i$ on $\tilde{\alpha}_t$ (De Jong, 1996).
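The influence measure (16) can be checked against a brute-force computation. The Python sketch below again assumes a scalar AR(1)-plus-noise model with $\sigma^2 = 1$ and hypothetical parameter values and variable names: it computes the right-hand side of (16) from the augmented KF and the backward recursion (15), and compares it with the difference between the smoothed states obtained with and without $y_i$ from the joint covariance matrix.

```python
import numpy as np

phi, g, h = 0.8, 0.5, 1.0          # hypothetical parameter values
n_obs, i = 30, 12                  # deleted observation (1-based index)
rng = np.random.default_rng(5)
y = rng.standard_normal(n_obs)
P1 = h**2 / (1 - phi**2)

# Forward pass: KF (3) plus the augmented recursions (6) for the dummy at t = i.
a, P, A = 0.0, P1, 0.0
nu, F, K, a_pred, P_pred, V, A_pred = (np.zeros(n_obs) for _ in range(7))
for t in range(n_obs):
    a_pred[t], P_pred[t], A_pred[t] = a, P, A
    nu[t], F[t] = y[t] - a, P + g**2
    K[t] = phi * P / F[t]
    V[t] = (1.0 if t == i - 1 else 0.0) - A     # zero before the intervention
    a, P = phi * a + K[t] * nu[t], phi * P * (phi - K[t]) + h**2
    A = phi * A + K[t] * V[t]

# Backward pass: r_{t-1} from (10) and R^+_{t-1} from (15).
r = R = 0.0
r_lag, R_lag = np.zeros(n_obs), np.zeros(n_obs)
for t in reversed(range(n_obs)):
    L = phi - K[t]
    r = nu[t] / F[t] + L * r
    R = V[t] / F[t] + L * R
    r_lag[t], R_lag[t] = r, R
u_i, M_i = np.sum(V * nu / F), np.sum(V * V / F)   # s_T^+ and S_T^+ (Theorem 1)

smoothed_full = a_pred + P_pred * r_lag
influence = (A_pred + P_pred * R_lag) * u_i / M_i   # right-hand side of (16)

# Brute force: E(alpha_t | Y_T) and E(alpha_t | Y_T without y_i) from covariances.
idx = np.arange(n_obs)
C = P1 * phi ** np.abs(idx[:, None] - idx[None, :])   # Cov(alpha_t, y_s) in this model
Omega = C + g**2 * np.eye(n_obs)
keep = idx != i - 1
direct_full = C @ np.linalg.solve(Omega, y)
direct_skip = C[:, keep] @ np.linalg.solve(Omega[np.ix_(keep, keep)], y[keep])
```

The vector `influence` matches `direct_full - direct_skip` for every $t$, so a single augmented filtering and smoothing pass reproduces the effect of deleting $y_i$ on all smoothed states.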
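We now show that

$$R_{i-1}^+ = Z_i' M_i - T_i' N_i K_i. \tag{17}$$

Indeed,

$$\begin{aligned}
R_{i-1}^+ &= Z_i' F_i^{-1} + L_i' Z_{i+1}' F_{i+1}^{-1} V_{i+1}^+ + \cdots + L_{T,i}' Z_T' F_T^{-1} V_T^+ \\
&= Z_i' F_i^{-1} - (L_i' Z_{i+1}' F_{i+1}^{-1} Z_{i+1} K_i + \cdots + L_{T,i}' Z_T' F_T^{-1} Z_T L_{T,i+1} K_i) \\
&= Z_i' F_i^{-1} - L_i' N_i K_i \\
&= Z_i' F_i^{-1} - T_i' N_i K_i + Z_i' K_i' N_i K_i \\
&= Z_i' M_i - T_i' N_i K_i,
\end{aligned}$$

which proves (17). Therefore, since $A_i^+ = 0$, setting $t = i$ in (16) we recover the result derived in De Jong (1996), $\tilde{\alpha}_i - \tilde{\alpha}_i^{(m)} = P_i (Z_i' M_i - T_i' N_i K_i) M_i^{-1} u_i$.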
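Let $\hat{\varepsilon}_t = E(\varepsilon_t \mid Y_T)$ denote the smoothed disturbance. Koopman (1993) shows that

$$\hat{\varepsilon}_t = G_t' F_t^{-1} \nu_t + J_t' r_t.$$

We are now interested in assessing the influence of the $i$th observation on the smoothed estimate of the disturbance $\varepsilon_t$.
For this purpose, we denote $\hat{\varepsilon}_t^{(m)} = E(\varepsilon_t \mid Y_T^{(i)})$, and, taking the expectation of both sides of (1) alternatively with respect to $Y_T$ and $Y_T^{(i)}$,

$$y_t = Z_t \tilde{\alpha}_t + G_t \hat{\varepsilon}_t,$$

$$y_t = Z_t \tilde{\alpha}_t^{(m)} + G_t \hat{\varepsilon}_t^{(m)} \quad (t \neq i),$$

we write

$$\begin{aligned}
Z_t (\tilde{\alpha}_t - \tilde{\alpha}_t^{(m)}) &= -G_t (\hat{\varepsilon}_t - \hat{\varepsilon}_t^{(m)}), \quad t \neq i, \\
y_i - E(y_i \mid Y_T^{(i)}) &= Z_i (\tilde{\alpha}_i - \tilde{\alpha}_i^{(m)}) + G_i (\hat{\varepsilon}_i - \hat{\varepsilon}_i^{(m)}).
\end{aligned} \tag{18}$$

Moreover, from the transition equation (2),

$$\tilde{\alpha}_{t+1} - \tilde{\alpha}_{t+1}^{(m)} = T_t (\tilde{\alpha}_t - \tilde{\alpha}_t^{(m)}) + H_t (\hat{\varepsilon}_t - \hat{\varepsilon}_t^{(m)}). \tag{19}$$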
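Theorem 5. Defining

$$E_t^+ = G_t' F_t^{-1} V_t^+ + J_t' R_t^+, \tag{20}$$

the scaled smoothed disturbances of the transition equation are given by

$$H_t (\hat{\varepsilon}_t - \hat{\varepsilon}_t^{(m)}) = H_t E_t^+ M_i^{-1} u_i. \tag{21}$$

Moreover,

$$H_i (\hat{\varepsilon}_i - \hat{\varepsilon}_i^{(m)}) = H_i (G_i' M_i - H_i' N_i K_i) M_i^{-1} u_i.$$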
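Proof.

$$\begin{aligned}
H_t (\hat{\varepsilon}_t - \hat{\varepsilon}_t^{(m)}) &= (\tilde{\alpha}_{t+1} - \tilde{\alpha}_{t+1}^{(m)}) - T_t (\tilde{\alpha}_t - \tilde{\alpha}_t^{(m)}) \\
&= [(A_{t+1}^+ + P_{t+1} R_t^+) - T_t (A_t^+ + P_t R_{t-1}^+)] M_i^{-1} u_i \\
&= [K_t V_t^+ + (T_t P_t L_t' + H_t J_t') R_t^+ - T_t P_t (Z_t' F_t^{-1} V_t^+ + L_t' R_t^+)] M_i^{-1} u_i \\
&= [K_t V_t^+ + H_t J_t' R_t^+ - T_t P_t Z_t' F_t^{-1} V_t^+] M_i^{-1} u_i \\
&= H_t (G_t' F_t^{-1} V_t^+ + J_t' R_t^+) M_i^{-1} u_i.
\end{aligned}$$

Note that, for $t < i$, $E_t^+ = J_t' R_t^+$. Also, $H_0 (\hat{\varepsilon}_0 - \hat{\varepsilon}_0^{(m)}) = H_0 H_0' R_0^+ M_i^{-1} u_i$.
In order to prove the last statement, we first note that

$$\begin{aligned}
R_i^+ &= Z_{i+1}' F_{i+1}^{-1} V_{i+1}^+ + L_{i+1}' Z_{i+2}' F_{i+2}^{-1} V_{i+2}^+ + \cdots + L_{T,i+1}' Z_T' F_T^{-1} V_T^+ \\
&= -(Z_{i+1}' F_{i+1}^{-1} Z_{i+1} K_i + L_{i+1}' Z_{i+2}' F_{i+2}^{-1} Z_{i+2} L_{i+1} K_i + \cdots + L_{T,i+1}' Z_T' F_T^{-1} Z_T L_{T,i+1} K_i) \\
&= -N_i K_i,
\end{aligned}$$

which gives

$$\begin{aligned}
E_i^+ &= G_i' F_i^{-1} V_i^+ + J_i' R_i^+ \\
&= G_i' F_i^{-1} + H_i' R_i^+ - G_i' K_i' R_i^+ \\
&= G_i' M_i - H_i' N_i K_i. \quad □
\end{aligned}$$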
Theorem 6. For $t \neq i$ the scaled smoothed measurement disturbance is given as follows:

$$G_t (\hat{\varepsilon}_t - \hat{\varepsilon}_t^{(m)}) = G_t E_t^+ M_i^{-1} u_i. \tag{22}$$

Proof. From (18),

$$\begin{aligned}
G_t (\hat{\varepsilon}_t - \hat{\varepsilon}_t^{(m)}) &= -Z_t (\tilde{\alpha}_t - \tilde{\alpha}_t^{(m)}) \\
&= -Z_t (A_t^+ + P_t R_{t-1}^+) M_i^{-1} u_i \\
&= [-Z_t A_t^+ - Z_t P_t (Z_t' F_t^{-1} V_t^+ + L_t' R_t^+)] M_i^{-1} u_i \\
&= [V_t^+ - (F_t - G_t G_t') F_t^{-1} V_t^+ - Z_t P_t L_t' R_t^+] M_i^{-1} u_i \\
&= G_t (G_t' F_t^{-1} V_t^+ + J_t' R_t^+) M_i^{-1} u_i \\
&= G_t E_t^+ M_i^{-1} u_i.
\end{aligned}$$

In the derivation we used the easily established relation $Z_t P_t L_t' = -G_t J_t'$. □
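The relation $Z_t P_t L_t' = -G_t J_t'$ follows from $F_t K_t' = Z_t P_t T_t' + G_t H_t'$ and can be confirmed on arbitrary system matrices. The Python sketch below uses randomly generated matrices of hypothetical dimensions; it is a numerical check of the identity only, not part of the original derivation.

```python
import numpy as np

rng = np.random.default_rng(6)
N, m, k = 2, 3, 4                        # hypothetical sizes: observations, states, disturbances
Z, G = rng.standard_normal((N, m)), rng.standard_normal((N, k))
T, H = rng.standard_normal((m, m)), rng.standard_normal((m, k))
B = rng.standard_normal((m, m))
P = B @ B.T                              # any symmetric, positive semidefinite P_t

F = Z @ P @ Z.T + G @ G.T
K = (T @ P @ Z.T + H @ G.T) @ np.linalg.inv(F)
L, J = T - K @ Z, H - K @ G
lhs, rhs = Z @ P @ L.T, -G @ J.T         # both sides of Z P L' = -G J'
```

The identity relies on the symmetry of $P_t$ and $F_t$, which is why `P` is built as `B @ B.T`.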
Theorem 7. The influence of $y_i$ on the smoothed disturbance is

$$\hat{\varepsilon}_t - \hat{\varepsilon}_t^{(m)} = E_t^+ M_i^{-1} u_i. \tag{23}$$

Proof. The proof is immediate, as the matrix $[G_t' \; H_t']'$ has full column rank. □
In conclusion, the computation of the influence for $\varepsilon_t$ and $\alpha_t$, via the forward recursion (19), depends on quantities readily available from a run of the smoothing filter on the dummy variable. $E_t^+$ provides a measure of leverage of $y_i$ on $\hat{\varepsilon}_t$.
5. Conclusions
The paper has shown the equivalence between the skipping approach and the dummy variable (additive outlier) approach for both likelihood and smoothed inferences, and has used the latter to derive suitable algorithms for computing deletion diagnostics. The extension to the class of nonstationary state space models is available from the author.
References
Bell, W.R., 1989. Discussion of the paper by Bruce and Martin: leave-k-out diagnostics for time series. J. Roy. Statist. Soc. Ser. B 51,
408–409.
Bruce, A.G., Martin, R.D., 1989. Leave-k-out diagnostics for time series. J. Roy. Statist. Soc. Ser. B 51, 363–424 (with discussion).
De Jong, P., 1989. Smoothing and interpolation with the state space model. J. Amer. Statist. Assoc. 84, 1085–1088.
De Jong, P., 1991. The diffuse Kalman filter. Ann. Statist. 19, 1073–1083.
De Jong, P., 1996. Fixed interval smoothing. Working paper, London School of Economics.
De Jong, P., Penzer, J., 1998. Diagnosing shocks in time series. J. Amer. Statist. Assoc. 93, 796–806.
Fuller, W.A., 1996. Introduction to Statistical Time Series. Wiley Series in Probability and Statistics. Wiley, New York.
Gomez, V., Maravall, A., Pena, D., 1999. Missing observations in ARIMA models: skipping approach versus additive outlier approach.
J. Econometrics 88, 341–363.
Harvey, A.C., Koopman, S.J., Penzer, J., 1998. Messy time series. In: Fomby, T.B., Hill, R.C. (Eds.), Advances in Econometrics, vol. 13.
JAI Press, New York.
Koopman, S.J., 1993. Disturbance smoother for state space models. Biometrika 80, 117–126.
Ljung, G.M., 1993. On outlier detection in time series. J. Roy. Statist. Soc. Ser. B 55, 559–567.
Rosenberg, B., 1973. Random coefficient models: the analysis of a cross-section of time series by stochastically convergent parameter
regression. Ann. Econom. Social Measurement 2, 399–428.
Sargan, J.D., Drettakis, E.G., 1974. Missing data in an autoregressive model. Internat. Econom. Rev. 15, 39–58.