Spectral Filtering for Trend Estimation¹

Marco Donatelli, Alessandra Luati, Andrea Martinelli

University of Insubria, University of Bologna, University of Insubria

Abstract

This paper deals with trend estimation at the boundaries of a time series by means of smoothing methods. After deriving the asymptotic properties of sequences of matrices associated with linear smoothers, two classes of asymmetric filters that approximate a given symmetric estimator are introduced: the reflective filters and antireflective filters. The associated smoothing matrices, though non-symmetric, have analytically known spectral decomposition. The paper analyses the properties of the new filters and considers reflective and antireflective algebras for approximating the eigensystems of time series smoothing matrices. Based on this, a thresholding strategy for a spectral filter design is discussed.

Keywords: Smoothing, asymmetric filters, current analysis, matrix algebras, filter design.

1. Introduction

Let us consider the time series model,

$$y_t = \mu_t + \varepsilon_t, \qquad t = 1, \dots, n,$$

where $y_t$ is the observed time series, $\mu_t$ is the trend component, also termed the signal, and $\varepsilon_t$ is the noise, or irregular, component. The signal $\mu_t$ can be a random or deterministic function of time, whereas the most common assumption for the noise $\varepsilon_t$ is that it follows a zero-mean stationary process, either white noise and/or Gaussian. The interest is in estimating $\mu_t$ based on the available information, with the aim of separating permanent movements from transitory oscillations. To this purpose, smoothing methods such as local polynomial regression or kernel regression are often applied (see [12], [14], [18] and [20] among others; see also [16] for a general treatment of signal extraction). These methods provide linear estimators of the trend based on weighted averages of the observations, $\mu_t = \sum_{j=-h}^{h} w_j y_{t+j}$ for $t = h+1, \dots, n-h$, where $\{w_{-h}, \dots, w_0, \dots, w_h\}$ is a symmetric filter, $w_j = w_{-j}$, satisfying the unbiasedness condition with respect to a constant trend, $\sum_{j=-h}^{h} w_j = 1$.

¹ This is a preprint of a paper in Linear Algebra Appl., 473 (2015), pp. 217–235.

Preprint submitted to Linear Algebra and its Applications, July 16, 2015.

Estimates for the trend at the first and last $h$ time points are obtained with asymmetric and time-varying filters. Notice that estimates for the trend at the last $h$ time points are crucial in current analysis. Moreover, statistical agencies often face the problem of approximating a given symmetric filter with a set of asymmetric ones ([5]). A typical example is provided by the Census X11 and X11/X12 ARIMA software (see [6] and [13]), where the central trend is estimated with the symmetric Henderson filters ([11]), while surrogate filters for the trend at the boundaries are used (see [8] and [9]).

In practice, a filtering matrix $S$ is applied to the time series of observations collected in an $n$-dimensional vector $y$ to obtain the vector of estimates $\mu = Sy$. Specifically, $S \in \mathbb{R}^{n \times n}$ is a matrix filter that assumes the following form

$$S = \begin{bmatrix} \begin{matrix} S_P & O_{h \times (n-2h)} \end{matrix} \\ S_I \\ \begin{matrix} O_{h \times (n-2h)} & S_F \end{matrix} \end{bmatrix} \tag{1}$$

where $S_I \in \mathbb{R}^{(n-2h) \times n}$ is the Toeplitz matrix associated with the symmetric filter:

$$S_I = \begin{bmatrix} w_{-h} & \cdots & w_0 & \cdots & w_h & & \\ & \ddots & & \ddots & & \ddots & \\ & & w_{-h} & \cdots & w_0 & \cdots & w_h \end{bmatrix},$$

where $w_{-j} = w_j$, that is, the starting filters are symmetric, while $S_P \in \mathbb{R}^{h \times 2h}$ and $S_F \in \mathbb{R}^{h \times 2h}$ are the matrices associated with the asymmetric filters. The use of a symmetric filter implies that past and future observations have the same relevance for trend estimation. The same assumption requires that $S_P = J_h S_F J_{2h}$, where $J_k$ denotes the flip matrix such that $(J_k)_{ij} = 1$ if $i + j = k + 1$ and zero otherwise, for $i, j = 1, \dots, k$.


Several methods have been proposed to treat the problem of current analysis based on asymmetric filters, that is, to derive or specify $S_F$; see the recent discussion in [17]. When the spectral properties of the matrices associated with the linear trend estimators are of interest, two classes of filters deserve attention. These are the classes of reflective filters and antireflective filters. The relevant property that the matrices associated with reflective or antireflective filters share is that they have analytically known eigenvalues and eigenvectors. The latter can be interpreted as the latent components of any time series that the filter smooths through the corresponding eigenvalues. One can show (see [15]) that eigenvectors associated with eigenvalues (in modulus) close to one represent low-frequency components that form the long-run trend of the series. On the other hand, eigenvectors associated with eigenvalues close to zero represent short-period components usually associated with noise. Hence, eigenvalue-based inferential procedures can be developed. Moreover, matrices belonging to the reflective and antireflective algebras (and their eigensystems) can be used to approximate non-symmetric smoothing matrices (and their eigensystems) arising in many contexts, e.g. in local polynomial regression, as previously considered in [15].

The contribution of this paper is three-fold and covers theoretical, methodological and numerical aspects. First, we prove general asymptotic results on the eigenvectors and eigenvalues of sequences of smoothing matrices of varying order. Secondly, we introduce the smoothing matrices associated with reflective and antireflective filters and discuss their properties as current trend estimators. We then numerically evaluate their capability of approximating the unknown latent components of a given smoothing matrix. Finally, on the practical side, we discuss an eigenvalue-based filtering strategy.

The structure of the paper is the following. Section 2 illustrates the constructing principles of the reflective and antireflective concurrent filters. Section 3 contains the main theoretical results of the paper, i.e. the asymptotic properties of sequences $\{S_n\}$ both in the case of a generic smoothing matrix $S$ and in the case of a smoothing matrix that belongs to the reflective or antireflective algebra (Section 3.1). The approximation of $S$ with a matrix belonging to the reflective or antireflective algebra is considered in Section 3.2, along with a numerical analysis based on different asymmetric filters and boundary conditions. A new strategy for a time-domain eigenvalue-based filter design or improvement is illustrated in Section 4: the choice of a tuning parameter is discussed and simulation experiments are illustrated. Section 5 contains an empirical analysis and Section 6 concludes the paper.


2. Reflective and antireflective filters for current analysis

A common strategy for designing asymmetric filters consists in applying a symmetric filter to the observations extrapolated according to some criterion. The resulting asymmetric filter is then the convolution of the symmetric filter and the extrapolation matrix. In practice, the $h$ future, missing observations are obtained as $y_f = A y_p$, where $y_p$ is the vector containing the last available observations and $A$ is the extrapolating matrix; see [17] for examples. According to the extrapolating criterion, the dimensions of $A$ and consequently of $y_p$ may vary. In the case of reflective and antireflective filters, the extrapolating matrices to obtain

$$y_f = (y_{n+1}, y_{n+2}, \dots, y_{n+h})'$$

have the following form.

1. Reflective filter (RF). The extrapolating matrix is $A = J_h$, i.e.

$$y_f = (y_n, y_{n-1}, \dots, y_{n-h+1})'.$$

2. Antireflective filter (AF). We assume

$$A = \left[\begin{array}{c|c} -J_h & \begin{matrix} 2 \\ \vdots \\ 2 \end{matrix} \end{array}\right],$$

i.e. the extrapolated values are obtained by imposing a global symmetry around $y_n$:

$$y_{n+\ell} = y_n - (y_{n-\ell} - y_n) = 2y_n - y_{n-\ell}, \qquad \ell \in \{1, \dots, h\}.$$
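The two extrapolation rules can be sketched in a few lines of Python. This is only an illustration under our own naming: the function names and the toy series are assumptions, not part of the paper.

```python
def reflective_extrapolation(y, h):
    """Return (y_{n+1}, ..., y_{n+h}) = (y_n, y_{n-1}, ..., y_{n-h+1})."""
    if h > len(y):
        raise ValueError("need at least h observations")
    return [y[-l] for l in range(1, h + 1)]


def antireflective_extrapolation(y, h):
    """Return y_{n+l} = 2*y_n - y_{n-l} for l = 1, ..., h."""
    if h >= len(y):
        raise ValueError("need at least h + 1 observations")
    return [2 * y[-1] - y[-1 - l] for l in range(1, h + 1)]


y = [1.0, 2.0, 4.0, 7.0]                   # toy series with n = 4 (illustrative)
print(reflective_extrapolation(y, 2))      # [7.0, 4.0]
print(antireflective_extrapolation(y, 2))  # [10.0, 12.0]
```

The reflective rule mirrors the last observations, whereas the antireflective rule continues the series through a point symmetry around the last value, so an increasing series keeps increasing.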

Figure 1 shows the extrapolated series using reflective and antireflective filters and the estimate of the series at the last data point obtained using reflective and antireflective boundary conditions.

[Figure 1: Extrapolating and smoothing with reflective and antireflective boundary conditions. Plot titled "Time series asymmetric filtering", y against Time; legend: Series, R forecast, AR forecast, R filtered, AR filtered.]

In practice, using RF, the last row of $S_F$ in (1) is the vector $(0, \dots, 0, w_h, w_h + w_{h-1}, \dots, w_0 + w_1)'$, while using AF it is the vector $(0, \dots, 0, w_0 + 2\sum_{j=1}^{h} w_j) = (0, \dots, 0, 1)$, since the weights sum to one. The spectral decomposition of the smoothing matrices (1) associated with RF and AF is explicitly known and can be computed in $O(n \log n)$ operations (see Section 3.1). In Section 3.2 we will also show that both RF and AF produce a good approximation of asymmetric filters that have recently been introduced in [17]. These are the direct asymmetric filters (DAF), where the missing observations are extrapolated by forecasting from a polynomial of the same order as the one that generates the trend and thus produce unbiased trend estimates, and the generalised asymmetric filters (GAF), which are minimum mean square revision error filters developed for reducing the variability of the DAF, at the expense of the bias. This is achieved by reducing the degree of the polynomial trend that the filter can reproduce, with respect to the degree that is assumed for the trend at the boundaries.
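The last rows of the RF and AF smoothing matrices can be checked numerically. The sketch below builds the full matrix by applying the symmetric filter to the extrapolated series; the function name `smoothing_matrix`, the treatment of the left boundary by mirror symmetry, and the 5-term weights are assumptions made for illustration.

```python
def smoothing_matrix(w, n, bc):
    """n x n smoothing matrix for symmetric weights w = [w_{-h}, ..., w_h].

    bc = "reflective"     uses y_{1-l} = y_l          and y_{n+l} = y_{n-l+1};
    bc = "antireflective" uses y_{1-l} = 2y_1 - y_{1+l} and y_{n+l} = 2y_n - y_{n-l}.
    Indices are 0-based, so row t estimates the trend at time t + 1.
    """
    h = (len(w) - 1) // 2
    S = [[0.0] * n for _ in range(n)]
    for t in range(n):
        for j in range(-h, h + 1):
            s, wj = t + j, w[j + h]
            if 0 <= s < n:
                S[t][s] += wj                      # observation is available
            elif bc == "reflective":
                S[t][-s - 1 if s < 0 else 2 * n - s - 1] += wj
            else:                                  # antireflective
                edge = 0 if s < 0 else n - 1
                S[t][edge] += 2 * wj               # the 2*y_edge term
                S[t][2 * edge - s] -= wj           # the mirrored observation
    return S


w = [0.125, 0.25, 0.25, 0.25, 0.125]               # illustrative weights, h = 2
print(smoothing_matrix(w, 6, "reflective")[-1])     # [0.0, 0.0, 0.0, 0.125, 0.375, 0.5]
print(smoothing_matrix(w, 6, "antireflective")[-1]) # [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
```

With these weights the reflective last row is $(w_2, w_2 + w_1, w_1 + w_0)$ in its final three positions, while the antireflective last row reproduces the last observation exactly.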

3. Asymptotic spectral decomposition

In this section, we determine the asymptotic properties of the eigenvalues and eigenvectors of the sequence of matrices $\{S_n\}$, having the same structure as $S$ and order $n$.

Specifically, we will show that the eigenvalues are distributed as a uniform sampling of the symbol

$$z(x) = \sum_{j=-h}^{h} w_j e^{\mathrm{i} j x}, \tag{2}$$

that is, the transfer function of the filter. Since $w_j = w_{-j}$, $z$ can be simply written as $z(x) = w_0 + 2\sum_{j=1}^{h} w_j \cos(jx)$, with $z(x) \in [-1, 1]$, $x \in [-\pi, \pi]$.
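As a small illustration, the real form of the symbol can be evaluated directly. The function name `symbol` and the 3-term weights are our own illustrative choices.

```python
import math


def symbol(w, x):
    """Transfer function z(x) = w_0 + 2 * sum_{j=1}^h w_j cos(j x).

    w lists the symmetric weights [w_{-h}, ..., w_0, ..., w_h].
    """
    h = (len(w) - 1) // 2
    return w[h] + 2 * sum(w[h + j] * math.cos(j * x) for j in range(1, h + 1))


w = [0.25, 0.5, 0.25]          # illustrative 3-term filter
print(symbol(w, 0.0))          # 1.0: the weights sum to one
print(symbol(w, math.pi))      # 0.0: the highest frequency is removed
```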

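Let $F$ be the $n$-dimensional discrete Fourier transform matrix with entries

$$F = \left[\frac{1}{\sqrt{n}} \exp\left(\mathrm{i}\,\frac{(i-1)(j-1)\,2\pi}{n}\right)\right]_{i,j=1}^{n}, \qquad \mathrm{i} = \sqrt{-1}. \tag{3}$$

We define the set of circulant matrices $\mathcal{C} = \{M \in \mathbb{C}^{n \times n} \mid M = F \Lambda F^H$, where $\Lambda$ is a diagonal matrix$\}$. We denote a matrix in $\mathcal{C}$ as $C_n(\theta)$, where $\theta$ is a continuous function in $[-\pi, \pi]$, when $\Lambda = \mathrm{diag}(\theta(x))$, with $x_i = (i-1)2\pi/n - \pi$, for $i = 1, \dots, n$. We recall that $F$ is a unitary matrix, so that the eigenvalues of $C_n(\theta)$ are $\theta(x_i)$, $i = 1, \dots, n$.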

Subsequently, a generic $n \times n$ matrix is denoted by $M$, with generic elements $m_{i,j}$, $i, j = 1, \dots, n$, and with $\lambda_j(M)$ and $\sigma_j(M)$, for $j = 1, \dots, n$, its eigenvalues and singular values, respectively. The spectral norm is denoted by $\|M\| = \sigma_1(M)$, the Frobenius norm by $\|M\|_F = \sqrt{\sum_{i,j=1}^{n} |m_{i,j}|^2}$ and the trace-norm by $\|M\|_1 = \sum_{j=1}^{n} \sigma_j(M)$ (see [3]).

We first prove that the sequences of matrices $\{S_n\}$ and $\{C_n(z)\}$, where $z$ is the transfer function of the filter, have the eigenvalues equally distributed and that their eigenvalue distribution is $z$. The result is a consequence of the following remark and the results in [7] and is stated in Theorem 1.

Remark 1. For $n$ large enough, $S_n - C_n(z)$ has rank at most $2h$ and a constant number of nonzero elements, located only at the corners and independent of $n$. It follows that

$$\|S_n - C_n(z)\|_F = O(1), \qquad n \to \infty. \tag{4}$$
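Remark 1 can be illustrated numerically. The sketch below assumes reflective boundary filters for $S_n$ and takes, as a stand-in for $C_n(z)$, the circulant matrix generated by the same weights with periodic wrap-around; both choices, like the function names and the weights, are assumptions made for illustration.

```python
import math


def reflective_matrix(w, n):
    """Smoothing matrix with reflective boundaries, weights w = [w_{-h}, ..., w_h]."""
    h = (len(w) - 1) // 2
    S = [[0.0] * n for _ in range(n)]
    for t in range(n):
        for j in range(-h, h + 1):
            s = t + j
            if s < 0:
                s = -s - 1             # y_{1-l} = y_l
            elif s >= n:
                s = 2 * n - s - 1      # y_{n+l} = y_{n-l+1}
            S[t][s] += w[j + h]
    return S


def circulant_matrix(w, n):
    """Circulant matrix generated by the same weights (periodic wrap-around)."""
    h = (len(w) - 1) // 2
    C = [[0.0] * n for _ in range(n)]
    for t in range(n):
        for j in range(-h, h + 1):
            C[t][(t + j) % n] += w[j + h]
    return C


def frobenius_distance(A, B):
    return math.sqrt(sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb)))


def differing_rows(A, B, tol=1e-14):
    """Number of rows in which A and B differ."""
    return sum(1 for ra, rb in zip(A, B) if any(abs(a - b) > tol for a, b in zip(ra, rb)))


w = [0.125, 0.25, 0.25, 0.25, 0.125]   # illustrative weights, h = 2
for n in (20, 40, 80):
    S, C = reflective_matrix(w, n), circulant_matrix(w, n)
    print(n, frobenius_distance(S, C), differing_rows(S, C))
```

In this setting the two matrices differ only in the first and last $h$ rows, and the Frobenius distance does not change with $n$.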

Theorem 1. Let $S_n$ and $z$ be the matrix and the function defined in (1) and (2), respectively. For every continuous complex function $f$ with compact support, it holds that

$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} f(\lambda_j(S_n)) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(z(t))\,dt.$$

Proof. From Remark 1,

$$\|S_n - C_n(z)\|_1 \le 2h\,\sigma_1(S_n - C_n(z));$$

furthermore, for all $M$, $\sigma_1(M) \le \|M\|_F$, so $\sigma_1(S_n - C_n(z)) = O(1)$. Then

$$\|S_n - C_n(z)\|_1 = O(1), \tag{5}$$

as $n \to \infty$. Moreover, $\|S_n - C_n(z)\|$ and $\|C_n(z)\|$ are uniformly bounded by a constant independent of $n$. Therefore, from Theorem 3.4 in [7] it follows that $\{S_n\}$ and $\{C_n(z)\}$ are equally distributed (in the sense of eigenvalues) and the thesis follows from

$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} f(\lambda_j(C_n(z))) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(z(t))\,dt.$$

As a consequence of Remark 1, (5), and Theorem 3.5 in [7], we can deduce the following stronger result.

Theorem 2. For all $\varepsilon > 0$,

$$\#\{\lambda_j(S_n) \notin [-1-\varepsilon, 1+\varepsilon] \times \mathrm{i}[-\varepsilon, \varepsilon],\; j = 1, \dots, n\} = O(1), \qquad n \to \infty.$$

Proof. From our hypothesis on the coefficients $w_j$ it holds that $z(x) \in [-1, 1]$ and hence $\lambda_j(C_n(z)) \in [-1, 1]$. Finally, recalling that $\|S_n - C_n(z)\|$ and $\|C_n(z)\|$ are uniformly bounded by a constant independent of $n$, from (5) and Theorem 3.5 in [7], $\{S_n\}$ is strongly clustered at $[-1, 1]$ and the thesis follows.

With a similar approach we analyse the behavior of the eigenvectors of $S_n$: we say that the eigenvectors of a matrix sequence $\{M_n\}$, where $M_n$ is an $n \times n$ matrix, are distributed like the sequence of unit vectors $\{q_k^{(n)}\}$, where $q_k^{(n)} \in \mathbb{C}^n$ for $k = 1, \dots, n$, i.e., $\|q_k^{(n)}\|_2 = 1$, if the discrepancies

$$r_k^{(n)} = \left\| M_n q_k^{(n)} - \left( (q_k^{(n)})^H M_n q_k^{(n)} \right) q_k^{(n)} \right\|_2$$

are strongly clustered at zero, in the sense that for all $\varepsilon > 0$

$$\#\left\{ k \in \{1, \dots, n\} : r_k^{(n)} > \varepsilon \right\} = O(1), \qquad n \to \infty.$$

We note that the discrepancies provide a measure of how much the $\{q_k^{(n)}\}$ behave like the eigenvectors of $M_n$. The following result follows directly from (4).

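Theorem 3. For $\{S_n\}$ the eigenvectors are distributed like the Fourier vectors (the columns of the Fourier matrix $F$ in (3)).

Proof. Denoting by $m_k$ the $k$-th column of $M$, if $\|M\|_F = O(1)$ then the $\|m_k\|_2$ are strongly clustered at zero. Indeed, it holds

$$O(1) = \|M\|_F^2 = \sum_{k=1}^{n} \|m_k\|_2^2 \ge \sum_{\{k : \|m_k\|_2 > \varepsilon\}} \|m_k\|_2^2 \ge \varepsilon^2 \cdot \#\{k : \|m_k\|_2 > \varepsilon\}$$

and hence $\#\{k \in \{1, \dots, n\} : \|m_k\|_2 > \varepsilon\} = O(1)$. Therefore, the thesis follows from $\|S_n F - F\,\mathrm{diag}(F^H S_n F)\|_F^2 = O(1)$, where $\mathrm{diag}(M)$ is the $n \times n$ diagonal matrix such that $\mathrm{diag}(M)_{i,j} = m_{i,j}$ if $i = j$ and zero otherwise. Fixing $\Lambda = F^H C_n(z) F$, from (4) we have

$$\begin{aligned} \|S_n F - F\,\mathrm{diag}(F^H S_n F)\|_F &\le \|S_n F - F\Lambda\|_F + \|F\Lambda - F\,\mathrm{diag}(F^H S_n F)\|_F \\ &\le \|S_n - C_n(z)\|_F + \|\Lambda - F^H S_n F\|_F \\ &= O(1). \end{aligned}$$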

Remark 2. Since $z(x)$ is a real even function, we can approximate $\{S_n\}$ using the set of matrices diagonalized by the discrete cosine or sine transforms, introduced below, i.e. based on reflective and antireflective, rather than circulant, matrices.

3.1. Spectral decomposition of reflective and antireflective matrices

Let $U$ be the $n$-dimensional discrete cosine transform matrix with entries

$$U = \left[\sqrt{\frac{2 - \delta_{j,1}}{n}} \cos\left(\frac{(2i-1)(j-1)\pi}{2n}\right)\right]_{i,j=1}^{n}, \qquad \delta_{j,1} = \begin{cases} 1 & j = 1, \\ 0 & j \ne 1. \end{cases}$$

It is known that $U$ is orthogonal ($U'U = I$). Moreover, for every $n$-dimensional vector $v$, the matrix-vector multiplication $Uv$ can be computed in $O(n \log n)$ real operations.

The spectral decomposition of $S$ in the case of RF is

$$S = U D U', \tag{6}$$

where $D = \mathrm{diag}(z(x))$, with $x_i = (i-1)\pi/n$, for $i = 1, \dots, n$, and $z$ is the symbol defined in (2).

From (6), $U'S = DU'$ and consequently $U'Se_1 = DU'e_1$, with $e_t$ denoting the $t$-th vector of the canonical basis. Therefore the eigenvalues $[D]_{i,i}$ of $S$ are given by $D_{i,i} = [U'(Se_1)]_i / [U'e_1]_i$, $i = 1, \dots, n$. Hence, the eigenvalues of $S$ can be obtained by taking a discrete cosine transform of the first column of $S$.
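The ratio formula for the eigenvalues can be checked on a small reflective matrix. The sketch below uses plain $O(n^2)$ sums instead of a fast cosine transform, and its function names and weights are illustrative assumptions.

```python
import math


def reflective_matrix(w, n):
    """Smoothing matrix with reflective boundaries, weights w = [w_{-h}, ..., w_h]."""
    h = (len(w) - 1) // 2
    S = [[0.0] * n for _ in range(n)]
    for t in range(n):
        for j in range(-h, h + 1):
            s = t + j
            if s < 0:
                s = -s - 1
            elif s >= n:
                s = 2 * n - s - 1
            S[t][s] += w[j + h]
    return S


def symbol(w, x):
    h = (len(w) - 1) // 2
    return w[h] + 2 * sum(w[h + j] * math.cos(j * x) for j in range(1, h + 1))


def eigenvalues_from_first_column(S):
    """D_ii = [U'(S e_1)]_i / [U' e_1]_i; the normalisation of U cancels in the ratio."""
    n = len(S)
    first_col = [row[0] for row in S]
    eig = []
    for i in range(n):
        # unnormalised cosine vector of frequency i*pi/n
        u = [math.cos(i * (2 * j - 1) * math.pi / (2 * n)) for j in range(1, n + 1)]
        eig.append(sum(a * b for a, b in zip(u, first_col)) / u[0])
    return eig


w = [0.125, 0.25, 0.25, 0.25, 0.125]   # illustrative weights
n = 12
eig = eigenvalues_from_first_column(reflective_matrix(w, n))
expected = [symbol(w, i * math.pi / n) for i in range(n)]
print(max(abs(a - b) for a, b in zip(eig, expected)))   # close to machine precision
```

A production implementation would replace the explicit sums with a fast cosine transform to reach the $O(n \log n)$ cost.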

Let $Q$ be the sine transform matrix of order $n - 2$ with entries

$$Q = \sqrt{\frac{2}{n-1}} \left[\sin\left(\frac{ij\pi}{n-1}\right)\right]_{i,j=1}^{n-2}.$$

Then the antireflective transform can be defined by the matrix (see [1])

$$T = \left[\begin{array}{c|c|c} & 0 & \\ p & Q & Jp \\ & 0 & \end{array}\right], \tag{7}$$

whose first and last columns are the $n$-dimensional vectors $p$ and $Jp$, whose central block is $Q$ bordered above and below by a zero row, and where

$$p_j = \sqrt{\frac{6(n-1)}{n(2n-1)}} \left(1 - \frac{j-1}{n-1}\right),$$

for $j = 1, \dots, n$. We note that $\|p\| = 1$.

The spectral decomposition of $S$ in the case of AF is

$$S = T\,\mathrm{diag}(z(x))\,T^{-1}, \tag{8}$$

with $x$ defined as $x_{j+1} = j\pi/(n-1)$ for $j = 0, \dots, n-2$, and $x_n = 0$. The eigenvalues of $S$ can be computed in $O(n \log n)$ real operations resorting to the discrete sine transform (see [2]).

The good approximation property of $S$ in (8) is evident from the structure of $T$. Indeed, from (7) we have that the first and the last eigenvectors of $S$ are uniform samplings of a linear function and the corresponding eigenvalue is $z(0) = \sum_{j=-h}^{h} w_j = 1$. This is a direct consequence of the fact that antireflective boundary conditions were proposed to preserve the continuity of the first derivative [19] and hence linear signals.
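The eigenpairs in (8) can be verified on a small example by checking $Sv = \lambda v$ column by column. In this sketch the columns of $T$ are left unnormalised, and the matrix builder, function names and weights are illustrative assumptions.

```python
import math


def antireflective_matrix(w, n):
    """Smoothing matrix with antireflective boundaries, weights w = [w_{-h}, ..., w_h]."""
    h = (len(w) - 1) // 2
    S = [[0.0] * n for _ in range(n)]
    for t in range(n):
        for j in range(-h, h + 1):
            s, wj = t + j, w[j + h]
            if 0 <= s < n:
                S[t][s] += wj
            else:
                edge = 0 if s < 0 else n - 1
                S[t][edge] += 2 * wj          # y_out = 2*y_edge - y_mirrored
                S[t][2 * edge - s] -= wj
    return S


def symbol(w, x):
    h = (len(w) - 1) // 2
    return w[h] + 2 * sum(w[h + j] * math.cos(j * x) for j in range(1, h + 1))


def antireflective_eigenpairs(w, n):
    """Unnormalised columns of T in (7), paired with the eigenvalues z(x_j) in (8)."""
    pairs = [(symbol(w, 0.0), [1 - j / (n - 1) for j in range(n)])]    # p (linear)
    for k in range(1, n - 1):                                          # sine vectors
        v = [math.sin(j * k * math.pi / (n - 1)) for j in range(n)]
        pairs.append((symbol(w, k * math.pi / (n - 1)), v))
    pairs.append((symbol(w, 0.0), [j / (n - 1) for j in range(n)]))    # Jp (linear)
    return pairs


def max_residual(S, pairs):
    """Largest entry of |S v - lambda v| over all eigenpairs."""
    worst = 0.0
    for lam, v in pairs:
        for row, vi in zip(S, v):
            worst = max(worst, abs(sum(a * b for a, b in zip(row, v)) - lam * vi))
    return worst


w = [0.125, 0.25, 0.25, 0.25, 0.125]   # illustrative weights
n = 12
print(max_residual(antireflective_matrix(w, n), antireflective_eigenpairs(w, n)))
```

The two linear vectors are reproduced with eigenvalue $z(0) = 1$, which is the linear-signal preservation property discussed above.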

3.2. Approximation properties of algebra matrices

In the general case when $S$ is not diagonalizable using fast transforms, one strategy is to approximate $S$ with a matrix generated by the same symbol and belonging to the reflective or antireflective algebra, based on results on matrix spectrum perturbation, such as the Bauer-Fike theorem, as considered in [15] for reflective matrices.

Theorem 4 (Bauer-Fike). If $Y = XDX^{-1}$ with $D = \mathrm{diag}_{i=1,\dots,n}(\xi_i)$, then, for each eigenvalue $\lambda$ of $S$ there exists an index $i$ such that

$$|\lambda - \xi_i| \le \mu(X)\,\|S - Y\|,$$

where $\|\cdot\|$ is an absolute matrix norm (e.g. the spectral norm) and $\mu(X) = \|X\|\,\|X^{-1}\|$ is the condition number of $X$.

A drawback of the Bauer-Fike theorem is the lack of accuracy of its estimates: the theorem only gives an upper bound for the size of the perturbation. We therefore introduce the following quantity to evaluate the goodness of the approximation of $S$ by $Y$, rather than $\|S - Y\|_2$ as implicit in Theorem 4:

$$\delta(S - Y) = \max_{j=1,\dots,n} \left( \left|\lambda(S)^{\downarrow}_j - \lambda(Y)^{\downarrow}_j\right| \right), \tag{9}$$

where the vector $\lambda(M)^{\downarrow}$ contains the moduli of the eigenvalues of $M$, taken in non-increasing order. With this ordering, the modulus of the eigenvalues decreases as the frequency increases.

Based on (9), we perform a numerical comparison between the spectrum of the matrix $S$ and the spectrum of the approximating matrix belonging to the circulant, reflective or antireflective algebra, which we denote by $S_C$, $S_R$ and $S_A$, respectively. The results for $n = 100$ on direct asymmetric filters (DAF) and on generalised asymmetric filters (GAF) with cubic trend and quadratic reproduction constraints (CQ) and with quadratic trend and linear reproduction constraints (QL), see [17], are in Table 1, which reports the values of $\delta(S - Y)$ for varying $Y \in \{S_C, S_R, S_A\}$. The results show that the antireflective matrix gives the best approximation for all the filters. The reflective approximation is still better than the circulant one. Notice that the results do not change when $n$ increases, since $S - Y$ is of rank $2h$.

Figure 2 plots the pointwise differences between the ordered moduli of the eigenvalues, i.e. the vectors $s_Y = |\lambda(S)^{\downarrow} - \lambda(Y)^{\downarrow}|$ for $Y \in \{S_R, S_A\}$ and $S$ associated with the QL filter.

Table 1: $\delta(S - Y)$ for varying $Y$ and $S$ for $n = 100$.

                    QL       CQ       DAF
δ(S − S_C)        0.0774   0.1263   0.1438
δ(S − S_R)        0.0276   0.0770   0.0947
δ(S − S_A)        0.0101   0.0400   0.0593

[Figure 2: $s_Y$ for varying $Y$, $S$ associated with the QL filter and $n = 100$. Semi-logarithmic plot against the eigenvalue index (0 to 100); legend: Reflective, Antireflective.]

At the low frequencies, associated with the trend and represented in the picture by small values of the eigenvalue index, the antireflective approximation is pointwise better than the reflective one. This suggests a thresholding operation in the time domain that will be illustrated in the following section. From Figure 2, one can also notice the well-known theoretical result that while $S_R$ has only one eigenvalue equal to one and to an eigenvalue of $S$, $S_A$ has two eigenvalues equal to those of $S$.

4. Spectral filter design

The results of the preceding sections are applied to filter design in the time domain. The aim is to obtain estimates with smaller variance and almost the same bias as those produced by $S$. The need for further smoothing arises in practical applications when the filtering procedure produces ripples in the trend estimates that may lead to the wrong detection of turning points (see [4]). Of course, the constructive principles of the original symmetric filter are violated in order to obtain a smoother trend.

The method consists of modifying $S$ so that the $n - k$ high-frequency noisy components that the filter is not capable of eliminating are given smaller weight. This is done by defining a map that sends $S_n = V \Lambda V^{-1}$ to a new filtering matrix $S_n^{[k]}$ that produces a further smoothing. This kind of map can be obtained from the following transformation

$$S_n^{[k]} = V \Phi^{[k]} \Lambda V^{-1},$$

where $\Phi^{[k]} = \mathrm{diag}_{i=1,\dots,n}(\phi_i^{[k]})$ and the $\phi_i^{[k]}$ are specific functions of $k$ that act on the eigenvalues in $\Lambda$ (see [10]) by smoothing them. For example, the classical hard thresholding, i.e. the high-frequency cut-off, can be achieved by

$$\phi_i^{[k]} = \begin{cases} 1, & i \le k, \\ 0, & i > k. \end{cases} \tag{10}$$

This means that we only have to choose the parameter $k$.

In general, we require only that the $\phi_i^{[k]}$ are near 1 for small $i$ and near 0 for large $i$; in this way we can preserve low frequencies (high eigenvalues) and cut off high frequencies (low eigenvalues). The filter (10) can be replaced with any function with similar properties. For instance, two options are the following discrete functions:

$$\phi_i^{[k]}(\alpha) = \frac{1}{2} - \frac{\operatorname{atan}(\alpha(i - k))}{\pi},$$

or otherwise

$$\phi_i^{[k]}(\alpha) = \frac{k + \alpha - i}{2\alpha},$$

where $\alpha$ is a natural number such that $\alpha < \min\{k, n - k\}$.

Remark 3. We notice that also in these cases the choice of the parameter $k$ is crucial. Indeed, if $k$ is too large, then the noise is not filtered enough; vice versa, if $k$ is too small, some elements of the signal can be deleted.
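The three weighting schemes can be compared with a short sketch. The function names are ours; the arctangent weight is oriented so that it decreases in $i$, as required of the $\phi_i^{[k]}$, and the ramp is clipped to $[0, 1]$ outside $|i - k| \le \alpha$, which is an assumption since the text gives the formula without stating its range.

```python
import math


def hard_threshold(i, k):
    """Hard thresholding (10): 1 for i <= k, 0 otherwise."""
    return 1.0 if i <= k else 0.0


def atan_threshold(i, k, alpha):
    """Smooth step, decreasing in i, equal to 1/2 at i = k."""
    return 0.5 - math.atan(alpha * (i - k)) / math.pi


def ramp_threshold(i, k, alpha):
    """Linear ramp (k + alpha - i) / (2 alpha); clipping to [0, 1] is an assumption."""
    return min(1.0, max(0.0, (k + alpha - i) / (2 * alpha)))


n, k, alpha = 10, 4, 2   # illustrative values with alpha < min(k, n - k)
print([hard_threshold(i, k) for i in range(1, n + 1)])
print([ramp_threshold(i, k, alpha) for i in range(1, n + 1)])
# ramp: [1.0, 1.0, 0.75, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
print([round(atan_threshold(i, k, alpha), 3) for i in range(1, n + 1)])
```

Larger values of `alpha` make the arctangent weight steeper but the ramp flatter, so the parameter plays a different role in the two functions.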

Essentially, the choice of $k$ is a further balancing of the trade-off between bias and variance of the filter. The trend in the interior is made smoother without appreciably increasing the bias. There are several options for choosing $k$. One of them consists of selecting $k$, or equivalently $\xi_k$, so as to minimise the distance between the eigenvalue distribution of $z$ and that of the ideal low-pass filter having the first $k$ eigenvalues equal to one and the last $n - k$ equal to zero. In other words, we look for $k$ such that

$$f(k) = \|i^{(k)} - \xi\|_2 \tag{11}$$

is minimum, where $i^{(k)}$ is an $n$-dimensional vector with the first $k$ coordinates equal to one and the remaining ones equal to zero, whereas $\xi = [\xi_1, \xi_2, \dots, \xi_n]'$.

Squaring (11) and dropping the term $\sum_{i=1}^{n} \xi_i^2$, which does not depend on $k$, the function to be minimised is $f(k) = \sum_{i=1}^{k} \left[(1 - \xi_i)^2 - \xi_i^2\right]$. It can be written as $f(k) = f(k-1) + (1 - \xi_k)^2 - \xi_k^2 = f(k-1) + (1 - 2\xi_k)$ and, the $\xi_k$ being in non-increasing order, it therefore reaches its minimum for $\xi_k = 0.5$.
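The recursion gives a direct way to locate the minimising $k$. The following sketch uses a made-up vector of eigenvalue moduli; the function name and the data are illustrative assumptions.

```python
def cutoff_index(xi):
    """Return k minimising sum_{i<=k} (1 - xi_i)^2 + sum_{i>k} xi_i^2.

    xi holds the eigenvalue moduli in non-increasing order (xi[0] is xi_1).
    """
    f = sum(x * x for x in xi)        # k = 0: no eigenvalue is kept
    best_k, best_f = 0, f
    for k in range(1, len(xi) + 1):
        f += 1 - 2 * xi[k - 1]        # f(k) = f(k-1) + (1 - 2 xi_k)
        if f < best_f:
            best_k, best_f = k, f
    return best_k


xi = [1.0, 0.95, 0.8, 0.6, 0.4, 0.1, 0.0]   # illustrative eigenvalue moduli
print(cutoff_index(xi))   # 4: xi_4 = 0.6 is the last value above 0.5
```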

This strategy is equivalent to finding the cut-off frequency that minimises the distance between the transfer functions of the symmetric filter and of the ideal low-pass filter,

$$\int_{-\pi}^{\pi} |I(\nu) - z(\nu)|^2 \, d\nu,$$

where $I(\nu) = 1$ for $\nu < \frac{2\pi}{p}$ and $I(\nu) = 0$ otherwise. The equivalence is based on the relation between time and frequency domain. In fact, for a fixed cut-off frequency $\nu = \nu_k$, the cut-off time $k = \frac{\nu_k n}{\pi}$ is obtained with a precision that increases with $n$. For instance, if we are given monthly data and are interested in removing 10-month cycles that can be wrongly interpreted as turning points of the trend curve, then we may replace by zeros all the eigenvalues smaller than $\xi_k$ with $k = \frac{2n}{10}$.
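As a worked instance of the relation between cut-off frequency and cut-off index, assume a hypothetical sample of $n = 120$ monthly observations (a length chosen here only for illustration) and the 10-month period considered above:

```latex
\nu_k = \frac{2\pi}{10}, \qquad
k = \frac{\nu_k \, n}{\pi} = \frac{2\pi}{10} \cdot \frac{120}{\pi} = \frac{2 \cdot 120}{10} = 24 .
```

In this illustration the threshold would therefore fall at the 24th eigenvalue.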

Finally, we would like to remark that whenever the interest is in the smoothness of the new estimator rather than in the exact value of $k$, a graphical inspection method may be appropriate. Having plotted the eigenvalue distribution, a suitable cut-off eigenvalue may be directly viewed. If the choice of $k$ is not related to a formal inferential procedure (e.g. restrictions on the bias) this method may work well, as the following example shows.

4.1. Simulated example: trend estimation

We consider the simulated series $y = \mu + \varepsilon$, where $\mu \in \mathbb{R}^{101}$ is the uniform sampling of the function $3x^2 + 2\sin(12x)$, $x \in [0, 1.5]$, and $\varepsilon \in \mathbb{R}^{101}$ has random, normally distributed entries. Figure 3 shows $y$ with the trend estimates obtained by applying $S$, associated with the 13-term local quadratic regression filter with GAF for the extremes, and $S_n^{[k]} = V \Phi^{[k]} \Lambda V^{-1}$ with $\phi_i^{[k]}$ defined in (10), to get $m$ and $m^{(k)}$, for $k = 5, 10, 15$. The choice of $k = 10$ is motivated by the following graphical argument. Figure 4 shows the components of $V^{-1}y$ compared to those of $V^{-1}\mu$. The latter components decrease fast as the index increases. When the noise is considered, the high-frequency components of $V^{-1}y$ are emphasised as well. A good choice of $k$ should allow one to separate the pure trend from the noisy components.

Fig. 4 shows that the first ten components of $V^{-1}y$ are about equal to those of $V^{-1}\mu$, which suggests taking $k = 10$. This is however the ideal case when $\mu$ is known. The graph illustrates that the first ten components of $V^{-1}y$ have a decreasing behavior, thus leading to conclude that they are not affected by noise. From the tenth component on, the modulus of each coefficient seems to stabilise around the noise level. Such a criterion can therefore be used to choose the value of the parameter $k$.

[Figure 3: Series (solid line), trend (light solid line), $m$ (dashed line), $m^{(k)}$ (dotted line) for (a): $k = 10$ optimum, (b): $k = 5$, (c): $k = 15$. Legend in each panel: $\mu$, $y$, $Sy$, $S^{[k]}y$.]

[Figure 4: Components of $V^{-1}y$ (stars) and coefficients of $V^{-1}\mu$ (circles), on a logarithmic scale against the component index.]

[Figure 5: Series (solid line), trend (light solid line), $m$ (dashed line), $m^{(k)}$ (dotted line) for (a): $k = 10$ optimum, (b): $k = 5$, (c): $k = 15$. Legend in each panel: $\mu$, $y$, $Sy$, $S^{[k]}y$.]

[Figure 6: Trend (circles), $m$ (dotted line), $m^{(k)}$ (solid line) for $k = 10$. (a): WN errors. (b): AR(1) errors.]

To assess the sensitivity of the method when dependent data are considered, we have run the simulations for the same trend as above but with errors following a zero-mean Gaussian autoregressive process of order 1 (AR(1) process), $\varepsilon_t = 0.8\varepsilon_{t-1} + \eta_t$, where $\eta$ is a zero-mean Gaussian process. The results are illustrated in Figure 5, which shows 100 trend estimates obtained by hard thresholding ($k = 5, 10, 15$). It is evident that for $k = 10, 15$ the estimates are slightly affected by the persistence of the error term, though the overall results remain satisfactory. To further emphasise this point, Figure 6 shows the different trend estimates obtained in the case when the error is uncorrelated (a) and when there is dependence (b). The increase of variability in the second case is evident.

Remark 4. From the computational viewpoint, the spectral decomposition of $S$, as well as the computation of $m$, requires $O(n^3)$ operations. Once the latter is obtained, the estimation of $m^{(k)}$ requires just $O(n^2)$ arithmetic operations. It is therefore possible to graphically evaluate $k$ by comparing plots based on a grid of values of $k$ without increasing the overall computational cost.

We conclude this section by emphasising a relevant point: since the spectral decomposition of $S$ can be computationally intensive, the idea which follows from Section 3.2 is that of replacing $S$ with its approximation belonging to the reflective or antireflective algebra. Such an approximation enables the use of an algorithm that requires only $O(n \log n)$ real operations.
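A minimal sketch of hard thresholding in the reflective algebra is given below. It assumes that the eigenvalues are ordered by frequency $x_i = (i-1)\pi/n$, uses explicit cosine vectors (so it costs $O(n^2)$ per retained component sum rather than the $O(n \log n)$ of a fast transform), and its function names, weights and data are illustrative.

```python
import math


def symbol(w, x):
    h = (len(w) - 1) // 2
    return w[h] + 2 * sum(w[h + j] * math.cos(j * x) for j in range(1, h + 1))


def cosine_vector(i, n):
    """Orthonormal cosine vector of frequency i*pi/n (i = 0, ..., n-1)."""
    scale = math.sqrt((1 if i == 0 else 2) / n)
    return [scale * math.cos(i * (2 * j - 1) * math.pi / (2 * n)) for j in range(1, n + 1)]


def thresholded_reflective_smooth(y, w, k):
    """Hard-thresholded estimate: keep the k lowest-frequency spectral components."""
    n = len(y)
    m = [0.0] * n
    for i in range(min(k, n)):
        u = cosine_vector(i, n)
        # eigenvalue z(x_i) times the coefficient of y along u
        coeff = symbol(w, i * math.pi / n) * sum(a * b for a, b in zip(u, y))
        for j in range(n):
            m[j] += coeff * u[j]
    return m


y = [1.0, 2.0, 4.0, 7.0, 3.0, 5.0]    # illustrative series
w = [0.25, 0.5, 0.25]                 # illustrative 3-term filter
print(thresholded_reflective_smooth(y, w, 2))   # keeps the two smoothest components
```

With $k = n$ the function reproduces the plain reflective smoother, so a grid of values of $k$ can be compared against the unthresholded estimate.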

4.2. Simulated example: band–pass filtering

The same approach used for smoother-trend estimation can be applied to band-pass filtering. In that case, the thresholding matrix $\Phi^{[k]}$ selects the eigenvalues corresponding to a fixed frequency band.

As an example, let us consider the simulated (monthly) series represented in the top-left panel of Figure 7, that is the realisation of a sinusoidal process with two periodic components of period 6 and 12, respectively. The periodogram of the series against the frequency in radians (top-right panel) is not immediate to interpret, due to the high level of noise that contaminates the series. In particular, while the peak corresponding to the 6-month period can be detected, this is not the case for the seasonal frequency $\omega = 2\pi/12$. Given a local linear regression filter, $w$, we construct the associated filtering matrix with reflective boundary conditions. Then, we diagonalise $S = UDU'$ and construct $S^{[k]} = UD_kU'$, where $D_k$ preserves only the $k$-th eigenvalue, with $k = 2n/12$, which is the eigenvalue corresponding to the seasonal frequency ($\phi_i^{[k]} = \delta_{ik}$, with $k = 2n/12$). The bottom-left panel represents the filtered series; the periodogram of the filtered series is plotted in the bottom-right panel. Figure 8 shows the gain function of the reflective approximation of a local linear regression filter of amplitude $\omega$ and different cut-off.

[Figure 7: Example on the filtering of one periodic component. Panels: "Simulated sinusoidal process" (value of the process against years), "Periodogram of the series" (periodogram against frequencies), "Smoothed sinusoidal process" (value of the process against years), "Periodogram of the series" (periodogram against frequencies).]

[Figure 8: Gain function of filters with different h and cut-off. Four panels of amplitude gain against frequency, the first titled "Amplitude gain function with Henderson filter (h=3)"; legend in each panel: threshold = 17, threshold = 83, threshold = 158, threshold = 309.]
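The single-eigenvalue selection $\phi_i^{[k]} = \delta_{ik}$ can be sketched as a projection onto one cosine vector. In the sketch below the index $k$ is 0-based, so that it corresponds to the frequency $k\pi/n$; the function name, the 3-term moving average and the synthetic series are illustrative assumptions rather than the filter and data used for Figure 7.

```python
import math


def band_pass_component(y, w, k):
    """Compute U D_k U' y, where D_k keeps only the eigenvalue of frequency k*pi/n."""
    n = len(y)
    scale = math.sqrt((1 if k == 0 else 2) / n)
    u = [scale * math.cos(k * (2 * j - 1) * math.pi / (2 * n)) for j in range(1, n + 1)]
    h = (len(w) - 1) // 2
    lam = w[h] + 2 * sum(w[h + j] * math.cos(j * k * math.pi / n) for j in range(1, h + 1))
    coeff = lam * sum(a * b for a, b in zip(u, y))
    return [coeff * uj for uj in u]


n = 24
k = 2 * n // 12                        # seasonal frequency 2*pi/12 = k*pi/n
w = [1 / 3, 1 / 3, 1 / 3]              # illustrative 3-term moving average
seasonal = [math.cos(k * (2 * j - 1) * math.pi / (2 * n)) for j in range(1, n + 1)]
y = [5.0 + s for s in seasonal]        # constant level plus a seasonal cosine
out = band_pass_component(y, w, k)
print(max(abs(o - (1 / 3 + 2 / 3 * math.cos(math.pi / 6)) * s) for o, s in zip(out, seasonal)))
```

The constant level is removed entirely, and the seasonal cosine is returned scaled by the gain of the filter at the seasonal frequency.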

5. Empirical analysis

In this section we examine the monthly beer production in Australia, measured in megalitres on a logarithmic scale, in the time interval January 1956–August 1995 (source: Australian Bureau of Statistics). This series serves our illustrative purposes, since it is strongly seasonal and non-stationary, in particular in mean, as we can see from the plot in panel (a) of Figure 9.

We have initially smoothed the series with a reflective approximation of a 3-term local quintic regression filter that preserves polynomials of the 5th degree; the graphical representation of the smoothed series and the amplitude gain function of the filter are represented in panels (a) and (b) of Figure 9, respectively. It is clear from the graphs that this is not a low-pass filter: indeed, the smoothed series is almost equal to the original one and the gain function decreases slowly.

We now apply the filter design method introduced in Section 4, with the aim of constructing new filters starting from this one. First of all, we construct a trend filter: from Figure 9 (d) we see that, choosing $k = 12$, we cut the frequencies greater than 0.8, and this allows us to cut the 12-term seasonality as well as shorter periodicities. In this way we get a filter with the gain function represented by the black line in the upper-left panel of Figure 11.

In the same way, we construct a filter for removing the six-month periodicity. The gain functions of the filters constructed in this way are represented in the upper-left panel of Figure 11. The upper-right panel of the figure represents the (central part of the) series smoothed with the filters for the 6-period and 12-period seasonality and, finally, the second row of the figure represents the residuals of the series without trend and seasonality and the related periodogram, which shows that the residuals do not have these components.

[Figure 9: (a) Series and initial smoothing. (b) Gain function of the filter. (c) Smoothed series. (d) Coefficients of $V^{-1}y$. Panel (a), "Series of the monthly production of beer in Australia": log of megalitres against years (Jan 1956 - Aug 1995); legend: Series, Smoothed series. Panel (b), "Amplitude gain function of w": gain against frequencies. Panel (c) legend: serie, Filter, Trend.]

Next, we compare the results obtained for the same filter, but with reflective and antireflective boundary conditions. In these cases, we recall that the spectral decomposition of RF and AF can be computed in $O(n \log n)$ operations, while that of the GAF filter requires $O(n^3)$ arithmetic operations.

In Figure 10 (a), we note that both AF and RF produce a modified Skwhich gives the same results of S in the middle part of the series. Indeed, aslightly better approximation to S is given by Sk when AF is used, see Figure2. Concerning the approximation at the boundaries, we notice that AF ismuch more reactive than RF and so is GAF. This is evidenced in Figure10 (b), where one may note that AF exactly reconstructs the last pointof the series. On the other hand the RF gives a lesser weight to the lastobservation thus resulting in a smoother trend estimate also at the boundary

[Figure 10. Recoverable legend: series, GAF, reflective and antireflective filtered series. Recoverable panel titles: "Henderson symmetric and DAF weights", "Henderson symmetric and DAG weights", "Henderson symmetric and AR weights", "Henderson symmetric and R weights", each with curves for q = 0, ..., 6.]

Figure 10: (a) Filtered series for S_k, RF and AF. (b) Examples of truncation

of the series. In conclusion, for a better approximation of S, antireflective filters are recommended. If the interest is in current estimation, a more conservative filter such as RF is appropriate.
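The different behaviour of AF and RF at the end of the series can be checked with a minimal Python sketch. It assumes that the reflective filter extends the series by y_{n+j} = y_{n+1-j} and the antireflective filter by y_{n+j} = 2 y_n - y_{n-j}, with symmetric weights summing to one; the function name, the toy weights and the random input are hypothetical. With the antireflective extension the terms y_{n-j} cancel in the last row, leaving (w_0 + 2 * sum_j w_j) y_n = y_n.

```python
import numpy as np


def smooth_with_bc(y, w, bc):
    """Apply symmetric weights w to y after extending y at both ends."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    m = (len(w) - 1) // 2
    if bc == "reflective":
        # even, half-sample extension: ..., y_2, y_1 | y_1, y_2, ...
        ext = np.pad(y, m, mode="symmetric")
    elif bc == "antireflective":
        # odd extension about the end points: 2*y_n - y_{n-j}
        ext = np.pad(y, m, mode="reflect", reflect_type="odd")
    else:
        raise ValueError("bc must be 'reflective' or 'antireflective'")
    # Reversing w turns the convolution into a weighted moving average.
    return np.convolve(ext, w[::-1], mode="valid")


w = np.array([1.0, 2.0, 3.0, 2.0, 1.0]) / 9.0   # toy symmetric weights
rng = np.random.default_rng(0)
y = rng.normal(size=30)                          # synthetic data

af = smooth_with_bc(y, w, "antireflective")
rf = smooth_with_bc(y, w, "reflective")

# Weight given to the last observation in the last row of each filter.
impulse = np.zeros(30)
impulse[-1] = 1.0
af_last_weight = smooth_with_bc(impulse, w, "antireflective")[-1]
rf_last_weight = smooth_with_bc(impulse, w, "reflective")[-1]
```

For these toy weights the reflective filter puts weight w_0 + w_1 on the last observation, whereas the antireflective filter puts weight one on it, which is the sense in which the real time antireflective filter is concentrated on the last available observation.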

As already remarked, the amplitude gain of the filters obtained by approximation is represented in Figure 11. The same figure shows the smoothed seasonal component and the residuals of the series without trend and seasonality, along with their periodogram. The approximated estimators perform their tasks in a satisfactory manner, since no evidence of seasonality can be seen in the periodogram.

6. Concluding remarks

Two classes of asymmetric filters for current trend estimation were introduced, the reflective and antireflective filters, which approximate a given symmetric weighted average. The resulting smoothing matrix has an analytically known spectral decomposition and can be used both as a smoothing matrix itself and as an approximating matrix when different asymmetric filters are considered.

The problem of approximating the matrices associated with trend filters for current analysis by matrices belonging to the circulant or reflective algebra was considered in [15]. The present paper has generalized the approach taken there in two directions. On the one hand, a numerical evaluation of the size of the approximation is provided; on the other hand, the antireflective algebra was introduced. The latter possesses a continuity property which

[Figure 11, four panels. "Amplitude gain function of the seasonality filters": gain against frequencies, with thresholds 153-163, 74-84 and 469-475. "Estimated seasonalities with truncated reflective filters": 12-period and 6-period components, log of monthly megalitres produced against years. "Plot of the residual": log of monthly megalitres produced against years. "Plot of the periodogram of residuals": against frequencies.]

Figure 11: Gain function and analysis with seasonality filters

does not characterize the reflective algebra and proves to be more useful for spectral approximation purposes. Conversely, the reflective filters have better statistical properties for direct estimation of the trend at the boundaries of the series. In fact, the real time antireflective filter is concentrated on the last available observation. As practical advice, we recommend the use of reflective filters for estimation purposes and the use of antireflective filters for approximation purposes.

The paper has also contributed to the literature on smoothing matrices by deriving the asymptotic properties of eigenvalues and eigenvectors of sequences of generic smoothing matrices. The relevance of these results lies in the fact that, in general, smoothing matrices that do not belong to the reflective or antireflective (and circulant) algebras have eigenvalues which are neither analytically known nor even real. From a practical viewpoint, the knowledge of the eigenvalues of a smoothing matrix has enabled a time domain design of a filter that may be used for band-pass filtering or for improving the smoothing properties of a given estimator.

References

[1] A. Aricò, M. Donatelli, J. Nagy, and S. Serra-Capizzano. The anti-reflective transform and regularization by filtering. In Numerical Linear Algebra in Signals, Systems, and Control, volume 80 of Lecture Notes in Electrical Engineering, pages 1–21. Springer Verlag, Berlin, 2011.

[2] A. Aricò, M. Donatelli, and S. Serra-Capizzano. Spectral analysis of the anti-reflective algebra. Linear Algebra Appl., 428(2-3):657–675, 2008.

[3] R. Bhatia. Matrix analysis, volume 169 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1997.

[4] E.B. Dagum and A. Luati. A cascade linear filter to reduce revisions and turning points for real time trend-cycle estimation. Econometric Reviews, 28:40–59, 2009.

[5] E.B. Dagum and A. Luati. Asymmetric filters for trend-cycle estimation. In W.R. Bell, S.H. Holan, and T.S. McElroy, editors, Economic Time Series: Modeling and Seasonality, pages 213–230. Chapman & Hall/CRC, Boca Raton, FL, 2012.

[6] D.F. Findley, B.C. Monsell, W.R. Bell, M.C. Otto, and B. Chen. New capabilities and methods of the X-12-ARIMA seasonal adjustment program. Journal of Business and Economic Statistics, 16:127–176, 1998.

[7] L. Golinskii and S. Serra-Capizzano. The asymptotic properties of the spectrum of nonsymmetrically perturbed Jacobi matrix sequences. J. Approx. Theory, 144(1):84–102, 2007.

[8] A.G. Gray and P.J. Thomson. Surrogate Henderson filters in X-11. Australian and New Zealand Journal of Statistics, 43:385–392, 2001.

[9] A.G. Gray and P.J. Thomson. On a family of finite moving-average trend filters for the ends of series. Journal of Forecasting, 21:125–149, 2002.

[10] P.C. Hansen, J.G. Nagy, and D.P. O'Leary. Deblurring images: Matrices, spectra, and filtering, volume 3 of Fundamentals of Algorithms. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2006.

[11] R. Henderson. Note on graduation by adjusted average. Transactions of the Actuarial Society of America, 17:43–48, 1916.

[12] J. Fan and I. Gijbels. Local Polynomial Modelling and its Applications. Monographs on Statistics and Applied Probability. Chapman and Hall, New York, 1996.

[13] D. Ladiray and B. Quenneville. Seasonal adjustment with the X-11 method. Lecture Notes in Statistics, Springer-Verlag, New York:127–176, 2001.

[14] C. Loader. Local regression and likelihood. Statistics and Computing. Springer-Verlag, New York, 1999.

[15] A. Luati and T. Proietti. On the spectral properties of matrices associated with trend filters. Econometric Theory, 26:321–354, 2010.

[16] D.S.G. Pollock. Handbook of Time Series Analysis, Signal Processing and Dynamics. Signal Processing and its Applications. Academic Press, London, 1999.

[17] T. Proietti and A. Luati. Real time estimation in local polynomial regression, with applications to trend-cycle analysis. Annals of Applied Statistics, 2:1523–1553, 2008.

[18] D. Ruppert, M.P. Wand, and R.J. Carroll. Semiparametric regression, volume 12 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2003.

[19] S. Serra-Capizzano. A note on antireflective boundary conditions and fast deblurring models. SIAM J. Sci. Comput., 25(4):1307–1325 (electronic), 2003/04.

[20] M.P. Wand and M.C. Jones. Kernel Smoothing. Monographs on Statistics and Applied Probability. Chapman and Hall, New York, 1995.
