System-Equation ADL Test for Threshold ... - Miami University · Miami University . 2013 . Working Paper # - 2013-05. System-Equation ADL Test For Threshold Cointegration With an

Department of Economics

Working Paper

System-Equation ADL Test for Threshold Cointegration

with an Application to the Term Structure of Interest Rates

Jing Li

Miami University

2013

Working Paper # - 2013-05

System-Equation ADL Test For Threshold Cointegration With an

Application to the Term Structure Of Interest Rates

Jing Li∗

Miami University

Abstract

This paper proposes a new system-equation test for threshold cointegration based

on a threshold vector autoregressive distributed lag (ADL) model. The new test can be

applied when the cointegrating vector is unknown and when weak exogeneity fails. The

asymptotic null distribution of the new test is derived, critical values are tabulated,

and finite-sample properties are examined. In particular, the new test is shown to have

good size, so the bootstrap is not required. The new test is illustrated using the long

term and short term interest rates. We show that the system-equation model can shed

light on both asymmetric adjustment speeds and asymmetric adjustment roles. The

latter is unavailable in the Enders-Siklos single-equation testing strategy.

Keywords:

Autoregressive Distributed Lag Model; Simultaneous Equation Model; Asymmetry;

Weak Exogeneity

JEL Classification: C22, C12, C13

∗Jing Li, Department of Economics, Miami University, Oxford, OH 45056, USA. Phone: 001.513.529.4393,Fax: 001.513.529.6992, Email: [email protected].

1

Introduction

Some economic theories imply that variables are related only within certain regime. For

instance, according to the modified law of one price, spatial price linkage exists only in the

regime where price differential exceeds transaction cost. Outside that regime the arbitrage

becomes non-profitable, so the price linkage breaks. Empirical findings are provided in

Giovanini (1988), Taylor and Peel (2000), Sephton (2003) and Jang et al. (2007), among

others. This type of asymmetric adjustment toward equilibrium is the key idea of the

threshold cointegration introduced by Balke and Fomby (1997). In practice the interest

is often on testing threshold cointegration, namely, the existence of long run relationship

with regime-dependent error-correcting speeds.

This paper makes contribution to the literature by proposing a new system-equation test

for the threshold cointegration based on the autoregressive distributed lag (ADL) model.

The new test improves existing tests as follows. Enders and Siklos (2001) (ES) first propose

an Engle-Granger type residual based single-equation test for the threshold cointegration.

The ES test can capture only asymmetric adjustment speeds. In contrast, the new ADL test

is based on a system-equation model, so can accommodate not only asymmetric speeds, but

also asymmetric roles. By asymmetric roles, we mean some variables have error-correcting

speeds significantly different from other variables. The asymmetric roles are illustrated in

the application of the new test.

Next Seo (2006) develops a system-equation test using a generalized error correction

model (ECM). The ECM test requires predetermined cointegrating vectors. But in practice

the cointegrating vector can be misspecified. To make it worse, many economic theories

are not numerically specific, resulting in unknown cointegrating vectors. That means the

application of the ECM test can be limited. The new ADL test is not as restrictive, and can

be applied even when the cointegrating vector is unknown. This improvement is attributed

to the fact that in the ADL model the lagged regressand, rather than the error correction

2

term, is used as the regressor. Using the lagged regressand has another advantage. The ADL

test is shown to be insensitive to a poorly-estimated or misspecified cointegrating vector.

The ADL test conducts a more efficient grid search for the threshold value than the

ECM test. Both tests use the error correction term as the threshold variable, which is

nonstationary under the null hypothesis of no cointegration. However, the ECM test searches

the unknown threshold parameter over a range of fixed values. Because the probability of

a nonstationary series being less than a fixed value approaches zero asymptotically, the

threshold parameter is absent in the limiting distribution of the ECM test. This implies

that the information obtained in the grid search is asymptotically unexploited. In contrast,

the ADL test treats a percentile of the empirical distribution of the error correction term

as the threshold parameter. As a result, the grid search of the ADL test is over a range of

percentiles, rather than a range of fixed values. By doing so, the searching result is fully

utilized regardless of the sample size. In the end the limiting distribution of the ADL test can

be expressed as functionals of the two-parameter Brownian motion; the threshold parameter

is one of the two parameters.

Another improvement over the ECM test is that the ADL test allows for a nonzero

deterministic component in the series. By adding the trend in the testing regression, the ADL

test is able to distinguish difference stationarity from trend stationarity. This generalization

is important given that many economic series are trending. Following Sims et al. (1990), we

develop the asymptotic theory based on the canonical form of the trend-augmented model.

A single-equation ADL test is considered in Li and Lee (2010). The single-equation test

is easy to compute. However, the validity of that test hinges on the assumption of weak

exogeneity, i.e., only one variable is error-correcting; others are not. See Boswijk (1994)

for more discussion about weak exogeneity. Weak exogeneity becomes questionable when

a system is of high dimension. By assuming away weak exogeneity the new test is less

restrictive than the single-equation ADL test.

3

Finally, the new test extends the linear ADL cointegration test of Banerjee et al. (1986).

The linear test assumes regime-invariant error-correcting speeds. Early works such as Pip-

penger and Goering (1993) have shown the linear test is prone to losing power in the presence

of regime switching. The power loss does not occur to the new test because it explicitly ac-

counts for asymmetric adjustment speeds.

Testing for the threshold cointegration is nonstandard because the threshold parameter

cannot be identified under the null hypothesis. Following Hansen (1996) we adopt the sup

Wald statistic to address the so called Davies’ problem, c.f., Davies (1977, 1987). The

limiting null distribution of the new test is derived as functionals of the two-parameter

Brownian motion. The new distribution extends Horvath and Watson (1995) to threshold

processes.

We conduct Monte Carlo experiments to compare the finite-sample performance. Re-

markably, the simulation shows that the size distortion of the new test is much smaller than

the ECM test. Thus the bootstrap is not required for the new test, a good message for

practitioners. The simulation also shows misspecification in the error correction term can

lead to deterioration in the power of the ECM test, but not the ADL test. We provide an ap-

plication of the new test, in which the relationship of U.S. short term and long term interest

rates is investigated. The focal point is the asymmetry in the process of error correction.

Throughout the paper 1( .) denotes the indicator function, | .| the Euclidean norm, [x]

the integer part of x, ⇒ the weak convergence with respect to the uniform metric on [0, 1]2.

All mathematical proofs are in the appendix.

Threshold Vector ADL Model

Consider a two-regime threshold vector autoregressive distributed lag model (TVADLM)

α(L)∆yt = (c11 + c12t+ δ1yt−1)11t−1 + (c21 + c22t+ δ2yt−1)12t−1 + et, (1)

4

where yt = (y1t, . . . , ynt)′ is an n × 1 data vector from a sample of size T, ∆yt is the first

difference, et is an i.i.d n × 1 vector of error terms, and δ1 and δ2 are n × n matrixes.

Intercepts and trends are included in (1), so the model can be applied to trending series. The

deterministic terms are regime-dependent in order to account for the possible discontinuity at

the threshold parameter. The short-run dynamics is represented by α(L) ≡ In−∑p

j=1 αjLj,

a p-th order matrix polynomial in the lag operator L. In practice choosing p can be guided

by information criteria of AIC and BIC.

Economic factors such as transaction cost and policy intervention can lead to threshold

behavior (regime-switching) in an economic system. Accordingly, two indicators are defined

as

11t−1 ≡ 1(zt−1 < τ), 12t−1 ≡ 1(zt−1 ≥ τ). (2)

In words, in regime one the value of the lagged threshold variable, zt−1, is less than the

threshold parameter τ, whereas greater than or equal to τ in regime two. Alternative specifi-

cations are possible. For example, zt−1 can be replaced with zt−d, where the delay parameter

d can be obtained by the grid search. A three-regime model can be specified with three

indicators: 11t−1 = 1(zt−1 < τ1), 12t−1 = 1(τ2 ≤ zt−1 ≤ τ2) and 13t−1 = 1(zt−1 > τ2). Testing

the number of regimes is an unsolved issue. This paper focuses on the two-regime model (1)

for ease of exposition. To develop the asymptotic theory, we assume

Assumption 1 (a) zt−1 has a marginal uniform distribution, zt−1 ∼ U(0, 1). (b) et is i.i.d,

Eet = 0, Eete′t = Ω, and E|et|4 < ∞. (c) All roots of |α(L)| = 0 are outside the unit circle.

Assumption 1-(c) is typically imposed in the literature. Assumption 1-(a) and (b) are made

so that the following weak convergence holds. Under Assumption 1-(a) and (b), as T → ∞,

T−1/2

[Tr]∑t=1

11t−1et ⇒ PW (r, τ), (3)

5

where W (r, τ) denotes the multivariate version of the two-parameter Brownian motion

(TPBM) introduced by Caner and Hansen (2001). When τ = 1, W (r, 1) becomes the

standard multivariate one-parameter Brownian motion. Matrix P is the Cholesky factor of

Ω, i.e., Ω = PP ′. Basically (3) is a simple extension of Theorem 1 of Caner and Hansen

(2001) to the multivariate case. Because the TPBM depends on the threshold parameter

τ, searching that parameter is relevant for our asymptotic theory, which is not the case for

the ECM test. This difference is due to the choice of the searching range for the threshold

parameter. To see this, let γ denote the n× 1 cointegrating vector and

xt−1 ≡ γ′yt−1 (4)

denote the lagged error correction term. The literature usually lets the regime-switching

be governed by xt−1 since it measures the deviation from the equilibrium. The ECM test

grid searches the threshold value over a range of fixed values, despite the fact that xt−1

is integrated of order one or nonstationary under the null hypothesis of no cointegration.

The outcome is undesirable: the searching result is lost asymptotically and the threshold

parameter vanishes in the limiting distribution, see Theorem 2 of Seo (2006). Furthermore,

severe size distortion is reported for the ECM test, and the bootstrap is recommended as a

remedy.

In light of Assumption 1-(a) the ADL test defines the threshold variable as

zt−1 ≡ cdf(xt−1) (5)

where cdf( .) denotes the empirical distribution function1. By the property of the empir-

ical distribution function, cdf(xt−1) ∼ U(0, 1), so Assumption 1-(a) holds. Since cdf is a

1The empirical distribution function is non-decreasing, and is a step function that jumps for 1/T at eachof the T data points.

6

monotonic transformation, it follows that

1(zt−1 < τ) = 1(cdf(xt−1) < τ) = 1(xt−1 < xt−1(τ)) (6)

where xt−1 is xt−1 sorted in an ascending order, and xt−1(τ) denotes the τth element of

xt−1. In words the threshold parameter τ in this case represents a percentile of the empirical

distribution of the error correction term. The searching range for τ is the parameter space

of percentiles, which is always restricted to lie between 0 and 1. Searching the percentile has

the benefit that the parameter τ always matters, no matter in finite samples or in the limit,

and no matter xt−1 is stationary or not. More discussion about using the percentile as the

threshold value can be found in Li and Lee (2010).

The link between model (1) and the error correction model becomes apparent if we rewrite

δ1 = β1γ′, δ2 = β2γ

′, (7)

where γ is n× 1 cointegrating vector, and β1 and β2 are n× 1 vectors of adjustment speeds

in the two regimes. Here we assume the number of cointegration relationships is at most one

so that γ has column rank of 1. The issue of testing q against q + 1 threshold cointegration

relationships is left to future research. Notice that γ has no subscript in (7). This is because

most economic theories imply regime-invariant long run relationship, which is quantified by

γ. It is β, the error-correcting speed, that varies across regimes in short run. There is also a

technical reason to assume constant γ. There would be two error correction terms with two

gammas. Ambiguity would arise since we are not sure about searching which one of them.

Seo (2006) implicitly imposes regime-invariant γ and one cointegration at maximum.

Factorization (7) generalizes the one used to derive the traditional linear error correction

model. By virtue of (4) and (7), model (1) can be rewritten as a two-regime threshold vector

7

error correction model (TVECM)

α(L)∆yt = (c11 + c12t+ β1xt−1)11t−1 + (c21 + c22t+ β2xt−1)12t−1 + et. (8)

By allowing for β1 = β2, model (8) extends the traditional linear error correction model. In

a similar fashion, model (1) extends the linear ADL model by allowing for δ1 = δ2.

The ECM test of Seo (2006) is concerned with the null hypothesis H0 : β1 = β2 = 0

using a testing regression slightly different from (8). First, the trend term is excluded, so

the ECM test cannot be applied to trending series. Second, the ECM test grid searches τ

over a range of fixed values.

Given that the error correction term xt−1 appears on the right side of (8), the ECM test

requires that the cointegrating vector γ be pre-determined. This limits the application of

the ECM test as γ is unknown in many cases. Some practitioners would apply the ECM test

after estimating γ, but rigorously speaking, the asymptotic theory of Seo (2006) becomes

invalid when γ is estimated. This situation is similar to the Enger-Granger test, which

follows the Dickey Fuller distribution if the cointegrating vector is known, but Phillips-

Ouliaris distribution if the cointegrating vector is estimated, see Phillips and Ouliaris (1990)

for details.

The asymptotic theory for the proposed ADL test is valid no matter γ is known or

estimated. This is partly because the regressor of model (1) involves yt−1, the lagged regres-

sand. The error correction term only has indirect effect on model (1) through the indicator.

Thanks to the indirect role, poor estimation or misspecification of the cointegrating vector

has minimal effect on the ADL test. In contrast, xt−1 appears directly on the right hand size

of (8). As a result, a misspecified cointegrating vector will adversely affect the ECM test

through both the indicator and the regressor. In this regard TVADLM (1) is more robust

to the misspecification than TVECM (8). The subsequent simulation will confirm that.

8

The ES test of Enders and Siklos (2001) is concerned with the null hypothesis H0 : β∗1 =

β∗2 = 0 based on the single regression

α∗(L)∆xt = β∗1xt−111t−1 + β∗

2xt−112t−1 + e∗t . (9)

Regression (9) essentially replaces the regressand ∆y1t (with normalization) in the first equa-

tion of (8) with ∆xt. To justify this replacement the ES test imposes the restriction that the

short-run dynamics of components of yt involves one common factor. Kremers et al. (1992)

call this common factor restriction, in the context of the Engle-Granger testing regression.

The ADL test is based on the system model (1), so is free of the common factor restriction.

It is instructive to highlight that model (1) can shed light on two types of asymmetry.

First, asymmetric adjustment speeds are allowed for by δ1 = δ2. Second, asymmetric adjust-

ment roles can be revealed by the fact that the coefficients of yt−111t−1 and yt−112t−1 in one

regression are different from other regressions. In the extreme case called weak exogeneity,

a variable plays no role of error correction when both coefficients are zero. The univariate

model (9) can illustrate only asymmetric adjustment speeds, not asymmetric adjustment

roles. We will revisit this issue in the application.

System-Equation ADL Test for Threshold Cointegration

This paper proposes a new test for the null hypothesis

H0 : δ1 = δ2 = 0 (10)

based on TVADLM (1), called the system-equation ADL test for threshold cointegration.

Because of (7), testing (10) is equivalent to testing β1 = β2 = 0 in (8). So the ADL test and

ECM test are closely related. Li and Lee (2010) show that testing (10) can be undertaken

9

in a single-equation framework if each of β1 and β2 contains only one non-zero element (i.e.,

only one variable is error-correcting). This condition is called weak exogeneity, which can

be restrictive for a high-dimensional system. The system-equation ADL test is more general

than the single-equation ADL test by assuming away weak exogeneity.

The alternative hypotheses can be

H11 : δ1 = 0, δ2 < 0; H2

1 : δ1 < 0, δ2 = 0; H31 : δ1 < 0, δ2 < 0. (11)

Strictly speaking H11 or H2

1 implies that yt is cointegrated only in one regime. When (10) is

rejected, we conclude that series are cointegrated, but not necessarily in both regimes. The

null hypothesis (10) can also be rejected if yt is linearly cointegrated, i.e., δ1 = δ2 = 0. In this

sense the threshold cointegration test extends the linear cointegration test. Pippenger and

Goering (1993) report that the linear cointegration test can suffer power loss in the presence

of asymmetric adjustment speeds. Here the alternative hypotheses H11 and H2

1 clear show

the asymmetry is allowed for by the threshold cointegration test.

Testing (10) is nonstandard because under (10) the indicators disappear and τ cannot

be identified. Davies (1977, 1987) first raises this issue, called Davies’ problem. We follow

Hansen (1996) and calculate the sup-Wald statistic2 as follows. First, we estimate the cointe-

grating vector γ by OLS if it is unknown, and compute the error correction term (4). Then,

for given τ we define the indicators (2) using (5), fit (1) by OLS, obtain the estimated coeffi-

cients δτ = (δ1(τ), δ2(τ)), the residual et(τ), and the variance matrix Ωτ = T−1∑

et(τ)et(τ)′.

We stack matrices by letting Yt = (yt−111t−1, yt−112t−1), Yτ = [Y1Y2 . . . YT ]′, e = [e1e2 . . . eT ]

′.

Define the annihilation matrix Q ≡ I − Z(Z ′Z)−1Z ′ where Z stacks (∆yt−1, . . . ,∆yt−p, dt).

Similar to the Dickey-Fuller unit root test, the distribution of the ADL test depends on the

specification of the deterministic term dt. There are three cases:

2This paper focuses on the sup test due to its simplicity. See Andrews and Ploberger (1994) for alterna-tives.

10

(Case I, no intercept or trend): c11 = c12 = c21 = c22 = 0 in (1), so dt is null;

(Case II, intercept only): c11 = 0, c21 = 0, c12 = c22 = 0 in (1), so dt = (11t−1, 12t−1);

(Case III, intercept and trend): c11 = 0, c21 = 0, c12 = 0, c22 = 0 in (1), so dt = (11t−1, t11t−1, 12t−1, t12t−1).

Let F (τ), F d(τ) and F t(τ) denote the Wald statistics for cases I, II and III, respectively.

They are given by

F (τ), F d(τ), F t(τ) ≡ [vec(δτ )]′[var(vec(δτ ))

−1][vec(δτ )] (12)

= [vec(δτ )]′[(Y ′

τQYτ )−1 ⊗ Ωτ

]−1

[vec(δτ )] (13)

= [vec(e′QYτ )]′[(Y ′

τQYτ )−1 ⊗ Ω−1

τ

][vec(e′QYτ )] (14)

= trace[Ω−1/2

τ (e′QYτ )(Y′τQYτ )

−1(Y ′τQe)Ω−1/2′

τ

], (15)

where Ω−1/2τ is the Cholesky factor of Ωτ , vec denotes the vec operator and ⊗ the Kronecker

product operator. Note that the definition of the Wald statistics is (12), but in practice we

calculate the Wald statistics using (13). The asymptotic theory is built upon (15). Finally,

we search τ over a range, and the sup Wald test is given by

supτ∈Θ

F (τ), supτ∈Θ

F d(τ), supτ∈Θ

F t(τ), (16)

where the searching range Θ ⊂ [0, 1] is a compact set. Symmetric sets such as Θ = [0.15, 0.85]

are recommended in the literature. Note that some observations are discarded in the grid

search in order to avoid divergent asymptotic distributions.

11

Asymptotic Theory

Let W (r, τ) denote the n-dimensional two-parameter Brownian motion on (r, τ) ∈ [0, 1]2.

For notational economy we write∫ 1

0as∫. By holding τ constant, the integration is over r

for the stochastic integrations such as∫W (r, 1)dW (r, τ)′. Following the literature, we make

an auxiliary assumption for cases I and II.

Assumption 2 Under the null hypothesis (10), yt is generated by (1) with c11 = c12 = c21 =

c22 = 0.

Assumption 2 states that the true process under the null is vector random walk without

drift. The limiting null distributions of the system-equation ADL tests (16) for cases I and

II are

Theorem 1 (Case I) Under (10) and Assumptions 1, 2, as T → ∞,

supτ∈Θ

F (τ) ⇒ supτ∈Θ

trace[J ′0J

−11 J0

], (17)

where

J0 =

∫W (r, 1)dW (r, τ)′∫

W (r, 1)dW (r, 1)′ −∫W (r, 1)dW (r, τ)′

,

J1 =

τ∫W (r, 1)W (r, 1)′dr 0

0 (1− τ)∫W (r, 1)W (r, 1)′dr

.

Theorem 2 (Case II) Under (10) and Assumptions 1, 2, as T → ∞,

supτ∈Θ

F d(τ) ⇒ supτ∈Θ

trace[Jd′

0 (Jd1 )

−1Jd0

], (18)

12

where

Jd0 = J0 −

∫W (r, 1)drW (1, τ)∫

W (r, 1)dr[W (1, 1)−W (1, τ)]

,

Jd1 = J1 −

τ∫W (r, 1)dr

∫W ′(r, 1)dr 0

0 (1− τ)∫W (r, 1)dr

∫W ′(r, 1)dr

.

There are some remarks. First of all, the distributions in (17) and (18) are in terms of

the TPBM, which includes the threshold parameter τ. In contrast Theorem 2 of Seo (2006)

shows that τ is absent in the limiting distribution of the ECM test. Second, the new dis-

tributions depend on n, the dimensional parameter, and Θ, the searching range. Finally,

the distributions are free of nuisance parameters such as the Cholesky factor P and serial

correlation characterized by α(L). Therefore we can tabulate in Table 1 the critical values

of the limiting distributions for various n and Θ. Those critical values are computed as the

empirical quantiles from 5, 000 independent draws from the simulations for the asymptotic

formula in Theorems 1 and 2.

Table 1 shows that the critical value increases as the searching range Θ gets wider. This

is indicative of the divergent behavior of the sup Wald test. The critical value also increases

as the dimensional parameter n rises. Finally, the critical values of supτ∈Θ F (τ) (reported

under column w/o constant in Table 1) are less than those of supτ∈Θ F d(τ) (reported under

column w/ constant). As expected, the intercept term shifts the limiting distribution.

For case III we make the auxiliary assumption that

Assumption 3 Under the null hypothesis (10), yt is generated by (1) with c12 = c22 =

0, c11 = c21 = c0.

Assumption 3 states that under the null hypothesis yt is vector random walk with drift

component of c0. In this case we need to derive the canonical form introduced by Sims et al.

13

(1990) for (1). To see why, notice that under Assumption 3 α(L)∆yt = c0+et. So yt consists

of a stochastic trend and a deterministic trend, and the latter is correlated with the trend

term in the testing regression. Therefore it is necessary to avoid collinearity by removing

the deterministic trend from yt. Toward that end we consider the canonical form given by

α(L)∆yt = (c∗11 + c∗12t+ δ∗1ζt−1)11t−1 + (c∗21 + c∗22t+ δ∗2ζt−1)12t−1 + et. (19)

where ζt−1 = yt−1 − c0(t− 1), δ∗1 = δ1, δ∗2 = δ2, c

∗11 = c11 − c0, c

∗12 = c12 + c0, c

∗21 = c21 − c0,

and c∗22 = c22 + c0. Now the new regressor ζt−1 in (19) contains only the stochastic trend.

The hypothetical model (19) is just the algebra rearrangement of (1), so testing δ1 = δ2 = 0

in (1) is equivalent to testing δ∗1 = δ∗2 = 0 in (19). The computation of the ADL test, denoted

by supτ∈Θ F t(τ) in this case, is still based on (1). The limiting distribution of supτ∈Θ F t(τ)

is derived based on (19), and is given in the following theorem.

Theorem 3 (Case III) Under (10) and Assumptions 1, 3, as T → ∞,

supτ∈Θ

F t(τ) ⇒ supτ∈Θ

trace[J t′

0 (Jt1)

−1Jdt

], (20)

where

J t0 = J0 − C∗

21C−122 C

∗23 − C∗

31C−132 C

∗33,

J t1 = J1 − C∗

21C−122 C

∗′21 − C∗

31C−132 C

∗′31,

C21 =

τ∫W (r, 1)dr τ

∫rW (r, 1)dr

0 0

, C22 =

τ τ∫rdr

τ∫rdr τ

∫r2dr

, C23 =

W (1, τ)∫rdW (r, τ)′

C31 =

0 0

(1− τ)∫W (r, 1)dr (1− τ)

∫rW (r, 1)dr

, C32 =

1− τ (1− τ)∫rdr

(1− τ)∫rdr (1− τ)

∫r2dr

,

14

C33 =

[W (1, 1)−W (1, τ)][∫rdW (r, 1)′ −

∫rdW (r, τ)′

] .

Note that the distribution does not depend on c0, the drift term. The critical values of

the distribution are reported under the column w/trend in Table 1. In practice we can use

supτ∈Θ F t(τ) test to distinguish difference-stationary threshold series from trend-stationary

threshold series. The ECM test cannot do this since it excludes the trend term.

Monte Carlo Experiments

This section carries out the size and power comparisons in finite samples for the proposed

system-equation ADL test, the ECM of Seo (2006) and the ES test of Enders and Siklos

(2001). We first compare the rejection frequency under the null hypothesis (size) at the 5%

nominal level. For tractability, a bivariate data generating process (DGP) is considered:

∆y1t = c1∆y1t−1 + e1t (21)

∆y2t = c2∆y2t−1 + e2t (22)

xt−1 = y1t−1 − γy2t−1 (23)

xt−1 = y1t−1 − γy2t−1 (24) e1t

e2t

∼ iidn(0,Ω), Ω =

1 d1

d1 1

, (25)

where γ = 0.1, and γ is the OLS estimate. The short-run dynamics is controlled by c1 and

c2, and the correlation between the error terms is determined by d1. Their default values are

c1 = 0.1, c2 = 0.5, and d1 = 0.7. When running the simulation, we let one parameter vary

while keeping other parameters at their default values. The null hypothesis (10) is imposed

since neither y1t nor y2t is error-correcting, i.e., the error correction term (23) is absent in

(21) and (22). The number of iteration is 5000, and the sample size is 100. The initial

15

values of y1t and y2t are set as random numbers.

We drop the intercept and include one lag in the testing regressions. The corresponding

critical values at the 5% level are 30.2 for the new ADL test, 7.08 for the ES test (denoted

by Φ∗ in Table 5 of Enders and Siklos (2001)), and 10.968 for the ECM test (denoted by

supW in Table 1 of Seo (2006)). The ADL and ES tests use the estimated error correction

term (24), while the ECM test uses the pre-determined error correction term (23). Panels

A, B, C of Figure 1 plot the size against varying c1, c2 and d1, respectively.

We can summarize Figure 1 as follows. Among the three tests, the size of the ADL

test (marked by circle) is closest to the nominal level 0.05, and is largely invariant to c1,

c2 and d1. This finding verifies that the limiting distribution in Theorem 1 is indeed free of

those nuisance parameters. It also implies that the bootstrap is not needed for the ADL

test. In contrast, a size as large as 0.4 is found for the ECM test (marked by square). Size

distortion of this magnitude is also reported in Table 2 of Seo (2006). Because of the severe

size distortion, the ECM test (marked by triangle) has to be used with the bootstrap. The

size of the ES test is slightly under the nominal level.

Next we compare the rejection frequency under the alternative hypothesis (power). In

light of the size distortion, we focus on the size-adjusted power by using the 5% bootstrap

critical value obtained under the null hypothesis. The new DGP is:

∆y1t = −0.1xt−11(xt−1 < 0) + k1xt−11(xt−1 ≥ 0) + 0.1∆y1t−1 + e1t (26)

∆y2t = 0.5∆y2t−1 + e2t (27)

xt−1 = y1t−1 − (γ + f)y2t−1 (28)

(23) and (25). Now the error correction term (23) enters (26), so y1t is error-correcting and

the null hypothesis is violated. The parameter k1 determines the error-correcting speed in the

second regime. When k1 = −0.1, the system is cointegrated linearly, i.e., the error-correcting

16

speeds become constant.

Because the ADL and ES tests can use either predetermined cointegrating vector or

estimated cointegrating vector, and because the power depends on which cointegrating vector

is used, we need to be careful to make sure the tests are compared on equal footing. So we

denote the ADL and ES tests using the predetermined error correction term (28) by ADLP

(marked by diamond) and ESP (marked by star), and the tests using the estimated error

correction term (24) by ADL and ES. By allowing for f = 0 in (28), the predetermined

error correction term (28) can differ from the true error correction term (23). Then we can

evaluate the effect of misspecified cointegrating vector on the power. Figures 2 plots the

power against k1. In Panel A, f = 0, so the error correction term is correctly pre-specified.

Misspecification is introduced in panel B with f = 0.05, and in Panel C with f = 0.1.

There are two main findings. First, the powers of all tests are minimized when k1 = −0.1.

The powers start to rise as k1 moves toward −0.8. This result makes sense because k1

measures both the degree of asymmetry and strength of error-correction. Second, the powers

of ECM and ESP are sensitive to the misspecification in the cointegrating vector. For

instance, the ECM test has the greatest power in Panel A; but its power becomes the

second lowest in Panel C. In contrast, the power of the ADLP test is almost invariant to

the misspecification. This difference is expected since the misspecified error correction term

has marginal affect on the ADL test only through the indicator. But the same misspecified

error correction term is used as regressor in the ECM and ESP tests, which amplifies the

adversary effect of misspecification.

We need to interpret the power performance of the ECM test with caution. Remember

due to the nature of the Monte Carlo experiment the cointegrating vector (1,−γ) happens

to be known. That is why the ECM test can be applied here. Along with the favorable

assumption that there is no misspecification, then the ECM test is shown to dominate other

tests. In practice the condition of a correctly predetermined cointegrating vector is subject

17

to violation. On the other hand, the power of the proposed ADL test is less than the ECM

test in Panel A. But in panel C their ranking reverses. Overall, the stability of the power

indicates that the ADL test can be a reliable alternative to the ECM test. In terms of

power, the ES test consistently outperforms the ADL test. However, the ES test has the

disadvantage that it captures fewer types of asymmetry than the ADL test, as illustrated by

the next application.

An Application

This section investigates the relationship between U.S. 10-year treasury constant maturity

rate (10ybond, y1t) and the effective federal funds rate (Fedfunds, y2t). The goal is to provide

the latest empirical finding about the asymmetry in the term structure of interest rates. The

monthly data from January 1962 to August 2013 are downloaded from FRED Economic

Data at http://research.stlouisfed.org/fred2/. The sample size is 620.

Panel A of Figure 3 presents the time series plot. Both interest rates appear highly persis-

tent with long swing. With the lag selected by AIC, the augmented Dicky-Fuller t statistics

are -1.97 for Fedfunds and -1.30 for 10ybond. Therefore both series are nonstationary. Panel

A also shows the two series tend to move together over time. Such co-movement is predicted

by economic theories such as the expectation hypothesis. Empirically we are interested in

the long run relationship:

y1t = γ0 + γ1y2t + xt, (29)

where xt is the error correction term. Since γ0 and γ1 are both unknown, this is an example

where the economic theory does not quantify the cointegrating vector. The OLS estimation

of (29) is

xt = y1t − 2.81− 0.68y2t.

So on average the long term rate 10ybond exceeds the short term rate Fedfunds by 2.81

18

when the short term rate equals zero. Panel B of Figure 3 plots the error correction term

xt. The fact that xt shows less persistence or more choppiness than y1t and y2t is suggestive

of cointegration. Actually the Engle-Granger linear cointegration test applied to xt is -3.41,

rejecting the null hypothesis of no cointegration at the 5% level.

The Engle-Granger test assumes constant adjustment speeds. However, intuition tells

us the adjustment speed may be non-constant. More explicitly, the speed may depend on

whether xt, which loosely speaking measures the gap between the two series, is big or small.

Big gap (in absolute value) supposedly leads to fast adjustment.

The key question is, which value of xt will trigger the fast adjustment? Or in other

words, how to define “big” gap? In order to answer that, we grid search the threshold value

τ by minimizing the determinant3 of Ω = T−1∑

ete′t, where et = (e1t, e2t) denotes the OLS

residual of the two-regime threshold vector ADL model:

α1(L)∆y1t = (c1,11 + δ11yt−1)11t−1 + (c1,21 + δ12yt−1)12t−1 + e1t (30)

α2(L)∆y2t = (c2,11 + δ21yt−1)11t−1 + (c2,21 + δ22yt−1)12t−1 + e2t, (31)

where the deterministic term include the intercept. Trend is excluded since Figure 3 shows

no trending pattern. We follow the principle of parsimony and start with one lag (let p = 1)

in α1(L) and α2(L). The model with one lag seems adequate as the Ljung-Box Q test with

one lag finds no serial correlation in e1t and e2t.

The threshold value is estimated as τ = −1.25, and a dash line corresponding to that

value is drawn in Panel B of Figure 3. It seems that below the dash line the error series xt

moves more quickly than above the line. This asymmetry is consistent with our prior belief,

and signifies regime-switching.

Table 2 reports the estimation results for (30) and (31). There are two main findings.

3Minimizing the trace yields similar results.

19

First, asymmetric adjustment speeds are confirmed by the fact that the absolute values of the

coefficients in lower regime (with indicator 11t−1 = 1(zt−1 < −1.25)) are different from those

in upper regime (with indicator 12t−1 = 1(zt−1 ≥ −1.25)). For instance, in equation (31),

the estimated coefficient of y1t−1 is 0.22 and significant at the 10% level in lower regime, but

-0.03 and insignificant in upper regime. To interpret this result, notice that in lower regime

the error correction term is more negative than -1.25. That means the yield curve is likely to

be inverted in lower regime. An inverted yield curve is unusual, so the market reacts quickly.

For instance, in Panel A of Figure 3 we see the Fedfunds tops 10ybond and the yield curve

becomes inverted after the 200-th observation. During that period (from September 1978 to

May 1980) the adjustment speed of Fedfunds is much faster than periods with normal yield

curves.

The second main finding is, the absolute values of some coefficients in (31) are significantly

greater than (30). Take the coefficient for the regressor y2t−111t−1. It is -0.23 and significant

in (31), but -0.03 and insignificant in (30). Recall that the regressands are differenced

Fedfunds in (31) and differenced 10ybond in (30). Hence the second finding indicates that

the two interest rates play asymmetric roles in the adjustment process. In particular, the

short term rate Fedfunds has two big coefficients (0.22 and -0.23), and therefore plays bigger

role than the long term rate 10ybond4. This finding is intuitive. Fedfunds plays bigger

role just because it is a monetary policy instrument frequently manipulated by the Federal

Reserve.

Note that the two main findings are unavailable in the Engle-Granger linear cointegration

model because it pre-excludes asymmetry. The single-equation model used by Enders and

Siklos (2001) can detect only the asymmetric adjustment speeds, but not the asymmetric

roles. This application provides an example showing the system-equation threshold model

is inherently more informative than the single-equation model.

4In (30) the coefficients of y1t−112t−1 and y2t−112t−1 are statistically significant, but close to zero.

20

Finally we test the null hypothesis of no threshold cointegration, i.e., H0 : δ11 = δ12 =

δ21 = δ22 = 0 in (30) and (31), using the proposed system-equation ADL test. The ECM test

of Seo (2006) is inapplicable in this case because γ0 and γ1 in (29) are not pre-determined.

The single-equation ADL test of Li and Lee (2010) is also invalid since both y1t and y2t are

error-correcting and weak exogeneity fails. We let the searching range be Θ = [0.15, 0.85]. By

including the intercept (case II), the ADL test supτ∈Θ F d(τ) equals 128.04, greater than the

5% critical value 42.5. The Enders-Siklos test is 18.62, rejecting no threshold cointegration

as well.

We can summarize the application as follows. First, the long term and short term interest

rates are cointegrated. Second, there is evidence that fast adjustment occurs when the yield

curve is inverted. Finally, the short term rate plays bigger adjustment role than the long

term rate.

Conclusion

This paper develops a new system-equation ADL test for the threshold cointegration. Using

the new test we can resolve some problems found in the existing tests. In particular, the

new test can be used when the cointegrating vector is unknown and when weak exogeneity

fails. The new test can be applied to trending series, and the grid-searching information is

efficiently utilized by the new test irrespective of sample sizes.

The limiting null distribution of the new test can be tabulated since no nuisance param-

eters are involved. Monte Carlo simulations show that the size distortion of the new test

is negligible, so the bootstrap is not needed. The power of the new test is comparable to

the existing tests. We provide an application of the new test, in which the long run equilib-

rium between long term and short term interest rates is found and twofold asymmetries are

detected. Noticeably, statistical evidence is provided for the asymmetric adjustment roles;

such evidence is eluded in the single-equation model.

21

Appendix: Mathematical Proof

The asymptotic theory is based on the following lemma. Suppose et is an i.i.d (n×1) sequence

with mean zero, finite fourth moments, and Eete′t = Ω = PP ′ where P is the Cholesky

factor. Let ut = D(L)et denote a stationary linear vector sequence with∑∞

s=0 s|Dsij| <

∞ for each i, j = 1, . . . , n. Define ξt =∑t

j=1 uj, Λ = (D0 + D1 + . . .)P = D(1)P, and

Λ1 =∑∞

i=1E(1(zt−1 < τ)1(zt−1−i < τ)utu′t−i). Let W (r, τ) denote the n-dimensional two-

parameter Brownian motion; W (r, 1) the n-dimensional one-parameter Brownian motion in

r.

Lemma 1

a : T−1/2

[Tr]∑t=1

1(zt−1 < τ)et ⇒ PW (r, τ)

b : T−1

T∑t=1

ξt−11(zt−1 < τ)e′t ⇒ Λ

[∫W (r, 1)dW (r, τ)′

]P ′

c : T−2

T∑t=1

ξt−1ξ′t−11(zt−1 < τ) ⇒ τΛ

[∫W (r, 1)W (r, 1)′dr

]Λ′

Proof of Lemma 1-a: Let vt be an i.i.d (n× 1) vector with mean zero and unity variance

Evtv′t = I. Notice that 1(zt−1 < τ)vit, i = (1, . . . , n), is a strictly stationary and ergodic

martingale difference process with variance E(1(zt−1 < τ)vit)2 = τ. For any (r, τ), the

martingale difference central limit theorem implies that

T−1/2

[Tr]∑t=1

1(zt−1 < τ)vit →d N(0, rτI),

and the asymptotic covariance kernel is

E

T−1/2

[Tr1]∑t=1

1(zt−1 < τ1)vit

T−1/2

[Tr2]∑t=1

1(zt−1 < τ2)vit

= (r1 ∧ r2)(τ1 ∧ τ2).

22

The proof of the stochastic equicontinuity for each component of T−1/2∑[Tr]

t=1 1(zt−1 < τ)vit

is lengthy, and can be found in Caner and Hansen (2001). Because the sequences of marginal

probability measures are tight, we deduce that the univariate stochastic equicontinuity car-

ries over to the vector, see Lemma A.3 of Phillips and Durlauf (1986). The final result is

deduced after applying the Cramer-Wold device and noticing that et = Pvt.

Proof of Lemma 1-b: We apply the Beveridge-Nelson decomposition to ut and obtain

ut = D(1)et + et−1 − et where et = D(L)et, D(L) =∑∞

j=0 djLj and dj =

∑∞s=j+1 ds. The

stated result is obtained by applying Theorem 2.2 in Kurtz and Protter (1991) and Lemma

1-a above.

Proof of Lemma 1-c: This is a trivial extension of result 3 of Theorem 3 of Caner and

Hansen (2001) to the multivariate case.

Proof of Theorem 1: We prove the result when the testing regression is a TVADLM

with one lag:

∆yt = δ1yt−111t−1 + δ2yt−112t−1 + α1∆yt−1 + et.

The proof can be easily extended to a general TVADLM with p lags. Under the null

hypothesis δ1 = δ2 = 0, it follows that α(L)∆yt = et, α(L) = I − α1L. Define ut =

D(L)et, D(L) = α(L)−1, then yt−1 = ξt−1 in Lemma 1. Let yt−1 = (y′t−111t−1, y′t−112t−1)

′,

wt−1 = (y′t−1,∆y′t−1)′, δ = (δ1, δ2), and θ = (δ, α1). Consider the scaling matrix

Υ = diag(TIn, T1/2In).

23

We have

Υ(θ − θ

)=

[Υ−1

(T∑t=1

wt−1w′t−1

)Υ−1

]−1 [Υ−1

T∑t=1

wt−1et

],

where

Υ−1

(T∑t=1

wt−1w′t−1

)Υ−1 ⇒ B =

B11 B12

B′12 B22

,

B11 =

τΛ[∫

W (r, 1)W (r, 1)′dr]Λ′ 0

0 (1− τ)Λ[∫

W (r, 1)W (r, 1)′dr]Λ′

,

B12 = op(1),

B22 = E∆yt−1∆y′t−1,

and

Υ−1

T∑t=1

wt−1et ⇒ C =

C1

C2

,

C1 =

Λ[∫

W (r, 1)dW (r, τ)′]P ′

Λ[∫

W (r, 1)dW (r, 1)′ −∫W (r, 1)dW (r, τ)′

]P ′

C2 = Op(1).

In order to obtain B11 recall that 11t−1 = 1(zt−1 < τ), 12t−1 = 1 − 1(zt−1 < τ), and

11t−112t−1 = 0. The off-diagonal elements of B11 are zeros because of the orthogonality of

11t−1 and 12t−1. The upper-left term of B11 is due to Lemma 1-c; the lower-right term is

due to the facts that T−2∑T

t=1 yt−1y′t−1 ⇒ Λ

[∫W (r, 1)W (r, 1)′dr

]Λ′, cf. Lemma 3.1.(b) of

Phillips and Durlauf (1986), and

T−2

T∑t=1

yt−1y′t−112t−1 = T−2

T∑t=1

yt−1y′t−1(1−1(zt−1 < τ)) ⇒ (1−τ)Λ

[∫W (r, 1)W (r, 1)′dr

]Λ′.

24

B12 = op(1) since T−3/2∑T

t=1 yt−1∆y′t−1 = T−1/2T−1∑T

t=1 yt−1∆y′t−1 = T−1/2Op(1) = op(1).

B22 is due to the Weak Law of Large Number.

We obtain C1 by Lemma 1-b, and because T−1∑T

t=1 yt−1e′t ⇒ Λ

[∫W (r, 1)dW (r, 1)′

]P ′,

cf. Proposition 18.1.(f) of Hamilton (1994), and

T−1

T∑t=1

yt−112t−1e′t =

T−2

T∑t=1

yt−1(1− 1(zt−1 < τ))e′t ⇒ Λ

[∫W (r, 1)dW (r, 1)′ −

∫W (r, 1)dW (r, τ)′

]P ′.

C2 is due to the Central Limit Theorem.

Because Υ−1(∑T

t=1wt−1w′t−1

)Υ−1 is asymptotically block-diagonal, it follows that

Υ(θ − θ

)⇒

B−111 C1

B−122 C2

.

In particular, B−111 C1 indicates that

TIn(δ1 − δ1) ⇒ τ−1Λ′−1

[∫W (r, 1)W (r, 1)′dr

]−1 [∫W (r, 1)dW (r, τ)′

]P ′,

and

TIn(δ2 − δ2) ⇒

(1− τ)−1Λ′−1

[∫W (r, 1)W (r, 1)′dr

]−1 [∫W (r, 1)dW (r, 1)′ −

∫W (r, 1)dW (r, τ)′

]P ′.

This implies the super-consistency of δ, and the consistency of Ωτ . For given τ, let Yτ and

e be the matrices stacking (yt−111t−1, yt−112t−1) and et, respectively. Define Q as projection

25

onto the orthogonal space of the lagged term ∆yt−1. Write the OLS estimates for (δ1, δ2) as

δτ = (δ1, δ2). Then the Wald statistic for H0 : δ1 = δ2 = 0 is

F (τ) = vec(δτ )]′[(Y ′

τQYτ )−1 ⊗ Ωτ

]−1

[vec(δτ )]

= trace[Ω−1/2


−1(Y ′τQe)Ω−1/2′

τ

].

We can show that

T−1Y ′τQe = T−1Σyt−1e

′t − T−1/2

(T−1Σyt−1∆y′t−1

) (T−1Σ∆yt−1∆y′t−1

)−1 (T−1/2Σ∆yt−1e

′t

)= T−1Σyt−1e

′t + op(1) ⇒ C1,

and

T−2Y ′τQYτ = T−2Σyt−1y

′t−1 − T−1

(T−1Σyt−1∆y′t−1

) (T−1Σ∆yt−1∆y′t−1

)−1 (T−1Σ∆yt−1y

′t−1

)= T−2Σyt−1y

′t−1 + op(1) ⇒ B11

Combining with Ω−1/2τ = P−1 + op(1) we can show

F (τ) = trace[Ω−1/2


−1(Y ′τQe)Ω−1/2′

τ

]= trace

[Ω−1/2

τ C ′1B

−111 C1Ω

−1/2′

τ

]+ op(1)

⇒ trace[J ′0J

−11 J0

],

26

where

J0 =

∫W (r, 1)dW (r, τ)′∫

W (r, 1)dW (r, 1)′ −∫W (r, 1)dW (r, τ)′

J1 =

τ[∫

W (r, 1)W (r, 1)′dr]

0

0 (1− τ)[∫

W (r, 1)W (r, 1)′dr] .

The remaining steps for the convergence of supF follow from the continuous mapping the-

orem. End of Proof.

Proof of Theorem 2: Only a sketch of the proof is provided since it is similar to that

of Theorem 1. The testing regression is a TVADLM with one lag and regime-dependent

intercept terms:

∆yt = (c11 + δ1yt−1)11t−1 + (c21 + δ2yt−1)12t−1 + α1∆yt−1 + et.

Under the null hypothesis δ1 = δ2 = 0 and Assumption 3 c11 = c21 = 0 it follows that yt−1 =

ξt−1 in Lemma 1. For given τ, let Yτ and e be the matrices stacking (yt−111t−1, yt−112t−1) and

et, respectively. Define Qd as projection onto the orthogonal space of (11t−1, 12t−1,∆yt−1).

Because of the orthogonality between 11t−1 and 12t−1, and E∆yt−1 = 0, Qd = I− (Qd1+Qd

2+

Qd3), where Q

d1 is projection onto the space of 11t−1, Q

d2 is projection onto the space of 12t−1,

27

and Qd3 is projection onto the space of ∆yt−1. It follows that

T−1Y ′τQ

de = T−1Y ′τ e− T−1Y ′

τQd1e− T−1Y ′

τQd2e− T−1Y ′

τQd3e

T−1Y ′τ e ⇒ C1

T−1Y ′τQ

d1e ⇒ τ−1

τΛ∫W (r, 1)dr

0

W (1, τ)P ′

T−1Y ′τQ

d2e ⇒ (1− τ)−1

0

(1− τ)Λ∫W (r, 1)dr

[W (1, 1)−W (1, τ)]P ′

T−1Y ′τQ

d3e = op(1).

To summarize, we have

T−1Y ′τQ

de = C1 − Λ

∫W (r, 1)drW (1, τ)∫

W (r, 1)dr[W (1, 1)−W (1, τ)]

P ′ + op(1).

Similarly we can show that

T−2Y ′τQ

dYτ = B11−Λ

τ∫W (r, 1)dr

∫W ′(r, 1)dr 0

0 (1− τ)∫W (r, 1)dr

∫W ′(r, 1)dr

Λ′+op(1).

Finally, it follows that

F d(τ) = trace[Ω−1/2

τ (e′QdYτ )(Y′τQ

dYτ )−1(Y ′

τQde)Ω−1/2′

τ

]⇒ trace

[Jd′

0 (Jd1 )

−1Jd0

],

28

where

Jd0 = J0 −

∫W (r, 1)drW (1, τ)∫

W (r, 1)dr[W (1, 1)−W (1, τ)]

Jd1 = J1 −

τ∫W (r, 1)dr

∫W ′(r, 1)dr 0

0 (1− τ)∫W (r, 1)dr

∫W ′(r, 1)dr

.

End of Proof.

Proof of Theorem 3: Consider the testing regression given by

∆yt = (c11 + c12t+ δ1yt−1)11t−1 + (c21 + c22t+ δ2yt−1)12t−1 + et.

For easy of exposition, the above regression excludes the lagged term ∆yt−1 since it is asymp-

totically irrelevant, as shown by the proofs of Theorems 1 and 2. Under the null hypothe-

sis (10) and Assumption 3, yt = y0 + c0t + ξt where ξt =∑t

j=1 uj, ut = α(L)−1et. Because

yt is collinear with the time trend, we consider the hypothetical regression in the canonical

form

∆yt = (c∗11 + c∗12t+ δ∗1ζt−1)11t−1 + (c∗21 + c∗22t+ δ∗2ζt−1)12t−1 + et,

where ζt−1 = yt−1 − c0(t− 1), δ∗1 = δ1, δ∗2 = δ2, c

∗11 = c11 − c0, c

∗21 = c21 − c0, c

∗12 = c12 + c0,

c∗22 = c22 + c0. For given τ, stacking (ζt−111t−1, ζt−112t−1) as Xτ , (11t−1, t11t−1) as M1, and

(12t−1, t12t−1) as M2. Because 11t−112t−1 = 0, ∀t, the projection onto the orthogonal space of

(11t−1, t11t−1, 12t−1, t12t−1), denoted by Qt, is

Qt = I −M1(M′1M1)

−1M ′1 −M2(M

′2M2)

−1M ′2.

29

We can show that

T−1X ′τQ

te = T−1X ′τ [I −M1(M

′1M1)

−1M ′1 −M2(M

′2M2)

−1M ′2]e,

T−1X ′τe ⇒ C1,

T−1X ′τM1(M

′1M1)

−1M ′1e ⇒ C21C

−122 C23,

T−1X ′τM2(M

′2M2)

−1M ′2e ⇒ C31C

−132 C33,

where

C21 =

τΛ∫W (r, 1)dr τΛ

∫rW (r, 1)dr

0 0

≡ ΛC∗21,

C22 =

τ τ∫rdr

τ∫rdr τ

∫r2dr

,

C23 =

W (1, τ)P ′∫rdW (r, τ)′P ′

≡ C∗23P

′.

C31 =

0 0

(1− τ)Λ∫W (r, 1)dr (1− τ)Λ

∫rW (r, 1)dr

≡ ΛC∗31,

C32 =

1− τ (1− τ)∫rdr

(1− τ)∫rdr (1− τ)

∫r2dr

,

C33 =

[W (1, 1)−W (1, τ)]P ′[∫rdW (r, 1)′ −

∫rdW (r, τ)′

]P ′

≡ C∗33P

′.

To summarize,

T−1X ′τQ

te ⇒ C1 − C21C−122 C23 − C31C

−132 C33.

30

In a similar fashion we can show

T−2X ′τQ

tXτ ⇒ B11 − C21C−122 C

′21 − C31C

−132 C

′31.

Finally,

F t(τ) = trace[Ω−1/2

τ (e′QtXτ )(X′τQ

tXτ )−1(X ′

τQte)Ω−1/2′

τ

]⇒ trace

[J t′

0 (Jt1)

−1Jdt

],

where

J t0 = J0 − C∗

21C−122 C

∗23 − C∗

31C−132 C

∗33

J t1 = J1 − C∗

21C−122 C

∗′21 − C∗

31C−132 C

∗′31.

End of Proof.

31

References

Andrews, D.W.K., and Ploberger, W. (1994), “Optimal tests when a nuisance parameter is

present only under the alternative,” Econometrica, 62, 1383–1414.

Balke, N., and Fomby, T. (1997), “Threshold cointegration,” International Economic Review,

38, 627–645.

Banerjee, A., Dolado, J.J., Hendry, D., and Smith, G.W. (1986), “Exploring equilibrium

relationships in econometrics through static models: Some monte carlo evidence,” Oxford

Bulletin of Economics and Statistics, 48, 253–277.

Boswijk, H. P. (1994), “Testing for an unstable root in conditional and structural error

correction models,” Journal of Econometrics, 63, 37–60.

Caner, M., and Hansen, B. E. (2001), “Threshold autoregression with a unit root,” Econo-

metrica, 69, 1555–1596.

Davies, R. B. (1977), “Hypothesis testing when a nuisance parameter is present only under

the alternative,” Biometrika, 64, 247–254.

(1987), “Hypothesis testing when a nuisance parameter is only identified under the alter-

native,” Biometrika, 74, 33–43.

Enders, W., and Siklos, P. L. (2001), “Cointegration and threshold adjustment,” Journal of

Business & Economic Statistics, 19, 166–176.

Giovanini, A. (1988), “Exchange rates and traded goods prices,” Journal of International

Economics, 24, 45–68.

Hamilton, J. D. (1994), Time Series Analysis, Princeton: Princeton University Press.

32

Hansen, B. E. (1996), “Inference when a nuisance parameter is not identified under the null

hypothesis,” Econometrica, 64, 413–430.

Horvath, M. T., and Watson, M. W. (1995), “Testing for cointegration when some of the

cointegrating vectors are prespecified,” Econometric Theory, 11, 984–1014.

Jang, B-G., Koo, H. K., Liu, H., and Loewenstein, M. (2007), “Liquidity premia and trans-

action costs,” Journal of Finance, 62, 2329–2366.

Kremers, J. J. M., Ericsson, N. R., and Dolado, J. J. (1992), “The power of cointegration

tests,” Oxford Bulletin of Economics and Statistics, 54, 325–348.

Kurtz, T., and Protter, P. (1991), “Weak limit theorems for stochastic integrals and stochas-

tic differential equations,” Annals of Probability, 19, 1035–1070.

Li, J., and Lee, J. (2010), “ADL tests for threshold cointegration,” Journal of Time Series

Analysis, 31, 241–254.

Phillips, P. C. B., and Durlauf, S. (1986), “Multiple time series regression with integrated

processes,” Review of Economic Studies, 53, 473–496.

Phillips, P. C. B., and Ouliaris, S. (1990), “Asymptotic properties of residual based tests for

cointegration,” Econometrica, 58, 165–193.

Pippenger, M. K., and Goering, G. E. (1993), “A note on the empirical power of unit root

tests under threshold processes,” Oxford Bulletion of Economics and Statistics, 55, 473–

481.

Seo, M. (2006), “Bootstrap testing for the null of no cointegration in a threshold vector

error,” Journal of Econometrics, 134, 129–150.

33

Sephton, P. S. (2003), “Spatial market arbitrage and threshold cointegration,” American

Journal of Agricultural Economics, 85, 1041—1046.

Sims, C. A., Stock, J. H., and Watson, M. W. (1990), “Inference in linear time series models

with some unit roots,” Econometrica, 58, 113–144.

Taylor, M. P., and Peel, D. A. (2000), “Nonlinear adjustment, long-run equilibrium and

exchange rate fundamentals,” Journal of International Money and Finance, 19, 33—53.

34

Table 1: Asymptotic Critical Values

Φ = [0.10, 0.90] w/o constant w/ constant w/ trend1% 5% 10% 1% 5% 10% 1% 5% 10%

n = 2 35.0 30.6 28.3 49.0 43.7 40.6 62.4 54.8 51.5n = 3 61.4 55.3 52.1 78.2 71.7 67.9 95.1 87.3 83.5n = 4 95.5 88.1 83.9 114.8 107.5 103.2 135.5 126.9 122.5n = 5 138.0 128.8 123.8 160.2 151.4 146.8 184.5 175.0 169.7


n = 2 35.5 30.2 27.9 48.2 42.5 39.9 60.0 54.3 51.1n = 3 61.3 55.2 52.1 77.3 70.4 67.2 95.1 86.9 82.8n = 4 95.4 87.7 83.6 116.6 106.9 102.6 136.1 126.7 122.2n = 5 137.2 127.9 123.4 161.3 151.0 145.5 184.1 174.2 169.0


n = 2 35.4 29.6 27.3 48.0 42.0 38.9 59.4 53.5 50.4n = 3 60.7 54.2 50.9 77.6 70.3 66.7 94.2 86.1 82.4n = 4 95.9 86.8 82.7 112.4 105.6 101.4 135.8 126.0 121.4n = 5 135.8 125.9 121.6 158.9 149.4 144.7 185.4 173.4 168.2

35

Table 2: Estimation Results

xt y1t − 2.81− 0.68y2tτ -1.25

RegressorsEquation (30) y1t−111t−1 y1t−112t−1 y2t−111t−1 y2t−112t−1 ∆y1t−1 ∆y2t−1 11t−1 12t−1

Coefficients 0.06 -0.05 -0.03 0.03 -0.03 0.25 -0.15 0.18T Values 0.77 −3.98∗∗ -0.53 3.04∗∗ -1.32 5.91∗∗ -0.99 4.03∗∗

Ljung-Box Q Test 1.13

RegressorsEquation (31) y1t−111t−1 y1t−112t−1 y2t−111t−1 y2t−112t−1 ∆y1t−1 ∆y2t−1 11t−1 12t−1

Coefficients 0.22 -0.03 -0.23 0.03 0.40 0.30 0.19 0.06T Values 1.93∗ -1.45 −3.43∗∗ 1.97∗∗ 10.75∗∗ 5.10∗∗ 0.88 0.90Ljung-Box Q Test 0.52

Tests for Threshold CointegrationADL Test supτ∈[0.15,0.85] F

d(τ) = 128.04∗∗

Enders-Siklos Test 18.62∗∗

Note: ∗∗ significant at the 5% level, ∗ significant at the 10% level.

36

0 0.5 10

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

0.5Panel A

c1

Size

ADLESECM

0 0.5 10

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

0.5Panel B

c2

Size

ADLESECM

0 0.5 10

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

0.5Panel C

d1

Size

ADLESECM

Figure 1: Rejection Frequency of Various Tests under the Null Hypothesis at the 5% Level.

37

−1 −0.5 00

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1Panel A, f=0

k1

Powe

r

ADLESECMADLPESP

−1 −0.5 00

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1Panel B, f=0.05

k1

Powe

r

ADLESECMADLPESP

−1 −0.5 00

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1Panel C, f=0.1

k1

Powe

r

ADLESECMADLPESP

Figure 2: Rejection Frequency of Various Tests under the Alternative Hypothesis at the 5%

Level.

38

0 100 200 300 400 500 6000

5

10

15

20Panel A: Time Series Plot

Obs

Fedfunds10ybond

0 100 200 300 400 500 600−4

−2

0

2

4Panel B: Error Correction Term

Obs

Figure 3: Time Series Plot

39

Documents

System-Equation ADL Test for Threshold ... - Miami University · Miami University . 2013 . Working Paper # - 2013-05. System-Equation ADL Test For Threshold Cointegration With an