
How well does nonlinear mean reversion solve the PPP puzzle?


Journal of International Money and Finance 29 (2010) 919–937


Stephen Norman*

University of Washington, Tacoma, Milgard School of Business, 1900 Commerce St., Campus Box 358420, Tacoma, WA 98401, USA

JEL classification: C22, F31

Keywords: Nonlinear impulse response analysis; Purchasing power parity; Half life; Smooth transition autoregressive model; Real exchange rates

* Tel.: +1 253 692 4827; fax: +1 253 692 4523. E-mail address: [email protected]

0261-5606/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.jimonfin.2010.01.009

Abstract

This paper addresses the degree to which models that exhibit nonlinear mean reversion (NMR) present a resolution to the Purchasing Power Parity Puzzle. It develops a method of estimating a representative distribution of half lives that is based upon the observed distribution of shocks in a given time series, rather than on arbitrarily chosen shock sizes, which is the current practice in the literature. This approach is implemented with data on five real exchange rates. The empirical analysis shows that half lives shorter than the consensus are observed frequently enough to support the proposition that NMR is a solution to the PPP puzzle.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

The concept of purchasing power parity (PPP) is one of the most empirically well studied theories in international economics, perhaps because evidence of its existence has been so elusive. Specifically, the well known "Purchasing Power Parity Puzzle" (Rogoff, 1996) refers to the consensus that deviations from PPP seem to be overly persistent despite the fact that the real exchange rate is very volatile. The half lives of shocks to PPP, which have been estimated to be on the order of three to five years (Rogoff, 1996), seem to be extremely long even when PPP is viewed as a long run concept. This has troubled many researchers who believe that some form of this theory should hold given the opportunity for arbitrage in international goods markets.

Modeling the dynamics of real exchange rates using models which exhibit nonlinear mean reversion (NMR) is one potential solution to the PPP puzzle that has received much attention over the past several years. Studies which have found evidence of NMR in real exchange rates include Sarno et al. (2004), Taylor et al. (2001), Baum et al. (2001), Taylor and Peel (2000) and Michael et al. (1997). Despite the evidence supporting the statistical significance of nonlinear mean reversion in real exchange rates, establishing a good understanding of the economic significance of these findings has received only limited attention in the literature. The purpose of this paper is to address how often speeds of mean reversion are observed in real exchange rates which would lead to half lives shorter than the three- to five-year consensus identified by Rogoff (1996). This paper shows that half lives shorter than the consensus are observed frequently enough to support the proposition that NMR is a solution to the PPP puzzle.

Previous studies including Taylor et al. (2001), Sarno and Valente (2006), and Lothian and Taylor (2008) have sought to address how NMR in real exchange rates affects estimated half lives. While it has been found that large shocks produce half lives that are well below three years, the observed frequency of such shocks has not been addressed. The main contribution of this paper is the development of a method to characterize a real exchange rate's empirical distribution of half lives.¹ This procedure is based upon using the observed distribution of shocks to the real exchange rate rather than choosing shock sizes arbitrarily. The estimated distribution of half lives allows one to estimate the percentage of time a half life would be observed below the three- to five-year consensus.

This paper also addresses the use of impulse response analysis, which produces the half life, when using models which exhibit nonlinear mean reversion. Compared with the linear case, estimating half lives in the presence of NMR is more complicated because the exact shape of the impulse response function depends on the initial conditions at the time of the shock and the size of the shock itself. It is argued here that setting the initial condition equal to the long run mean and using the observed deviations from the mean as sample shocks is the most appropriate method when estimating nonlinear half lives of real exchange rates.

The proposed methods are applied to modeling NMR in five US based real exchange rates using a STAR model. The results suggest that NMR produces half lives lower than the three- to five-year benchmark a meaningful percentage of the time. Half lives less than five years are very frequent, with observed half lives shorter than five years occurring 100% of the time in three of the five real exchange rates. Half lives less than three years occur on average 30% of the time. These results show that NMR in real exchange rates produces half lives that are shorter than the consensus with a non-trivial frequency. These findings bolster the literature which supports the idea that NMR is a solution to the PPP puzzle.

The remainder of the paper is organized as follows: Section 2 reviews two popular approaches to nonlinear impulse response analysis. Section 3 develops a method of implementing impulse response analysis when using a model that exhibits NMR. Section 4 provides a discussion of characterizing the distribution of half lives in the context of NMR. Section 5 contains an empirical application using a smooth transition autoregressive model and monthly real exchange rate data. Section 6 concludes the paper.

2. Nonlinear impulse response analysis

Implementing impulse response analysis in a nonlinear setting follows at least two main approaches, provided by Gallant et al. (1993) (hereafter GRT) and Koop et al. (1996) (hereafter KPP).² These methods address the two major difficulties with estimating impulse response functions in the context of a nonlinear model. The first complication is that the shape of the impulse response function is dependent on the initial condition. In the context of NMR, this can be seen by noting that if the initial condition is chosen near the mean, where the process is very persistent, the impulse response function will revert to zero very slowly. On the other hand, if the initial condition is chosen such that it is far from the mean, the impulse response function will tend to zero quickly. The other complication is the fact that the size of the shock itself will also influence the shape of the impulse response function. The reasoning is similar to that given with respect to the dependency on the initial condition: larger shocks will imply shorter half lives because the process will be further away from the mean, where there is faster mean reversion. In linear models, the initial condition and size of the shock do not affect the size of the estimated half life.

¹ To be clear, this paper does not address the sampling distribution of the half life estimator. Rather, the estimated distribution of half lives is based upon the distribution of observed deviations from PPP.

² While both approaches focus on multivariate models, in this paper only univariate methods will be addressed.
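The linear benchmark can be made concrete with a small sketch. It assumes a zero-mean AR(1) process, y_t = rho * y_{t-1} + e_t, which is not a model estimated in this paper; the function names are hypothetical. In that case the response to a shock after h periods is the shock times rho^h, so the half life depends on rho alone.

```python
import math


def ar1_half_life(rho: float) -> float:
    """Half life, in periods, of a shock to y_t = rho * y_{t-1} + e_t (0 < rho < 1)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    # The response after h periods is shock * rho**h, so the half life solves rho**h = 1/2.
    return math.log(0.5) / math.log(rho)


def ar1_response_share(rho: float, shock: float, horizon: int) -> float:
    """Fraction of the original shock that remains after `horizon` periods."""
    return (shock * rho**horizon) / shock


# Monthly persistence implied by half lives of three and five years.
rho_3y = 0.5 ** (1 / 36)
rho_5y = 0.5 ** (1 / 60)
```

Calling `ar1_response_share` with shocks of different sizes returns the same share, which is the sense in which the linear half life is independent of the shock size.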

2.1. Gallant, Rossi, and Tauchen approach

GRT describe the impulse response function of a nonlinear model as the difference between the conditional mean perturbed by some shock and the unperturbed conditional mean. Following GRT, suppose $\{y_t\}$ is a stationary process with a conditional density, $f(y \mid x)$, that depends on $p$ lags, with the lags of $y_t$ denoted as $x_t = (y_{t-1}, \ldots, y_{t-p})$. The conditional mean of $y_{t+j}$ given the initial condition $x_0$ is

$$\hat{y}_j(x_0) = E\left[y_{t+j} \mid x_t = x_0\right] = \int \cdots \int \left[\prod_{i=0}^{j-1} f(y_{i+1} \mid x_i)\right] dy_1 \cdots dy_{j-1}. \tag{1}$$

Because the analytical solution to (1) is unattainable for most nonlinear models, Monte Carlo integration is usually used to estimate the conditional mean above.

The conditional mean is perturbed by a scalar shock, $\delta$, in the following manner. Given an initial condition, $x_0$, define $x_0^* \equiv (y_{-1} + \delta, \ldots, y_{-p})$. The impulse response function at time $t + j$ is then defined as

$$IRF(j, \delta, x_0) \equiv \hat{y}_j(x_0^*) - \hat{y}_j(x_0).$$

Note that the impulse response function (IRF) depends on the size of the shock $\delta$ and the initial condition $x_0$.
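The Monte Carlo approach to the conditional mean and the resulting impulse response can be sketched as follows. This is only an illustration under stated assumptions: the one-lag exponential smooth transition process in `estar_step`, its parameter values, and the function names are all hypothetical and are not the models estimated in this paper. The same random draws are used for the perturbed and unperturbed paths, a common way to reduce simulation noise in the difference.

```python
import numpy as np


def estar_step(y_prev, eps, mu=0.0, rho1=1.0, rho2=0.5, gamma=2.0):
    """One step of an illustrative one-lag ESTAR process (made-up parameters)."""
    dev = y_prev - mu
    weight = 1.0 - np.exp(-gamma * dev**2)  # transition weight in [0, 1)
    return mu + rho1 * dev + (rho2 - rho1) * dev * weight + eps


def conditional_mean_path(y0, horizon, sigma, n_rep, rng):
    """Monte Carlo estimate of the conditional mean at steps 0..horizon, starting from y0."""
    eps = rng.normal(0.0, sigma, size=(n_rep, horizon))
    y = np.full(n_rep, float(y0))
    path = [y.mean()]
    for j in range(horizon):
        y = estar_step(y, eps[:, j])
        path.append(y.mean())
    return np.array(path)


def grt_irf(y0, delta, horizon, sigma=0.03, n_rep=5000, seed=0):
    """Perturbed minus unperturbed conditional mean, in the spirit of GRT."""
    # Re-seeding gives both paths identical innovations (common random numbers).
    perturbed = conditional_mean_path(y0 + delta, horizon, sigma, n_rep,
                                      np.random.default_rng(seed))
    baseline = conditional_mean_path(y0, horizon, sigma, n_rep,
                                     np.random.default_rng(seed))
    return perturbed - baseline


irf_big = grt_irf(0.0, 0.4, 24)
irf_small = grt_irf(0.0, 0.1, 24)
```

Comparing `irf_big / 0.4` with `irf_small / 0.1` shows the dependence on shock size: under these assumed parameters the larger shock dies out proportionally faster.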

GRT suggest that one "inspect a scatter plot of the data ... and visually determine shocks ... that appear typical relative to the historical dispersion of the data" (p. 877). This is critical considering that the impulse response function should be based on shocks that are representative of the data. For example, in the case of NMR models, it would be possible to generate half lives of almost any size by simply choosing the size of the shock to be sufficiently large or small. GRT propose two strategies to deal with the dependence of the impulse response function on the choice of the initial condition or history of the time series. The first is to simply set the initial condition, $x_0$, to $E[x_t]$. The second is to modify the impulse response function to be conditional on the "average history" of the time series. This is done by drawing $x_0$ from the empirical distribution of $x_t$, computing an impulse response function for each drawing, and then averaging across all impulse response functions.

2.2. Koop, Pesaran, and Potter approach

KPP define the "traditional impulse response function" as

$$IRF(j, \delta, \omega_{t-1}) \equiv E\left[y_{t+j} \mid v_t = \delta, v_{t+1} = 0, \ldots, v_{t+j} = 0, \omega_{t-1}\right] - E\left[y_{t+j} \mid v_t = 0, v_{t+1} = 0, \ldots, v_{t+j} = 0, \omega_{t-1}\right],$$

where the model is given by

$$y_t = F(y_{t-1}, \ldots, y_{t-p}) + v_t,$$

$y_t$ is assumed to be a Markovian process, and $\omega_{t-1}$ is the information set at time $t - 1$. The dependence of the impulse response function on $\omega_{t-1}$ and $\delta$ reflects the relevance of the history of the process and the size of the shock.

To deal with the problem of choosing the initial condition and size of the shock, KPP utilize the generalized impulse response function (GIRF),

$$GIRF(j, v_t, \Omega_{t-1}) = E\left[y_{t+j} \mid v_t, \Omega_{t-1}\right] - E\left[y_{t+j} \mid \Omega_{t-1}\right], \tag{2}$$

where $\Omega_{t-1}$ is the random variable of which $\omega_{t-1}$ is a specific realization. KPP note that because (2) is the difference of two expectations conditional on random variables, GIRF itself is a random variable. They propose that a natural choice for the distributions of $v_t$ and $\omega_{t-1}$ are those that generate the time series itself. It is also possible to condition on a specific history and/or a specific shock:

$$GIRF(j, v_t, \Omega_{t-1} = \omega_0) = E\left[y_{t+j} \mid v_t, \Omega_{t-1} = \omega_0\right] - E\left[y_{t+j} \mid \Omega_{t-1} = \omega_0\right] \tag{3}$$

$$GIRF(j, v_t = \delta, \Omega_{t-1}) = E\left[y_{t+j} \mid v_t = \delta, \Omega_{t-1}\right] - E\left[y_{t+j} \mid \Omega_{t-1}\right] \tag{4}$$

$$GIRF(j, v_t = \delta, \Omega_{t-1} = \omega_0) = E\left[y_{t+j} \mid v_t = \delta, \Omega_{t-1} = \omega_0\right] - E\left[y_{t+j} \mid \Omega_{t-1} = \omega_0\right]. \tag{5}$$

KPP note that the expected value of the unconditional GIRF, in (2), and the GIRF only conditional on a specific history, in (3), is zero, while the GIRF conditional on a specific shock (equations (4) and (5)) is in general not zero. KPP focus their attention on the use of the GIRF as a random variable, and as such, only condition on a specific history or a specific shock, not both.

Contrasting their work with that of GRT, KPP note that when GRT condition their IRF on the "average history," this is similar to simply looking at the expected value of (4),

$$E\left[GIRF(j, v_t = \delta, \Omega_{t-1})\right] = E\left[y_{t+j} \mid v_t = \delta\right] - E\left[y_{t+j}\right].$$

While the average value may be of interest in reporting a representative IRF, KPP observe that the expected value of a nonlinear impulse response "can hide a great deal of important information" (p. 131), because the effect of shocks may be dependent on the history of the process. This is particularly true of nonlinearly mean reverting models because in the presence of NMR it follows that shocks are less persistent the farther the process is from its mean.

3. Impulse response analysis for NMR models

3.1. Modeling shocks

There are at least two separate perspectives regarding how to model a shock taking place at time $t$, $\delta_t$. Consider the following possibly nonlinear model,

$$y_t = F(x_t; \theta) + \varepsilon_t, \tag{6}$$

where $x_t = (y_{t-1}, \ldots, y_{t-p})$ and $\theta$ is a set of parameters. First, $\delta_t$ could be viewed as the total effect of the individual innovations $(\varepsilon_t, \ldots, \varepsilon_1)$ that have moved the process away from its mean up to time $t$:

$$\delta_t = E[y_t] - y_t. \tag{7}$$

In an AR(1) model with no intercept and an autoregressive parameter $\rho$ it would follow that $\delta_t = E[y_t] - (\varepsilon_t + \rho\varepsilon_{t-1} + \rho^2\varepsilon_{t-2} + \cdots)$. It might be more appropriate to call this a deviation rather than a shock. Second, $\delta_t$ could be viewed as the value of the random component of a time series that occurs in any given time period:

$$\delta_t = y_t - F(x_t; \theta) = \varepsilon_t. \tag{8}$$

Under this perspective the size of the shock is equal to the value of the error term.

The practical differences between these two modeling perspectives are evident when one considers the selection of a representative or typical shock. If one modeled shocks according to the deviation perspective in (7), representative shocks could be identified by examining the dispersion of the individual observations of the data with respect to the mean. On the other hand, if the shock were modeled according to (8), then the residuals would provide shocks that would be typical of the data. The fact that GRT suggest inspecting the scatter plot of the data to facilitate the determination of the shock size suggests that they choose to view shocks as the accumulation of individual one period innovations. KPP, on the other hand, draw from the residuals to choose a representative shock, which implies that they adopt the contrasting perspective.
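The difference between the two perspectives can be illustrated with simulated data. The sketch below assumes a linear AR(1) process with made-up parameters (not one of the series studied here) and compares the dispersion of the deviations in (7) with that of the one period innovations in (8), the latter recovered as OLS residuals.

```python
import numpy as np

rng = np.random.default_rng(1)
rho, sigma, T = 0.95, 0.03, 2000  # illustrative values only

# Simulate y_t = rho * y_{t-1} + eps_t.
eps = rng.normal(0.0, sigma, T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho * y[t - 1] + eps[t]

# Perspective (7): deviations from the mean (sample mean stands in for E[y_t]).
deviations = y.mean() - y

# Perspective (8): one period innovations, estimated as residuals from
# an OLS regression of y_t on a constant and y_{t-1}.
X = np.column_stack([np.ones(T - 1), y[:-1]])
beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
innovations = y[1:] - X @ beta

print("std of deviations: ", deviations.std())
print("std of innovations:", innovations.std())
```

For a persistent process the deviations are far more dispersed than the innovations, so a "typical" shock is much larger under (7) than under (8).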

In the context of nonlinear mean reversion, the perspective adopted in this paper is that the question of how long it should be expected for a process to return to its long run equilibrium is more relevant than how persistent one period innovations are. The motivation for this belief is that the central issue under investigation in this paper is modeling reversion towards the mean. Consequently, modeling shocks according to the perspective of being deviations from the equilibrium is more appropriate than focusing on one period innovations. As a result, the half lives measured in this paper address the question of how long it is expected to take a process to revert back half the distance to its long run equilibrium value. This is also the view that Taylor et al. (2001) adopt by using percentage shocks to the level of the real exchange rate of 1%, 5%, ..., 30%, 40%. Paya and Peel (2006) is another example of the use of this perspective. From this point forward the word "shock" will be synonymous with "deviation from the mean."

3.2. Selecting the initial condition

The other concern when formulating a nonlinear impulse response function is the choice of the initial condition, $x_t = (y_{t-1}, \ldots, y_{t-p})$. Again there seem to be at least two major approaches to this problem. The first approach sets the elements of the vector of initial conditions equal to specific values, and the second allows the impulse response function to be representative of the data in terms of the initial condition. It was shown in the previous section that GRT propose the use of both methods, while KPP focus mostly on the second approach. Under the perspective that a shock is an accumulation of innovations that drive a process away from its mean, it would make sense to fix the initial condition which receives the shock, $y_{t-1}$, to a value equal to the mean. Otherwise, with shocks modeled after (7), using an initial condition other than the mean would be equivalent to incorporating a perturbation into the initial condition without including a shock.

GRT suggest setting all the elements of $x_t$ equal to $E(y_t)$. Setting each element of the vector of initial conditions equal to the mean has the drawback of not being representative of the data unless $p = 1$. If $p > 1$, an alternative approach would be to find values of $(y_{t-2}, \ldots, y_{t-p})$ that are "typical" given $y_{t-1} = E(y_t)$. One method of doing this that does not rely on estimating the joint distribution, $F[y_{t-2}, \ldots, y_{t-p} \mid y_{t-1} = E(y_t)]$, would be to produce artificial data using the estimated model as the data generating process and find observations, $y_t^*$, that are sufficiently close to the mean, i.e.

$$y_t^* \ \text{s.t.} \ (\mu - s) \le y_t^* \le (\mu + s), \tag{9}$$

where $s$ is a small number. Given such a $y_t^*$, the elements of $x_t$ would be $y_t^*$ and its $p$ lags. To estimate a nonlinear impulse response function using Monte Carlo integration based on $N$ replications, one would generate data until $N$ individual observations were identified that satisfied (9).³ These observations and their $p$ lags would form a set of initial conditions sufficient to calculate the impulse response function. This approach would be more in line with the suggestion of KPP to treat the initial condition as a random variable and is the approach used in this paper.
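The search for initial conditions that satisfy (9) can be sketched as below. The two-lag smooth transition process in `step`, its coefficients, and the helper name `collect_initial_conditions` are hypothetical stand-ins for an estimated model; the band width `s` and the burn-in length are likewise arbitrary choices made for illustration.

```python
import numpy as np


def step(lags, eps, mu=0.0, gamma=2.0):
    """Illustrative two-lag ESTAR step; lags = (y_{t-1}, y_{t-2}); made-up coefficients."""
    d1, d2 = lags[0] - mu, lags[1] - mu
    weight = 1.0 - np.exp(-gamma * d1**2)
    slow = 1.1 * d1 - 0.1 * d2  # highly persistent regime near the mean
    fast = 0.6 * d1 - 0.1 * d2  # faster reversion far from the mean
    return mu + slow + (fast - slow) * weight + eps


def collect_initial_conditions(n_needed, p=2, mu=0.0, s=0.005, sigma=0.03,
                               seed=0, burn_in=200, max_steps=500_000):
    """Simulate until n_needed observations fall in [mu - s, mu + s], cf. (9)."""
    rng = np.random.default_rng(seed)
    history = [mu] * (p + 1)  # most recent value first
    found = []
    for t in range(max_steps):
        y_new = step(history[:p], rng.normal(0.0, sigma), mu=mu)
        history = [y_new] + history[:p]
        if t >= burn_in and mu - s <= y_new <= mu + s:
            found.append(tuple(history))  # y*_t followed by its p lags
            if len(found) == n_needed:
                break
    return np.array(found)


initial_conditions = collect_initial_conditions(50)
```

Each row of `initial_conditions` holds one near-mean observation and its lags, so the lags are "typical" of the simulated process rather than being fixed at the mean.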

3.3. Previous research

Lothian and Taylor (2008) and Taylor et al. (2001) both follow the approach of GRT in estimating half lives produced in a STAR framework using data on real exchange rates. Lothian and Taylor (2008) use a multivariate framework and data starting in 1820 to test for Harrod-Balassa-Samuelson effects while allowing for nonlinear mean reversion and changes in volatility across nominal exchange rate regimes. Taylor et al. (2001) use a univariate STAR model along with monthly data from the post-Bretton Woods period for four US based real exchange rates. Both studies follow a similar approach. Shocks of size 40%, 30%, 20%, 10%, 5%, and 1% are used. The half lives are estimated based on both a fixed initial history equal to the estimated mean of the process and on the average initial history. The results in the two papers coincide with each other and show that the models used produce half lives that are much shorter than three years.

³ See the Appendix for an explanation of how Monte Carlo integration was used in this paper.

In both papers the half lives produced when the impulse response function is based on the average initial history are all substantially smaller than the half lives based upon an initial exchange rate equilibrium. The reason for this is that, given a certain shock size, setting the initial condition equal to the mean will produce the longest possible half life. Thus, because the same shock sizes are used in both cases, it must necessarily be that the half lives based upon the average initial history are shorter than those based upon the initial equilibrium. Further, if the shocks are representative of the dispersion of the data from the mean, then using the same shocks with an average initial condition would produce an impulse response function which is not characteristic of the data. To see this, note that if $y_{t-1}$ is the initial condition which receives a shock and $\delta_{MAX}$ is the size of the shock which is equal to the distance between the most extreme value of the data ($y_{MAX}$) and the mean ($\mu$), then

$$y_{MAX} = y_{t-1} + \delta_{MAX} = \mu + \delta_{MAX}. \tag{10}$$

If $\delta_{MAX}$ were used to calculate an impulse response function based upon an average initial condition, then it would be true that

$$y_{t-1} + \delta_{MAX} \ge y_j \quad \text{for all } j, t = 1, \ldots, T. \tag{11}$$

In other words, $\delta_{MAX}$ added to any representative initial condition would be larger than any observation. As a result, while $\delta_{MAX}$ is typical of extreme deviations from the mean, it would not produce impulse response functions that are characteristic of the process when based upon an average initial condition. This illustrates the importance of the discussions in Sections 3.1 and 3.2.

Sarno and Valente (2006) is another study which estimates half lives of real exchange rates in a nonlinear framework. They use a long span of data on four US based real exchange rates and employ a vector error correction model while also controlling for different nominal exchange rate regimes. While Sarno and Valente (2006) only condition on the average history of the process, they do restrict their attention to shocks with a maximum size of 20%, given that larger shocks are very rare in the data they study.

4. Characterizing the distribution of half lives

Assuming the initial condition that receives the shock is equal to the estimated mean and that shocks are modeled as an accumulation of one period innovations, the remaining question is whether to use fixed values of shocks or to model the impulse response as a random variable conditioned on the empirical distribution of the shocks. The advantage of using fixed shocks is the ability to identify varying speeds of mean reversion. Using fixed shocks, though, does not provide information regarding what is characteristic of the process in general, i.e. how common each size of shock is. On the other hand, the expected value of the impulse response function, conditional on the distribution of shocks, provides a representative impulse response function, but does not characterize differing speeds of mean reversion.

In addressing the question of whether or not NMR is a solution to the PPP puzzle, it would be informative to gain an understanding of the following: (1) what speeds of mean reversion are observed in deviations from PPP, and (2) how common or rare they are. The problem, as GRT observe, is that "it is clearly impractical to report the impulse response sequences for many different [initial conditions]" (p. 887), or shocks for that matter. This issue can be ameliorated by focusing on the half life and not the impulse response function. Because the half life is a single number, and not a function, it is feasible to report all half lives associated with many different sizes of shocks. Given a sample of shocks which are representative of the time series under question, after computing the associated half lives, it would then be possible to estimate the distribution of associated half lives using kernel density estimation.

Table 1. Tests for heteroskedasticity and linearity.

| | d | Lags | ARCH(1) | ARCH(6) | ARCH(12) | LM2 | LM4 |
|---|---|---|---|---|---|---|---|
| France | 12 | 1 | 0.60 | 2.27 | 3.95 | 2.62** | 2.55** |
| Germany | 12 | 1 | 1.73 | 3.62 | 8.43 | 3.13** | 2.88** |
| Italy | 9 | 1 | 1.98 | 7.99 | 12.78 | 2.38* | 2.99** |
| Japan | 4 | 1, 2 | 8.24** | 17.50** | 18.32 | 0.81 | 2.07** |
| UK | 6 | 1, 12 | 11.07** | 20.07** | 24.23** | 1.86* | 2.47** |

LM2 and LM4 are the F-tests for nonlinearity described in van Dijk et al. (2002). ** and * denote significance at the 5% and 10% levels respectively.

Table 2. STAR estimation.

| | $a_1$ | $a_2$ | $\gamma$ | $\mu$ | $\sigma$ |
|---|---|---|---|---|---|
| France | 1.024 (0.016) | | 0.162 (0.053) | 1.738 (0.021) | 0.031 |
| Germany | 1.032 (0.017) | | 0.168 (0.047) | 0.462 (0.019) | 0.032 |
| Italy | 1.017 (0.019) | | 0.128 (0.053) | 7.434 (0.024) | 0.030 |
| Japan | 1.079 (0.009) | −0.075 (0.009) | 0.045 (0.037) | 4.690 (0.050) | 0.032 |
| UK | 1.014 (0.021) | −0.016 (0.019) | 0.098 (0.069) | −0.417 (0.036) | 0.030 |

Notes: Standard errors are given in parentheses. In the case of heteroskedasticity, heteroskedastic robust standard errors are used. The value of $\sigma$ corresponds to the standard error of the regression.

More formally, using the notation of GRT, a half life, $h$, is characterized by the following expression:

$$\min[h] \ \text{s.t.} \ E\left[y_{t+h} \mid y_{t-1} = \mu + \delta\right] - E\left[y_{t+h} \mid y_{t-1} = \mu\right] \le \frac{\delta}{2}. \tag{12}$$

Replacing $\delta$ in (12) with $\Delta$, where $\Delta$ is the random variable of which $\delta$ is a single realization, would produce a random variable $H$ which represents the distribution of half lives produced by the NMR process. The distribution of $H$ can be estimated by carrying out the following steps:

1. Specify and estimate the NMR model.
2. Calculate the shock associated with each observation $y_t$, $t = 1, \ldots, T$, as $\delta_t = y_t - \hat{\mu}$, where $\hat{\mu}$ is the estimated mean of the process.
3. For each shock, calculate the associated impulse response function using the Monte Carlo integration method described in the Appendix.
4. The half life corresponding to each shock is then calculated according to (12).
5. Draw with replacement from the set of shocks and associated half lives.
6. Given the set of half lives produced in the previous step, one can plot the empirical CDF of $H$ or estimate the PDF of $H$ using kernel density estimation.
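The steps above can be sketched in code. This is a minimal illustration rather than the procedure in the Appendix: the one-lag ESTAR process, the "estimated" parameter values, and the synthetic data are all assumptions, the initial condition is fixed at the mean (so it applies to a single lag only), and absolute values are used in (12) so that negative shocks are handled symmetrically.

```python
import numpy as np


def estar_step(y_prev, eps, mu, rho1, rho2, gamma):
    """One step of an illustrative one-lag ESTAR process."""
    dev = y_prev - mu
    weight = 1.0 - np.exp(-gamma * dev**2)
    return mu + rho1 * dev + (rho2 - rho1) * dev * weight + eps


def half_life(delta, params, sigma, horizon=240, n_rep=500, seed=0):
    """Smallest h with |IRF(h)| <= |delta| / 2, cf. (12); nan if never reached."""
    mu = params["mu"]
    eps = np.random.default_rng(seed).normal(0.0, sigma, size=(horizon, n_rep))
    shocked = np.full(n_rep, mu + delta)  # initial condition: mean plus shock
    baseline = np.full(n_rep, mu)         # initial condition: mean
    for h in range(1, horizon + 1):
        shocked = estar_step(shocked, eps[h - 1], **params)
        baseline = estar_step(baseline, eps[h - 1], **params)
        if abs(shocked.mean() - baseline.mean()) <= abs(delta) / 2:
            return h
    return np.nan


# Step 1 (assumed already done): invented "estimates", monthly frequency.
params = {"mu": 0.0, "rho1": 1.0, "rho2": 0.5, "gamma": 2.0}
sigma = 0.03

# Synthetic stand-in for an observed real exchange rate series.
rng = np.random.default_rng(42)
y = np.empty(120)
y[0] = params["mu"]
for t in range(1, y.size):
    y[t] = estar_step(y[t - 1], rng.normal(0.0, sigma), **params)

shocks = y[1:] - params["mu"]                                         # step 2
half_lives = np.array([half_life(d, params, sigma) for d in shocks])  # steps 3-4
draws = rng.choice(half_lives, size=1000, replace=True)               # step 5
draws = draws[~np.isnan(draws)]
grid = np.unique(draws)                                               # sorted support
ecdf = np.array([(draws <= g).mean() for g in grid])                  # step 6
share_below_3y = (draws < 36).mean()  # share of half lives under three years
```

A kernel density estimate of the PDF could then be obtained from `draws`, for example with `scipy.stats.gaussian_kde`. Quantities such as `share_below_3y` correspond to the kind of probability the paper reports for each exchange rate, though the numbers produced here reflect only the invented parameters.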

Fig. 1. Half Life vs. Shock Size - France. [Horizontal axis: Shock (%); vertical axis: Half Life (Years).]

Fig. 2. Half Life vs. Shock Size - Germany. [Horizontal axis: Shock (%); vertical axis: Half Life (Years).]

5. An illustration of calculating half life distributions through an empirical application

5.1. The smooth transition autoregressive model

Two of the most popular models used to study nonlinear mean reversion in real exchange rates are the threshold autoregressive (TAR) model and the smooth transition autoregressive (STAR) model. Both models describe a process whose dynamics are a convex combination of at least two different regimes. Given that PPP is based upon studying the dynamics of real exchange rates, which are calculated using an aggregate price index, there is little reason to believe that the transition between different rates of mean reversion is discrete. Thus, the STAR model is generally preferred over the TAR model when studying PPP because the transition between slow mean reversion and fast mean reversion is smooth.

Fig. 3. Half Life vs. Shock Size - Italy. [Horizontal axis: Shock (%); vertical axis: Half Life (Years).]

Fig. 4. Half Life vs. Shock Size - Japan. [Horizontal axis: Shock (%); vertical axis: Half Life (Years).]

A general transition model used in this context can be expressed as:

$$y_t = \left(a_0 + a_1 y_{t-1} + \cdots + a_p y_{t-p}\right) + \left[(b_0 - a_0) + (b_1 - a_1) y_{t-1} + \cdots + (b_p - a_p) y_{t-p}\right] R(z_t; \theta) + \varepsilon_t. \tag{13}$$

Whether the time series process follows an AR($p$) model parameterized by $a = (a_0, a_1, \ldots, a_p)$, $b = (b_0, b_1, \ldots, b_p)$, or some convex combination of the two regimes is governed by the transition function, $R(\cdot)$, which is itself a function of some transition variable, $z_t$, and a set of parameters, $\theta$.

Fig. 5. Half Life vs. Shock Size - UK. [Horizontal axis: Shock (%); vertical axis: Half Life (Years).]

Fig. 6. Empirical CDF of Half Lives - France. [Horizontal axis: Half Life (Years).]

In the smooth transition model, $R(z_t, \theta)$ is a smooth function bounded between zero and one, $R : \mathbb{R} \to [0, 1]$. The value of the transition function determines the proportion of each regime present in the dynamics of the process depending on the value of the transition variable. The most popular transition function when modeling NMR is the exponential function,

$$R(z_t; \theta) = 1 - \exp\left[-\left(\gamma / \hat{\sigma}_{z_t}\right)(z_t - \mu)^2\right]. \tag{14}$$

In this STAR model, $\theta = (\gamma, \mu)$. Larger values of $\gamma$ are associated with faster transitions. As suggested by Granger and Terasvirta (1993), the parameter $\gamma$ is divided by the sample standard deviation of the transition variable in order to speed convergence of the estimation algorithm and allow comparisons of estimates of $\gamma$ across equations.
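A direct transcription of the exponential transition function in (14) may help fix ideas. The function name and the argument `z_scale` (the sample standard deviation of the transition variable) are naming choices made here, and the numerical inputs are arbitrary.

```python
import numpy as np


def exponential_transition(z, gamma, mu, z_scale):
    """R(z) = 1 - exp(-(gamma / z_scale) * (z - mu)^2), cf. (14)."""
    z = np.asarray(z, dtype=float)
    return 1.0 - np.exp(-(gamma / z_scale) * (z - mu) ** 2)


# Arbitrary illustration: deviations of -0.2, 0 and 0.2 around mu = 0.
z = np.array([-0.2, 0.0, 0.2])
r_slow = exponential_transition(z, gamma=0.5, mu=0.0, z_scale=0.1)
r_fast = exponential_transition(z, gamma=5.0, mu=0.0, z_scale=0.1)
```

The function equals zero at `mu`, is symmetric around it, and approaches one as the deviation grows; a larger `gamma` moves the weight toward one more quickly, which is the sense in which it implies a faster transition.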

An important sub-class of STAR models is the "self-exciting" variety. A STAR model is self-exciting if the transition variable is a lagged dependent variable, $z_t = y_{t-d}$. In the context of mean reversion in relative prices, like real exchange rates, most threshold or transition models used are self-exciting.

Fig. 7. Empirical CDF of Half Lives - Germany. [Horizontal axis: Half Life (Years).]

Fig. 8. Empirical CDF of Half Lives - Italy. [Horizontal axis: Half Life (Years).]

While the choice of a transition variable is part of the modeling process, many researchers studying PPP limit their choice of transition variables to lagged real exchange rates. The motivation for this is that if one believes that arbitrage is the main force driving real exchange rates to their equilibrium level, any mean reversion that is taking place now should exist because some past value of the real exchange rate was such that arbitrage was feasible. The amount of time that it takes for economic agents to respond to large deviations from PPP determines the value of the lag of the transition variable.

One simple model that could be used in the context of PPP is given by:

$$y_t - \mu = \rho_1 (y_{t-1} - \mu) + (\rho_2 - \rho_1)(y_{t-1} - \mu) R(y_{t-1}; \theta) + \varepsilon_t. \tag{15}$$

If the theory of real exchange rates in the presence of transportation costs holds, then one would expect that $\rho_1 = 1$ and $\rho_2 < 1$. The threshold function would also need to be an inversely bell shaped function that is bounded between zero and one. The exponential function in (14) has these characteristics, and consequently is almost exclusively used to model nonlinear mean reversion.

Fig. 9. Empirical CDF of Half Lives - Japan. [Horizontal axis: Half Life (Years).]

Fig. 10. Empirical CDF of Half Lives - UK. [Horizontal axis: Half Life (Years).]

5.2. Testing for nonlinearity

The concept of testing linearity in the STAR framework is complicated by the fact that there are unidentified nuisance parameters under the null hypothesis of linearity. This can be seen in one of two ways. If the model is truly linear, then either the parameter governing the rate of transition is zero, $\gamma = 0$, or there is no difference between the autoregressive dynamics between regimes, $b = (0, \ldots, 0)$. If the former is true, then the parameters governing the autoregressive dynamics are unidentified because it doesn't matter what they are if there is no transition between regimes. On the other hand, if there is no difference between regimes, then different values of the parameter governing the rate of the transition will not change how the model fits the data.

Fig. 11. Estimated Half Life PDF - France. [Horizontal axis: Half Life (Years).]

Fig. 12. Estimated Half Life PDF - Germany. [Horizontal axis: Half Life (Years).]

To deal with the problem of nuisance parameters in the STAR framework, Saikkonen and Luukkonen(1988) propose replacing the transition function with a second order Taylor series approximationaround g¼ 0. van Dijk et al. (2002) explain that, given this approximation, the error in the Taylor seriesapproximation is then a part of the regression error term. Under the null, the Taylor approximationerror would be zero, and as a result the properties of the error term would not be affected. The test ofnonlinearity would then simply be a test that the coefficients on the variable affected by the Taylorapproximation are zero. Specifically,

Fig. 13. Estimated Half Life PDF - Italy. Horizontal axis: Half Life (Years).


Fig. 14. Estimated Half Life PDF - Japan. Horizontal axis: Half Life (Years).


$$R(z_t; \theta) = \delta_0 + \delta_1 z_t + \delta_2 z_t^2 + T(z_t; \theta) \tag{16}$$

would be the Taylor approximation with $T(z_t; \theta)$ as the Taylor remainder term. Testing linearity is then based upon the auxiliary regression

$$y_t = \phi_0' x_t + \phi_1' x_t z_t + \phi_2' x_t z_t^2 + \varepsilon_t^* \tag{17}$$

where $x_t = (1, y_{t-1}, \ldots, y_{t-p})$ and $\varepsilon_t^* = \varepsilon_t + T(y_{t-1}; \theta)(\beta - \alpha) x_t$. The linearity test can then be stated as $H_0: \phi_1 = \phi_2 = 0$ against the alternative $H_1: \phi_1 \neq 0$ or $\phi_2 \neq 0$. van Dijk et al. (2002) also note that Escribano and Jorda (1999) claim that it may be necessary to use a higher order Taylor approximation. Their test is based upon the following regression:

Fig. 15. Estimated Half Life PDF - UK. Horizontal axis: Half Life (Years).


Table 3. Descriptive statistics of half lives, h.

| Country | Mean (years) | Median (years) | P(h < 5) | P(h < 3) | P(h < 2) | P(h < 1) |
|---|---|---|---|---|---|---|
| France | 3.35 | 3.50 | 1.00 | 0.37 | 0.15 | 0.00 |
| Germany | 3.79 | 4.08 | 0.77 | 0.34 | 0.16 | 0.00 |
| Italy | 3.95 | 4.42 | 1.00 | 0.21 | 0.12 | 0.00 |
| Japan | 5.17 | 5.50 | 0.38 | 0.02 | 0.01 | 0.00 |
| UK | 2.76 | 2.92 | 1.00 | 0.60 | 0.11 | 0.00 |


$$y_t = \phi_0' x_t + \phi_1' x_t z_t + \phi_2' x_t z_t^2 + \phi_3' x_t z_t^3 + \phi_4' x_t z_t^4 + \varepsilon_t. \tag{18}$$

The null hypothesis of this test is $H_0: \phi_1 = \phi_2 = \phi_3 = \phi_4 = 0$. van Dijk et al. (2002) also report that neither test appears to dominate the other in terms of power.
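As a rough illustration of how the auxiliary regressions in (17) and (18) can be turned into a test, the sketch below computes an F-type statistic with NumPy. The function name `linearity_f_stat`, the use of an F statistic instead of an LM statistic, the choice to drop the constant from the interaction terms when the transition variable is already one of the lags (to avoid perfectly collinear columns), and the simulated series are all assumptions made for illustration and are not taken from this paper.

```python
import numpy as np

def linearity_f_stat(y, p=1, d=1, order=2):
    """F-type statistic for H0: phi_1 = ... = phi_order = 0.

    y     : 1-D series
    p     : AR lag order, so x_t = (1, y_{t-1}, ..., y_{t-p})
    d     : delay of the transition variable z_t = y_{t-d}
    order : 2 mimics regression (17), 4 mimics regression (18)
    """
    y = np.asarray(y, dtype=float)
    start = max(p, d)
    target = y[start:]
    n = target.size
    lags = [y[start - i: len(y) - i] for i in range(1, p + 1)]
    x = np.column_stack([np.ones(n)] + lags)
    z = y[start - d: len(y) - d]

    def sse(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    # Assumption: if z_t is one of the lags in x_t, interact only the lags
    # with powers of z_t so that no column is duplicated.
    base = x[:, 1:] if d <= p else x
    extra = [base * (z ** j)[:, None] for j in range(1, order + 1)]
    full = np.column_stack([x] + extra)

    sse_r = sse(x)      # restricted model: linear AR(p)
    sse_u = sse(full)   # unrestricted model: auxiliary regression
    q = full.shape[1] - x.shape[1]   # number of restrictions
    dof = n - full.shape[1]          # residual degrees of freedom
    f_stat = ((sse_r - sse_u) / q) / (sse_u / dof)
    return f_stat, q, dof

# Hypothetical usage on a simulated linear AR(1) series.
rng = np.random.default_rng(0)
shocks = rng.normal(size=300)
series = np.zeros(300)
for t in range(1, 300):
    series[t] = 0.9 * series[t - 1] + shocks[t]
print(linearity_f_stat(series, p=1, d=1, order=2))
```

The statistic would be compared with an F(q, dof) critical value; the stationarity caveat discussed in this section applies to this sketch as well.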

An additional complicating factor in testing linearity in this context stems from the difficulty of rejecting the null hypothesis of a unit root in real exchange rates. The linearity test used to detect STAR type nonlinearity is dependent on the assumption that the process is stationary. Further, even when there are no unidentified parameters under the null hypothesis of linearity, conventional inference dealing with the STAR parameters estimated by NLS is also dependent on the process being stationary. Given that this paper concerns itself with the comparison of half lives produced by linear and nonlinear models and not the stationarity of real exchange rates, the issue of testing for the existence of a unit root in the real exchange rates under study will not be addressed. This is in line with Murray and Papell (2002, p. 3) who, in a paper studying half lives of shocks produced in linear models in a PPP context, are “not concerned with the statistical question of whether unit roots in real exchange rates can be rejected and report no statistics.”

5.3. Specifying the STAR model

When applying the exponential STAR model as described in (13) to the analysis of possible nonlinear mean reversion in real exchange rates, the parameters $\beta = (\beta_0, \beta_1, \ldots, \beta_p)$ are usually not freely estimated (see Michael et al., 1997; Taylor et al., 2001; Kapetanios et al., 2003). One common constraint made in this context is

$$\beta = (\beta_0, \beta_1, \ldots, \beta_p) = (0, 0, \ldots, 0). \tag{19}$$

This implies that larger deviations will always imply a faster rate of mean reversion. The STAR models used in this paper are also specified with this constraint. An additional constraint used here is that reversion takes place towards the mean. Thus, the exponential STAR model that will be used in this paper is given by:

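$$y_t - \mu = \alpha_1 (y_{t-1} - \mu) + \ldots + \alpha_p (y_{t-p} - \mu) - \left[ \alpha_1 (y_{t-1} - \mu) + \ldots + \alpha_p (y_{t-p} - \mu) \right] R(y_{t-d}; \mu, \gamma) + \varepsilon_t, \tag{20}$$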

where $R(y_{t-d}; \mu, \gamma)$ is the exponential function given in (14).⁴
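To make the structure of (20) concrete, the following sketch evaluates its conditional mean and simulates a path from it. It assumes that the exponential function in (14) takes the standard ESTAR form $R = 1 - \exp(-\gamma (y_{t-d} - \mu)^2)$, which is not restated in this section; the function names and all parameter values are hypothetical.

```python
import math
import random

def transition(z, mu, gamma):
    """Assumed exponential transition: 0 at z = mu, approaching 1 as |z - mu| grows."""
    return 1.0 - math.exp(-gamma * (z - mu) ** 2)

def estar_mean(history, alpha, mu, gamma, d=1):
    """Conditional mean of y_t implied by (20).

    history : past values, most recent last (history[-1] is y_{t-1})
    alpha   : [alpha_1, ..., alpha_p]
    """
    linear_part = sum(a * (history[-i] - mu) for i, a in enumerate(alpha, start=1))
    r = transition(history[-d], mu, gamma)
    return mu + linear_part * (1.0 - r)

def simulate_estar(n, alpha, mu, gamma, sigma, d=1, seed=0):
    """Simulate n observations, starting all required lags at the mean mu."""
    rng = random.Random(seed)
    burn = max(len(alpha), d)
    y = [mu] * burn
    for _ in range(n):
        y.append(estar_mean(y, alpha, mu, gamma, d) + rng.gauss(0.0, sigma))
    return y[burn:]

# Hypothetical parameters: a small deviation keeps most of its size,
# while a large deviation is pulled almost entirely back to the mean.
print(estar_mean([0.1], alpha=[0.98], mu=0.0, gamma=1.0))
print(estar_mean([3.0], alpha=[0.98], mu=0.0, gamma=1.0))
```

The two printed values illustrate the property noted above: under constraint (19), larger deviations from the mean imply faster mean reversion.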

The modeling procedure begins by specifying a linear AR model of order $p$ by inspecting the partial autocorrelation function and then adding additional lags as needed in order to fail to reject the null hypothesis of no autocorrelation of the residuals up to order 12 using the Breusch–Godfrey test. van Dijk et al. (2002) suggest that the candidate transition variable that rejects linearity most strongly using the test in (17) should be chosen as the transition variable to be employed in the estimation of the

4 Following van Dijk et al. (2002) the parameter $\gamma$ was divided by $\hat{\sigma}_{y_{t-d}}$, the sample standard deviation of the transition variable, to make it approximately scale free. In addition, the starting values for $\gamma$ and $(\alpha_0, \alpha_1, \ldots, \alpha_p)$ were found by performing a grid search over a wide range of $\gamma$ while constraining $\mu = \bar{y}_t$, the sample mean of the real exchange rate, and then concentrating the sum of squares function.


Table 4. Linear half lives (hL).

| Country | AR(1) | hL (years) | P(h < hL) |
|---|---|---|---|
| France | 0.976 | 2.33 | 0.22 |
| Germany | 0.978 | 2.58 | 0.27 |
| Italy | 0.978 | 2.63 | 0.15 |
| Japan | 0.982 | 3.17 | 0.04 |
| UK | 0.979 | 2.77 | 0.39 |

Notes: AR(1) refers to the first order autoregressive coefficient.


model. To avoid possible data-mining issues, this paper uses an approach closer to that of Hansen (1997) who, in the context of modeling TAR processes, suggests selecting the transition variable that minimizes the sum of squared errors based on the estimated TAR model. This paper will select the transition variable by estimating equation (17) for $d = 1, 2, \ldots, 12$ and choosing the one which has the smallest sum of squared errors.⁵ The STAR model with $p$ lags and $y_{t-d}$ as the transition variable is then estimated using NLS.
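The delay selection rule just described can be sketched as follows. This is only an illustration under stated assumptions: the helper names `aux_sse` and `select_delay` are hypothetical, the regression is fitted by ordinary least squares with NumPy, and the first `max_d` observations are held back for every candidate delay so that each regression uses the same sample.

```python
import numpy as np

def aux_sse(y, p, d, max_d):
    """Sum of squared errors of auxiliary regression (17) with z_t = y_{t-d}."""
    y = np.asarray(y, dtype=float)
    start = max(p, max_d)   # same initial observations reserved for every d
    target = y[start:]
    n = target.size
    lags = [y[start - i: len(y) - i] for i in range(1, p + 1)]
    x = np.column_stack([np.ones(n)] + lags)
    z = y[start - d: len(y) - d]
    design = np.column_stack([x, x * z[:, None], x * (z ** 2)[:, None]])
    # Duplicate columns (which arise when d <= p) do not change the fitted
    # values of a least-squares fit, so the SSE is still well defined.
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

def select_delay(y, p, max_d=12):
    """Return the delay with the smallest SSE and the SSE for every candidate."""
    sse_by_d = {d: aux_sse(y, p, d, max_d) for d in range(1, max_d + 1)}
    best = min(sse_by_d, key=sse_by_d.get)
    return best, sse_by_d

# Hypothetical usage on a simulated persistent series.
rng = np.random.default_rng(1)
demo = np.cumsum(rng.normal(scale=0.03, size=240))
best_d, sse_table = select_delay(demo, p=2)
print(best_d)
```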

5.4. Data

This paper uses monthly real exchange rate data for five US based country pairs, using data from France, Germany, Italy, Japan, and the United Kingdom (UK) over the post-Bretton Woods period from 1973:1 to 1998:12 for France, Germany, and Italy and from 1973:1 to 2007:12 for Japan and the UK. The data were obtained from the International Monetary Fund’s online International Financial Statistics database.⁶ The real exchange rates ($y_t$) used were constructed as

$$y_t \equiv s_t + p_t - p_t^*$$

where $s_t$ is the logarithm of the nominal exchange rate (foreign price of domestic currency), $p_t$ is the logarithm of the domestic consumer price level, and $p_t^*$ is the logarithm of the foreign consumer price level. The US is treated as the domestic country.
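The construction of the real exchange rate can be written in a few lines. The sketch below assumes natural logarithms and uses made-up numbers purely for illustration; the function name is hypothetical and no IFS data are reproduced here.

```python
import math

def log_real_exchange_rate(nominal, cpi_domestic, cpi_foreign):
    """Return y_t = s_t + p_t - p_t^* for each period, with all terms in natural logs.

    nominal      : nominal exchange rate (foreign price of domestic currency)
    cpi_domestic : domestic consumer price level
    cpi_foreign  : foreign consumer price level
    """
    return [
        math.log(s) + math.log(p) - math.log(p_star)
        for s, p, p_star in zip(nominal, cpi_domestic, cpi_foreign)
    ]

# Made-up illustrative values, not actual data.
y = log_real_exchange_rate(
    nominal=[5.0, 5.2, 5.1],
    cpi_domestic=[100.0, 100.4, 100.9],
    cpi_foreign=[100.0, 100.6, 101.3],
)
print(y)
```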

5.5. The distribution of half lives in real exchange rates

Table 1 presents the results of delay parameter selection, lag selection, heteroskedasticity tests, and tests for nonlinearity for the five exchange rates using the procedures described previously. In each case there is evidence of nonlinearity. The results of the estimation of the exponential STAR models, as specified in (20), are presented in Table 2. Half lives were calculated according to the procedure given in Section 4.⁷ The half lives for each shock are plotted against the value of the associated shocks in Figs. 1–5. These figures illustrate the range of half lives that are determined by the distance of the exchange rate from the estimated mean. Next, 10,000 draws with replacement were taken from the set of possible shocks and the associated half lives. This sample formed the basis for the estimation of the distribution of the half lives. Graphs of the empirical CDFs are presented in Figs. 6–10. The half life PDF for each exchange rate was also estimated using a kernel smoothing function with the normal kernel and a bandwidth that is optimal for estimating normal densities. The estimated PDFs are plotted in Figs. 11–15.
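The resampling and density estimation steps can be sketched in plain Python as follows. The sketch assumes that the "bandwidth that is optimal for estimating normal densities" is the usual normal reference rule $h = \hat{\sigma} (4/(3n))^{1/5}$; the function names and the short list of half lives are hypothetical and do not come from the estimates in this paper.

```python
import math
import random
import statistics

def resample(values, n_draws=10_000, seed=0):
    """Draw n_draws values with replacement."""
    rng = random.Random(seed)
    return rng.choices(values, k=n_draws)

def normal_reference_bandwidth(sample):
    """Assumed rule: h = sd * (4 / (3 n)) ** (1/5)."""
    n = len(sample)
    return statistics.stdev(sample) * (4.0 / (3.0 * n)) ** 0.2

def gaussian_kde(sample, grid, bandwidth=None):
    """Kernel density estimate with a normal kernel, evaluated on grid."""
    h = bandwidth if bandwidth is not None else normal_reference_bandwidth(sample)
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)
        for x in grid
    ]

def empirical_cdf(sample, x):
    """Share of the sample strictly below x, as in P(h < x)."""
    return sum(1 for s in sample if s < x) / len(sample)

# Hypothetical half lives (in years) attached to a handful of shocks.
half_lives = [2.0, 3.0, 3.5, 4.0, 5.5]
draws = resample(half_lives, n_draws=1000)
grid = [-5.0 + 0.05 * k for k in range(401)]
pdf = gaussian_kde(draws, grid)
print(empirical_cdf(draws, 3.0))
```

Statistics such as those in Table 3 correspond to evaluating `empirical_cdf` at 5, 3, 2 and 1 years.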

Descriptive statistics of the estimated distributions are given in Table 3. Comparing the estimated half life distributions in this paper to the three- to five-year benchmark indicates that a large proportion of observed half lives are smaller than five years. The average percentage of half lives

5 The first 12 observations were reserved in each case so that the sample size remained constant in each regression.

6 The German CPI was taken from Datastream.

7 The value of s in expression (9) used to generate the initial conditions was $\hat{\sigma}_{y_t}/1000$, where $\hat{\sigma}_{y_t}$ is the sample standard deviation of the real exchange rate.


Fig. 16. Autoregressive Function - Japan. Horizontal axis: $y_t$; vertical axis: Autoregressive Function; series: Linear, ESTAR.


smaller than three years is about 30%. These results suggest that NMR does produce half lives that are below the consensus benchmark with non-trivial frequency. Even half lives less than two years occur between 10% and 16% of the time in four of the five exchange rates. The estimation results suggest that Japan has the slowest rate of mean reversion. In the case of Japan, only 38% of the observed half lives are below five years and 2% are below three years. The estimated half life distributions show that half lives that are shorter than a year are almost never encountered in any of the real exchange rates studied. Interestingly, the mean and median values of the five half life distributions are all within or close to the three- to five-year range.

A simple AR(1) model was also estimated for each real exchange rate, and the associated linear half life was calculated. Table 4 provides the percentage of nonlinear half lives that are less than the linear half life. The average of these percentages among the five country pairs is just over 20%. In other words, on average, the STAR model produces half lives that are shorter than those of a simple linear model more than one fifth of the time.⁸ While this is not a large percentage, it should be remembered that in the context of NMR small half lives are expected only when the deviations are large from an economic standpoint, and such deviations may not occur often.
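For reference, the linear half life of an AR(1) process can be computed as in the sketch below. It assumes the standard formula in which the half life $h$ solves $\rho^h = 0.5$, with monthly periods converted to years; the function name is hypothetical, and the inputs are the rounded coefficients reported in Table 4.

```python
import math

def ar1_half_life_years(rho, periods_per_year=12):
    """Half life h solves rho**h = 0.5; the result is converted from periods to years."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(rho) / periods_per_year

# Rounded AR(1) coefficients as reported in Table 4.
table4_ar1 = {"France": 0.976, "Germany": 0.978, "Italy": 0.978,
              "Japan": 0.982, "UK": 0.979}
for country, rho in table4_ar1.items():
    print(country, round(ar1_half_life_years(rho), 2))
```

With the rounded French coefficient of 0.976 this gives about 2.38 years, close to but not identical to the 2.33 years in Table 4. The gap is presumably due to the three-decimal rounding of the reported coefficients, to which the half life is very sensitive when the coefficient is near one.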

In the case of Japan, half lives produced by NMR that are less than the linear half life are only observed 4% of the time. This is dramatically less than for the other four real exchange rates. Comparing the autoregressive functions of the linear and nonlinear models points to a likely cause of this irregularity. The autoregressive function (AF) is defined as

$$AF(y_{t-d}) = \sum_{i=1}^{p} \alpha_i \left[ 1 - R(y_{t-d}; \mu, \gamma) \right] \tag{21}$$

where the $\alpha_i$’s are the estimated autoregressive coefficients from the linear model and the estimated coefficients from the model in (20) in the nonlinear case. In the linear model, because the transition

8 The nonlinear half lives were also compared to half lives based upon median unbiased estimates of the AR(1) coefficient. The vast majority of observed half lives were less than the median unbiased estimate of the half life in all cases. As Paya and Peel (2006) have discovered, the ESTAR model also suffers from an estimation bias which produces speeds of mean reversion which are too fast. Because there is currently no method to overcome this bias in the ESTAR estimation process, the biased AR(1) coefficients were used to form a more appropriate comparison between the linear and nonlinear half lives.


function is identically equal to zero, AF is equal to the sum of the individual coefficients. In both cases AF indicates the speed of mean reversion at any given point in time. The linear and nonlinear autoregressive functions are plotted together in Fig. 16. In the case of Japan, it appears that there is a small number of observations which are associated with relatively fast speeds of mean reversion. The linear AF is most likely being pulled down by these influential observations. Dropping the six most extreme values of $y_{t-1}$ in a simple AR(1) regression results in a half life of 4.39 years, which is a dramatic increase compared with the half life from the complete time series. The proportion of half lives less than this value is 26.4%, which is in line with the values for the other real exchange rates. In this sense the linear half life for Japan is “too small” given that it is being strongly influenced by a small number of extreme observations.
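The comparison between the linear and nonlinear autoregressive functions in (21) can be illustrated with the sketch below. It again assumes the standard exponential form for the transition function in (14), and the coefficient values are hypothetical rather than the estimates reported in Table 2.

```python
import math

def autoregressive_function(z, alpha, mu=0.0, gamma=0.0):
    """AF(y_{t-d}) = sum_i alpha_i * [1 - R(y_{t-d}; mu, gamma)].

    With gamma = 0 the assumed transition function is identically zero,
    so AF reduces to the sum of the coefficients (the linear case).
    """
    r = 1.0 - math.exp(-gamma * (z - mu) ** 2)   # assumed form of (14)
    return sum(alpha) * (1.0 - r)

# Hypothetical coefficients: the linear AF is flat, while the ESTAR AF
# falls as the transition variable moves away from the mean.
for z in (0.0, 0.2, 0.4):
    linear_af = autoregressive_function(z, alpha=[0.98])
    estar_af = autoregressive_function(z, alpha=[0.99], mu=0.0, gamma=0.3)
    print(z, linear_af, estar_af)
```

A lower value of AF corresponds to faster mean reversion, which is why a few extreme observations can pull the flat linear AF below the ESTAR AF over most of the sample range.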

6. Conclusion

This paper addressed whether or not NMR, as specified with the exponential STAR model, presents a resolution to the PPP puzzle. Despite the complications associated with implementing impulse response analysis in a nonlinear setting, it was shown that it is feasible to generate a distribution of observed half lives. The results from the estimated half life distributions for the five real exchange rates used in this paper suggest that NMR produces half lives shorter than the three- to five-year consensus a meaningful percentage of the time. Half lives less than five years occur very frequently, while half lives less than three years occur around 30% of the time. Very short half lives of two years or less are less frequent. The analysis set forth in this paper thus supports the proposition that NMR is a resolution to the PPP puzzle.

Acknowledgements

I thank Tim Vogelsang, Yongmiao Hong, and Asaf Zussman for useful discussions. I am also indebted to the useful comments from an anonymous referee and the editor. This paper was written as part of my Ph.D. dissertation.

Appendix: Monte Carlo integration method of calculating nonlinear impulse response functions

With the $p$ initial conditions set to zero, the estimated model is used to generate observations based on innovations distributed as a mean zero normal distribution with variance $\hat{\sigma}^2$, where $\hat{\sigma}^2$ is the estimated variance of the error term. After the first 500 observations are generated, each observation produced which satisfies the requirement in (9), denoted $y_t^*$, is saved along with its $p$ lags. After 10,000 such observations have been found, no additional data are generated. These 10,000 observations and their lags form the basis for the initial conditions $(y_{-p+1}, \ldots, y_0)$ used to calculate the impulse response function. For each set of initial conditions, two time series are generated from the initial conditions $(y_{-p+1}, \ldots, y_0)$ and $(y_{-p+1}, \ldots, y_0 + \delta)$, where $\delta$ is the shock used. The innovations again are distributed as a mean zero normal distribution with variance $\hat{\sigma}^2$. The average difference between these two series among the 10,000 replications is taken as the impulse response function. Gallant et al. (1993) propose that by the law of large numbers this procedure should produce a result very close to that which would be obtained by using the analytical solution.
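The averaging step of this procedure can be sketched as follows. Several points are assumptions rather than details taken from the text: the exponential form of the transition function in (14), the use of the same innovations for the shocked and baseline series, and the half life convention (the first horizon at which the response is at most half its initial size). The selection of initial conditions through requirement (9) is not reproduced; the initial conditions are simply passed in, and all parameter values are hypothetical.

```python
import math
import random

def estar_step(history, alpha, mu, gamma, d):
    """One-step conditional mean of model (20) under the assumed form of (14)."""
    lin = sum(a * (history[-i] - mu) for i, a in enumerate(alpha, start=1))
    r = 1.0 - math.exp(-gamma * (history[-d] - mu) ** 2)
    return mu + lin * (1.0 - r)

def impulse_response(initial_conditions, shock, alpha, mu, gamma, sigma,
                     d=1, horizon=120, seed=0):
    """Average difference between shocked and baseline paths."""
    rng = random.Random(seed)
    totals = [0.0] * (horizon + 1)
    for init in initial_conditions:
        base = list(init)
        hit = list(init)
        hit[-1] += shock                       # (y_{-p+1}, ..., y_0 + delta)
        totals[0] += hit[-1] - base[-1]
        for h in range(1, horizon + 1):
            e = rng.gauss(0.0, sigma)          # assumption: common innovation
            base.append(estar_step(base, alpha, mu, gamma, d) + e)
            hit.append(estar_step(hit, alpha, mu, gamma, d) + e)
            totals[h] += hit[-1] - base[-1]
    return [t / len(initial_conditions) for t in totals]

def half_life(irf):
    """First horizon at which |response| is at most half of |initial response|."""
    target = 0.5 * abs(irf[0])
    for h, value in enumerate(irf):
        if abs(value) <= target:
            return h
    return None   # response did not halve within the horizon

# Hypothetical check: with gamma = 0 the model is linear and the response
# to a unit shock decays geometrically at rate alpha.
inits = [[0.0] for _ in range(50)]
linear_irf = impulse_response(inits, 1.0, [0.8], 0.0, 0.0, 0.01, horizon=12)
nonlinear_irf = impulse_response(inits, 2.0, [0.8], 0.0, 1.0, 0.01, horizon=12)
print(half_life(linear_irf), half_life(nonlinear_irf))
```

Half lives measured in periods would be divided by 12 to express them in years for monthly data.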

References

Baum, C.F., Barkoulas, J.T., Caglayan, M., 2001. Nonlinear adjustment to purchasing power parity in the post-Bretton Woods era. Journal of International Money and Finance 20, 379–399.

Escribano, A., Jorda, O., 1999. Improved testing and specification of smooth transition regression models. In: Rothman, P. (Ed.), Nonlinear Time Series Analysis of Economic and Financial Data.

Gallant, A.R., Rossi, P.E., Tauchen, G., 1993. Nonlinear dynamic structures. Econometrica 61, 871–908.

Granger, C.W.J., Terasvirta, T., 1993. Modelling Nonlinear Economic Relationships. Oxford University Press.

Hansen, B., 1997. Inference in TAR models. Studies in Nonlinear Dynamics and Econometrics 2, 1–14.

Kapetanios, G., Shin, Y., Snell, A., 2003. Testing for a unit root in the nonlinear STAR framework. Journal of Econometrics 112, 359–370.

Koop, G., Pesaran, M.H., Potter, S.M., 1996. Impulse response analysis in nonlinear multivariate models. Journal of Econometrics 74, 119–147.


Lothian, J.R., Taylor, M.P., 2008. Real exchange rates over the past two centuries: how important is the Harrod–Balassa–Samuelson effect? The Economic Journal 118, 1742–1763.

Michael, P., Nobay, R.A., Peel, D.A., 1997. Transaction costs and nonlinear adjustment in real exchange rates: an empirical investigation. Journal of Political Economy 105 (4), 862–879.

Murray, C.J., Papell, D.H., 2002. The purchasing power parity persistence paradigm. Journal of International Economics 56, 1–19.

Paya, I., Peel, D., 2006. On the speed of adjustment in ESTAR models when allowance is made for bias in estimation. Economics Letters 90, 272.

Rogoff, K., 1996. The purchasing power parity puzzle. Journal of Economic Literature 34 (2), 647–668.

Saikkonen, P., Luukkonen, R., 1988. Lagrange multiplier tests for testing non-linearities in time series models. Scandinavian Journal of Statistics 15, 55–68.

Sarno, L., Taylor, M.P., Chowdhury, I., 2004. Nonlinear dynamics in deviations from the law of one price: a broad-based empirical study. Journal of International Money and Finance 23 (1), 1.

Sarno, L., Valente, G., 2006. Deviations from purchasing power parity under different exchange rate regimes: do they revert and, if so, how? Journal of Banking and Finance 30 (11), 3147–3169.

Taylor, M., Peel, D., Sarno, L., 2001. Nonlinear mean-reversion in real exchange rates: towards a solution to the purchasing power parity puzzles. International Economic Review 42, 1015–1042.

Taylor, M.P., Peel, D.A., 2000. Nonlinear adjustment, long-run equilibrium and exchange rate fundamentals. Journal of International Money and Finance 19 (1), 33–53.

van Dijk, D., Terasvirta, T., Franses, P.H., 2002. Smooth transition autoregressive models – a survey of recent developments. Econometric Reviews 21 (1), 1–47.