8/20/2019 Jory Namrata Devi 2012 (1)
A Comparison of Neural Network, ARIMA and GARCH Forecasting of Exchange Rate Data
Namrata Devi Jory
UNIVERSITY OF MAURITIUS
Dissertation submitted to the Department of Mathematics, Faculty of Science,
University of Mauritius, as a partial fulfilment of the requirement for the degree
of BSc (Hons) Mathematics with Computer Science.
April 2012
Contents

List of Figures
List of Tables
Acknowledgements
Abstract

1 Introduction
  1.1 Forecasting Models
  1.2 Aims and Organization of the Project

2 Artificial Neural Networks
  2.1 Introduction
  2.2 Feedforward Network
  2.3 Backpropagation Learning Algorithm
  2.4 Training set and Testing set

3 Linear and Non-linear Time Series
  3.1 Stationary Processes
    3.1.1 Autoregressive Process
    3.1.2 Moving Average Process
    3.1.3 Random Walk
  3.2 ARIMA(p, d, q) Processes
    3.2.1 Autocorrelation and Partial Autocorrelation Functions
  3.3 GARCH Model
    3.3.1 ARCH Model
    3.3.2 GARCH Model

4 Analysis of Exchange Rate Data
  4.1 Analysis of Period I
    4.1.1 Analysis of Period II data
  4.2 Forecasting Accuracy
  4.3 Fitting Neural Network Model to the Exchange Rates Data
  4.4 Fitting ARIMA Model to the Foreign Exchange Rates Data
  4.5 Fitting the GARCH Model
  4.6 Empirical Findings
    4.6.1 Forecast performance of Period I
    4.6.2 Results and Discussions for Period I
    4.6.3 Forecast performance of Period II
    4.6.4 Results and Discussions for Period II

5 Conclusion

Bibliography
List of Figures

2.1 Three-layer feedforward network
2.2 The neuron weight adjustment
4.1 Daily MUR/USD data and its corresponding first differences of the logs
4.2 Daily MUR/EU data and its corresponding first differences of the logs
4.3 Daily MUR/GBP data and its corresponding first differences of the logs
4.4 Histogram and kernel density estimate of daily log return of MUR/USD
4.5 Histogram and kernel density estimate of daily log return of MUR/GBP
4.6 Histogram and kernel density estimate of daily log return of MUR/EU
4.7 Scatter plot of y_t against y_{t−1} for the MUR/USD log return data
4.8 Scatter plot of y_t against y_{t−1} for the MUR/EU log return data
4.9 Scatter plot of y_t against y_{t−1} for the MUR/GBP log return data
4.10 Daily MUR/USD, Jan 03 - Dec 11
4.11 Daily MUR/EU, Jan 03 - Dec 11
4.12 Daily MUR/GBP, Jan 02 - Dec 11
4.13 In-sample and out-of-sample forecast for MUR/USD
4.14 In-sample and out-of-sample forecast for MUR/EU
4.15 In-sample and out-of-sample forecast for MUR/GBP
List of Tables

3.1 Behaviour of ACF and PACF
4.1 Summary statistics for the daily exchange rates: log first difference
4.2 Summary statistics of log first difference daily exchange rate
4.3 In-sample performance of MUR/USD for data Jan 2003-Dec 2008
4.4 In-sample performance of MUR/EU for data Jan 2003-Dec 2008
4.5 In-sample performance of MUR/GBP for data Jan 2002-Dec 2008
4.6 Out-of-sample performance of MUR/USD for data Jan 2003-Dec 2008
4.7 Out-of-sample performance of MUR/EU for data Jan 2003-Dec 2008
4.8 Out-of-sample performance of MUR/GBP for data Jan 2002-Dec 2008
4.9 First 10 forecast values of MUR/GBP for the ANN, ARIMA and GARCH models
4.10 In-sample performance, MUR/USD data
4.11 In-sample performance, MUR/EU data
4.12 In-sample performance, MUR/GBP data
4.13 Out-of-sample performance, MUR/USD data
4.14 Out-of-sample performance, MUR/EU data
4.15 Out-of-sample performance, MUR/GBP data
4.16 In-sample and out-of-sample forecasts of MUR/USD
4.17 In-sample and out-of-sample forecasts of MUR/EU
4.18 In-sample and out-of-sample forecasts of MUR/GBP
Acknowledgements
I am deeply indebted to my supervisor, Professor Muddun Bhuruth, for his excep-
tional guidance and encouragement which helped me to carry out this research and write
this dissertation. It was a wonderful experience to work with such a mentor and I am very
grateful for all the criticism that made my work better each time. I am also thankful to my
co-supervisor Associate Professor Ravindra Boojhawon who has helped and motivated me
throughout the project.
I am very grateful to my parents for their love and for always being by my side in good and bad times, and special thanks to my dearest sister for her understanding and help. I
would like to thank my friends Khush, Prish and Bho for always being there for me.
I also extend my gratitude to all those who helped me directly or indirectly to make this
work a success.
Finally, I thank GOD for his blessings from the bottom of my heart.
Abstract
We consider linear and nonlinear models for forecasting exchange rates of the Mauritian Rupee against the US Dollar, the Euro and the British Pound. The linear models considered are the ARIMA processes, and the nonlinear models considered are Artificial Neural Networks and the GARCH model. Since no guidelines were available for choosing the parameters of the neural network, they were chosen through extensive experimentation. Two periods of analysis were carried out, the first from January 2002 to December 2008 and the second from January 2002 to December 2011, and in-sample and out-of-sample forecasts were produced. The reason for this choice is that we wanted to test the ability of our forecasting procedures during the financial crisis.
Using three forecast evaluation criteria, RMSE, MAE and MAPE, we found that the ARMA-GARCH model performs slightly better than the ARIMA model. These two models give better performance than the ANN model for the in-sample forecasts of the first-period data. However, the ANN was found to outperform the ARIMA and ARMA-GARCH models in out-of-sample forecasting for both periods.
Chapter 1
Introduction
Foreign exchange rates are among the most important determinants of the economic health
of a country. They describe the price of one currency in terms of another one and they
play a vital role in the trading relationship between countries, which in turn affect the
world economy. For this reason, they are the most watched, analysed and governmen-
tally manipulated economic measures. Understanding the evolution of exchange rates is
important for many essential issues in international economics and finance, such as inter-
national trade, capital flows, international portfolio management, currency option pricing,
and foreign exchange risk management. The foreign exchange market has experienced
many unexpected periods of growth and decline over the last few decades. The dynamics of the
exchange market depend entirely on the exchange rates. Thus the accurate prediction
of exchange rates is a crucial factor for the success of many businesses in the global
market. The exchange market is itself well known for being extremely unpredictable
and volatile. A volatile exchange market makes international trade and investment decisions
more difficult, because volatility increases exchange rate risk, which may result in a
potential loss due to a change in the rates.
Forecasting any time series accurately is very difficult and, on top of that, exchange
rate prediction is one of the most challenging applications of modern time series forecasting.
Exchange rates are generally noisy, non-stationary and deterministically chaotic
(Yaser & Atiya 1996), which suggests that no past behaviour information can
produce a relation between the past and the future behaviour. However, despite all the
constraints mentioned, numerous techniques have been devised by researchers to forecast
the exchange rates, and the search for a reliable model to predict them is still
ongoing.
1.1 Forecasting Models
Forecasting foreign exchange rates is a very important issue in the economic world. Over
the past decades, various models have been developed within linear and nonlinear frameworks.
A linear model predicts future exchange rates by identifying and extrapolating
the linear structure in the data. The most commonly used
linear models for exchange rate forecasting are the Box-Jenkins Autoregressive Integrated
Moving Average (ARIMA) models. One attractive feature of the Box-Jenkins
approach to forecasting is its very rich class of possible models, which usually makes it
possible to find a process that provides an adequate description of the data. The ARIMA
model is also a powerful instrument for constructing accurate short-horizon forecasts.
Because it is extremely popular, this model is often used as a benchmark
to evaluate new modelling approaches. ARIMA models are very effective forecasting
techniques when the dynamics of the time series are linear and stationary (Cao & Tay 2001).
However, there is considerable evidence of nonlinearities in exchange rates. Approximation
by ARIMA models may therefore be inadequate, since they cannot capture nonlinear patterns
in the exchange rate data.
Ever since the inadequacy of linear models was observed (Racine 2001), there has
been considerable development in modelling nonlinearities. Nonlinear models such
as the Autoregressive Conditional Heteroscedasticity (ARCH) model and the Generalized
Autoregressive Conditional Heteroscedasticity (GARCH) model were developed. They
have been found useful in capturing certain nonlinear features of financial time series,
such as clusters of outliers. (Bollerslev & Ghysels 1996) showed successful applications
of the GARCH model in describing the dynamic behaviour of exchange rates. However,
there exists no unified theory that applies to all such nonlinear models, as
each requires specific assumptions about the precise form of the nonlinearity. Moreover,
there are many possible nonlinear patterns in a particular data set, so a single
nonlinear model may not be sufficient to capture all of them.
In response to all the constraints of the linear and nonlinear models, artificial neural
networks (ANNs) have been used to forecast exchange rates. ANNs are modelled on, and
operate in a similar way to, biological neural systems. Due to their non-parametric,
assumption-free, noise-tolerant and adaptive properties (Haoffi & Han 2007), ANNs can
deal better with non-stationary and volatile data. ANNs provide flexible nonlinear function
mappings which can approximate any continuous measurable function to any desired
accuracy. They have been found to outperform various traditional linear and
nonlinear models in exchange rate forecasting. Many researchers, such as (Wang
& Leu 1996), (Tang & Fishwich 1993) and (T. Hill & Remus 1996), have shown that ANNs
perform better than linear ARIMA models, especially for more irregular series
and for multiple-period-ahead forecasting. (Gencay 1999) and (R. K. Bissoondeeal &
Mootanah 2008) find that forecasts generated by neural networks are superior to those of
ARIMA and GARCH models. (Panda & Narasimhan 2007) showed that ANNs outperform
the linear autoregressive model and the random walk model
for one-step-ahead prediction, suggesting that there is always a possibility of
forecasting the exchange rate. However, other studies have reported inconsistent results:
for example, (W. R. Foster & Ungar 1992) showed that ANNs are inferior to linear regression,
and (Meade 2002) finds no evidence that foreign exchange rate behaviour is better
represented by ANNs than by a linear model.
1.2 Aims and Organization of the Project
This project studies neural network methods for modelling the Mauritian Rupee (MUR)
against three major currencies: the US Dollar, the Euro and the
British Pound. We focus on various neural network models and assess their performance
for in-sample and out-of-sample forecasts, and the results are compared against forecasts
produced by ARIMA and GARCH models.
The organization of the project is as follows. In Chapter 2, we discuss how an Artificial
Neural Network works and the learning algorithm used. Chapter 3 describes linear and
nonlinear time series models. In Chapter 4, we analyse our data and compare the forecasting
accuracies of the different models.
Chapter 2
Artificial Neural Networks
2.1 Introduction
Artificial neural networks, which imitate the human brain's ability to classify patterns and
make predictions based on past experience, have found applications in areas such
as financial forecasting, medical diagnostics, flight control and product inspection.
Artificial neural networks have been widely used in applied forecasting due to their ability
to model complex relationships between input and output variables, and also because of
the presence of nonlinearities in many time series.
2.2 Feedforward Network
The feedforward neural network is the most commonly used network in applied work
due to its capability of solving a large number of problems. It consists of a large
number of simple processing units, known as neurons, which are organised in layers. A
feedforward neural network begins with an input layer which is connected to a hidden
layer. The hidden layer can then be connected to another hidden layer or directly to the
output layer. In this architecture, data enter at the input layer and pass through the network
layer by layer until they arrive at the output layer. The hidden layer is a very important
layer in the neural network, since it is responsible for approximating a continuous
function to the desired accuracy. And since a single hidden layer is less
The hidden layer vector is Z = (z_1, z_2, ..., z_k)^T.
Each hidden neuron has a bias α which is added to the weighted sum of all
the inputs to form a net input. The bias may be viewed as simply shifting the
function to the left by an amount α. It is much like a weight, except that it has a constant
input of 1.
net input = Σ_{i=1}^{n} w_{i,j} x_i    (2.2)
The net input is the argument of the transfer function f which is applied to construct the
output of a specific neuron.
z_j = f( α + Σ_{i=1}^{n} w_{i,j} x_i ),   where j = 1, 2, ..., k    (2.3)
In the output layer, the output neuron receives the weighted sum of the processed signals
obtained from the hidden layer neurons. Another function ϕ is applied to produce the
final output.
y = φ( β + Σ_{j=1}^{k} μ_j z_j )    (2.4)
where φ is the transfer function, β is the output bias, and μ_j is the weight from hidden neuron
j to the output unit.
Substituting for z_j in equation (2.4), we get
y = φ( β + Σ_{j=1}^{k} μ_j f( α + Σ_{i=1}^{n} w_{i,j} x_i ) )    (2.5)
In our feedforward network, a hyperbolic tangent sigmoid transfer function is used in
the hidden layer and a linear transfer function is used in the output layer, because these two transfer functions have proved successful in earlier studies.
The hyperbolic tangent sigmoid is given by f(x) = tanh(x) and the linear transfer
function by f(x) = x. A node's transfer function controls the strength of the node's output
signal; the hyperbolic tangent bounds the output between −1.0 and 1.0.
It has also been observed that the hyperbolic tangent sigmoid function can accelerate
learning for some models and can affect predictive accuracy. The learning
rule commonly used in this type of network is the backpropagation algorithm.
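As an illustration, the forward pass of equation (2.5), with a tanh hidden layer and a linear output, can be sketched in NumPy. The weights below are arbitrary, hypothetical values, purely for demonstration; this is not the network configuration used in this project.

```python
import numpy as np

def feedforward(x, W, alpha, mu, beta):
    """Forward pass of equation (2.5) for a single-output network.

    x     : input vector, shape (n,)
    W     : input-to-hidden weights w_{i,j}, shape (k, n)
    alpha : hidden-layer biases, shape (k,)
    mu    : hidden-to-output weights, shape (k,)
    beta  : output bias (scalar)
    """
    z = np.tanh(alpha + W @ x)   # hidden layer, equation (2.3) with f = tanh
    return beta + mu @ z         # output layer, equation (2.4) with linear phi

# Tiny example with arbitrary (hypothetical) weights
rng = np.random.default_rng(0)
x = rng.normal(size=3)          # n = 3 inputs
W = rng.normal(size=(4, 3))     # k = 4 hidden neurons
y = feedforward(x, W, alpha=np.zeros(4), mu=np.ones(4), beta=0.0)
```

Because tanh is bounded, the hidden activations always lie in (−1, 1), while the linear output unit leaves the final forecast unbounded.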
2.3 Backpropagation Learning Algorithm
Learning is implemented by modifying the weights iteratively until the desired
response is achieved at the output node. Backpropagation is the most
popular supervised learning algorithm. A supervised learning algorithm accepts
input values, computes the output values, compares them with the desired outputs,
and then adjusts the weights to minimize the deviation. This process is carried out until
the network can no longer reduce the error.
Figure 2.2: The neuron weight adjustment
Backpropagation learning updates the network weights and biases in the direction in
which the performance function decreases the most. The gradient is computed and the
weights are updated after each input in incremental mode. Backpropagation starts at
the output layer with the following equations:
w_{ij} = w'_{ij} + l · e_j · x_i    (2.6)
where, for the i-th input of the j-th neuron, w_{ij} is the updated weight, w'_{ij} is the
previous weight value, l is the learning rate, e_j is the error term and x_i is the i-th input.
The backpropagation algorithm searches the weight space for the minimum of the error
function using the method of gradient descent. In gradient descent, weights are
changed in proportion to the negative of the derivative of the error with respect to each weight.
The network thus follows the curvature of the error surface, with weight
updates moving in the direction of steepest descent. However, there is a high probability
that the network will not reach the global minimum, since it may become stuck in a local
minimum which does not represent the optimal solution.
Momentum is a technique that can help the network escape local minima. It is an
extension of the backpropagation algorithm which can speed up convergence
and help avoid local minima. With momentum, if the weights are moving in a
particular direction, they tend to continue in that direction; momentum also smooths
the weight changes. The momentum factor determines the effect of past weight changes
on the current change and increases the effective speed of learning;
the value commonly used is close to 1. The ratio
which influences the speed and quality of learning is called the learning rate. The learning
rate plays a very important role in the learning process of a network, as it controls the size
of the weight changes at each iteration. The right choice of learning rate is therefore
important, since too small or too large a change in the weights will affect
the result of the network. A learning rate between 0.05 and 0.5 was found to provide good
results in many practical cases.
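The gradient-descent-with-momentum update described above can be sketched as follows. This is a minimal illustration on a one-dimensional quadratic error; the learning rate and momentum factor shown are assumed example values, not those used in this project.

```python
def sgd_momentum_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One weight update in the direction of steepest descent, with a
    momentum term that carries over a fraction of the previous change."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# Minimise the quadratic error E(w) = w^2 (gradient 2w), starting from w = 5
w, v = 5.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, 2.0 * w, v)
# after 200 iterations w has been driven close to the minimum at 0
```

The velocity term is what lets the update "coast" through small bumps in the error surface, which is why momentum can carry the weights past shallow local minima.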
2.4 Training set and Testing set
In neural network forecasting, we divide our data into a training set and a test set. The
training set consists of the input data and the target data which need to be presented to the
network. However, some transfer functions need the input and target data to be scaled so
that they fall within a specified range. To meet this requirement we pre-process
our data by normalising the inputs and targets so that they fall in the interval [−1, 1].
When the input data are presented to the network, the network makes a guess at the correct
answer and compares it with the target data. The network goes through the data again
and again, depending on the number of epochs used, adjusting the weight values so as to
approach the target values. The training set is used to build the model, whereas
the test set, which is independent of the training set, is used to measure the performance of
the model; more precisely, it is used to evaluate the out-of-sample performance. After the
forecasts are obtained from the network, the data must be converted back to the original
scale by a process called post-processing. We note from previous studies that
there is no precise rule on the optimum sizes of the two data sets.
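The pre-processing and post-processing steps can be sketched as follows: a minimal min-max scaling to [−1, 1] and its inverse. The exchange rate levels shown are hypothetical, and the exact scaling used by the software in this project may differ.

```python
import numpy as np

def scale_to_range(x, lo=-1.0, hi=1.0):
    """Pre-processing: linearly map the data into [lo, hi] and return the
    parameters needed to undo the mapping later."""
    xmin, xmax = x.min(), x.max()
    scaled = lo + (hi - lo) * (x - xmin) / (xmax - xmin)
    return scaled, (xmin, xmax)

def unscale(scaled, params, lo=-1.0, hi=1.0):
    """Post-processing: map network outputs back to the original scale."""
    xmin, xmax = params
    return xmin + (scaled - lo) * (xmax - xmin) / (hi - lo)

rates = np.array([26.5, 27.1, 28.0, 30.2, 32.4])   # hypothetical MUR/USD levels
s, p = scale_to_range(rates)                        # s lies in [-1, 1]
recovered = unscale(s, p)                           # equals rates again
```

Keeping the (xmin, xmax) parameters from the training set is important: the same parameters must be reused to scale the test inputs and to unscale the forecasts.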
Chapter 3
Linear and Non-linear Time Series
In this chapter we consider the widely used time series models for forecasting. We first
briefly describe the Autoregressive Integrated Moving Average (ARIMA) model which
is a linear model and we then consider the Generalized Autoregressive Conditional Het-
eroscedasticity (GARCH) model as the non-linear model.
3.1 Stationary Processes
A stochastic process {X_t : t ∈ Z} is a family of random variables defined on a probability space.
The joint cumulative distribution function of X_t is defined as
F_{t_1,t_2,...,t_n}(x_1, x_2, ..., x_n) = P(X_{t_1} ≤ x_1, X_{t_2} ≤ x_2, ..., X_{t_n} ≤ x_n)    (3.1)
and the process X_t is said to be strictly stationary if
F_{t_1+s,t_2+s,...,t_n+s}(x_1, x_2, ..., x_n) = F_{t_1,t_2,...,t_n}(x_1, x_2, ..., x_n)    (3.2)
for all n-tuples (x_1, x_2, ..., x_n), (t_1, t_2, ..., t_n) and for any s.
The mean function of X_t is given by
μ_t = E(X_t) = ∫ x dF_t(x)    (3.3)
and the autocovariance function at lag τ is given by
γ(t, τ) = E[(X_t − μ_t)(X_{t−τ} − μ_{t−τ})]    (3.4)
The process is said to be weakly stationary if μ_t = μ for all t and γ(t, τ) = γ_τ. In this case we
have γ_{−τ} = γ_τ.
A white noise process ε_t is a process such that μ = 0, γ_0 = σ² and γ_τ = 0 for τ ≠ 0. We
denote such a process by ε_t ∼ WN(0, σ²).
3.1.1 Autoregressive process
The autoregressive process of order p is denoted by AR(p) and is defined as follows:
X_t = Σ_{r=1}^{p} φ_r X_{t−r} + ε_t    (3.5)
where ε_t ∼ WN(0, σ²); equivalently φ(B) X_t = ε_t, where B is the backshift operator and
φ(B) = 1 − φ_1 B − φ_2 B² − ... − φ_p B^p    (3.6)
The process is invertible since Σ_{r=1}^{p} |φ_r| < ∞, and for the process to be stationary the roots
of φ(B) = 0 must lie outside the unit circle.
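The stationarity condition can be checked numerically by finding the roots of the AR polynomial; a sketch, where the AR coefficients shown are hypothetical examples:

```python
import numpy as np

def ar_is_stationary(phi):
    """Check that all roots of phi(B) = 1 - phi_1 B - ... - phi_p B^p
    lie outside the unit circle."""
    # numpy.roots expects coefficients from the highest power of B down:
    # -phi_p, ..., -phi_1, 1
    coeffs = np.r_[-np.asarray(phi, dtype=float)[::-1], 1.0]
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

# AR(1) with phi_1 = 0.5 is stationary; phi_1 = 1 (a random walk) is not
```

For an AR(1), the condition reduces to |φ_1| < 1, since the single root of 1 − φ_1 B is B = 1/φ_1.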
3.1.2 Moving Average Process
The moving average process of order q is denoted by MA(q) and is defined by
X_t = Σ_{s=0}^{q} θ_s ε_{t−s}    (3.7)
or the process can also be written as X_t = θ(B) ε_t, where
θ(B) = 1 − θ_1 B − θ_2 B² − ... − θ_q B^q    (3.8)
Because MA processes consist of a finite sum of stationary white noise terms, they are
stationary and have mean zero. The process is invertible when the roots of θ(B) = 0 all
exceed unity in absolute value.
3.1.3 Random Walk
In a random walk model at each point of time, the series moves randomly away from its
current position. The model can then be written as
X_t = X_{t−1} + ε_t    (3.9)
We see that the random walk model has the same form as an AR(1) process, but, since
φ = 1, it is not stationary.
Repeatedly substituting for past values gives
X_t = X_0 + Σ_{j=0}^{t−1} ε_{t−j}    (3.10)
We find that the first difference of the random walk is stationary, as it is just white noise:
X_t − X_{t−1} = ε_t    (3.11)
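A quick numerical check of equations (3.9) and (3.11): simulate a random walk as a cumulative sum of white noise and recover that noise by first differencing.

```python
import numpy as np

rng = np.random.default_rng(42)
eps = rng.normal(0.0, 1.0, size=1000)   # white noise, WN(0, 1)
x = np.cumsum(eps)                      # random walk: x_t = x_{t-1} + eps_t

diff = np.diff(x)                       # first difference, equation (3.11)
# diff reproduces eps[1:], the stationary white noise driving the walk
```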
3.2 ARIMA ( p, d, q ) Processes
An ARIMA model is a combination of the Autoregressive (AR), differencing and Moving
Average (MA) processes.
If the original process X_t is not stationary, we can look at the first-order difference process
Y_t = X_t − X_{t−1}    (3.12)
or the second order differences and so on.
A general ARIMA model of order ( p, d, q ) can be represented as follows
φ(B) ∇^d X_t = θ(B) ε_t    (3.13)
where φ(B) is the AR operator of order p and θ(B) is the MA operator of order q,
X_t is the observed value at time t, ε_t ∼ WN(0, σ²), and d is the number of times the data
series must be differenced to produce a stationary time series.
Fitting an ARIMA model to raw data involves an iterative four-step cycle:
model identification, parameter estimation, diagnostic checking and forecasting.
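As a minimal illustration of the estimation step only (not the full Box-Jenkins cycle), the coefficient of a simulated AR(1), the simplest ARIMA(1, 0, 0) case, can be recovered by least squares. This is a sketch; dedicated time series libraries implement the full identification-estimation-checking cycle.

```python
import numpy as np

def fit_ar1(x):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + eps_t."""
    y, X = x[1:], x[:-1]
    return float(X @ y / (X @ X))

# Simulate an AR(1) with phi = 0.7 and recover the coefficient
rng = np.random.default_rng(1)
x = np.zeros(5000)
for t in range(1, 5000):
    x[t] = 0.7 * x[t - 1] + rng.normal()
phi_hat = fit_ar1(x)   # close to 0.7
```

A diagnostic check would then examine whether the residuals x[1:] − phi_hat * x[:-1] resemble white noise before the model is used for forecasting.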
3.2.1 Autocorrelation and Partial Autocorrelation Functions
The autocorrelation function (ACF) can be used to detect non-randomness in data and also
to identify an appropriate time series model if the data are not random.
Given the data x_1, x_2, ..., x_N at times t_1, t_2, ..., t_N, the lag-k autocorrelation function is
defined as:
γ_k = Σ_{i=1}^{N−k} (x_i − x̄)(x_{i+k} − x̄) / Σ_{i=1}^{N} (x_i − x̄)²    (3.14)
It is assumed that the observations are equally spaced, so the time variable t is not used in the
formula for the autocorrelation. The correlation is between two values of the same variable
at times t_i and t_{i+k}. When the autocorrelation is used to identify an ARIMA model, the
autocorrelations are plotted for many lags.
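Equation (3.14) translates directly into code; a sketch:

```python
import numpy as np

def acf(x, k):
    """Sample autocorrelation at lag k, following equation (3.14)."""
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    num = np.sum((x[:len(x) - k] - xbar) * (x[k:] - xbar))
    den = np.sum((x - xbar) ** 2)
    return num / den

# acf(x, 0) is 1 by construction; for white noise, acf(x, k) is near 0 for k >= 1
```

Plotting acf(x, k) for k = 1, 2, ... gives the correlogram whose cut-off or decay pattern is summarised in Table 3.1.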
The partial autocorrelation function (PACF) was introduced to time series modelling
because one can determine the appropriate lag p in an AR(p) model, or the
extended ARIMA(p, d, q) model, by simply plotting the PACF.
The PACF is the conditional correlation
Correlation(x_t, x_{t+k} | x_{t+1}, ..., x_{t+k−1})    (3.15)
Given a time series, the partial autocorrelation at lag k is the autocorrelation
between x_t and x_{t+k} with the linear dependence of x_{t+1} through x_{t+k−1} removed.
Process      ACF                                       PACF
AR(p)        Dies down exponentially or sinusoidally   Cuts off after lag p
MA(q)        Cuts off after lag q                      Dies down exponentially or sinusoidally
ARMA(p, q)   Dies down exponentially or sinusoidally   Dies down exponentially or sinusoidally

Table 3.1: Behaviour of ACF and PACF
3.3 GARCH Model
The GARCH model, a generalisation of the autoregressive conditional heteroskedasticity
(ARCH) model, was introduced by (Bollerslev 1986) and has been used by many
researchers in modelling financial time series. A wide range of financial
data exhibit time-varying volatility clustering, the property that there are
periods of high and periods of low variance. In response to this, (Engle 1982) suggested the ARCH
model as an alternative to the usual time series processes.
3.3.1 ARCH Model
The autoregressive conditional heteroskedasticity (ARCH) model is the first model of
conditional heteroskedasticity. It forecasts the error variance
at time t on the basis of information known at time t − 1, expressing it as a moving
average of past squared error terms. The following conditional variance defines an ARCH
model of order q:
σ²_t = α_0 + Σ_{i=1}^{q} α_i ε²_{t−i}    (3.16)
with α_0 ≥ 0 and α_i ≥ 0, where the α_i must be estimated from the data.
The error term has the form
ε_t = σ_t z_t    (3.17)
where z_t is a sequence of independent and identically distributed random variables with
zero mean and unit variance.
3.3.2 GARCH model
The GARCH model is an extended version of the ARCH model. In the GARCH(p, q) model
the conditional variance is a linear function of its own lags and has the form
σ²_t = α_0 + α_1 ε²_{t−1} + ... + α_q ε²_{t−q} + β_1 σ²_{t−1} + ... + β_p σ²_{t−p}    (3.18)
     = α_0 + Σ_{i=1}^{q} α_i ε²_{t−i} + Σ_{i=1}^{p} β_i σ²_{t−i}    (3.19)
The rate of decay of the ARCH model is considered too fast compared
with typical financial series unless the value of q in (3.16) is large. The GARCH
model is therefore preferred, since it enables very complicated heteroscedasticity patterns to be
modelled with low orders of p and q. The most popular GARCH model in applications has been
the GARCH(1, 1) model, that is, p = q = 1 in (3.18).
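The recursion (3.18) can be simulated directly; a sketch of a GARCH(1, 1) with assumed example parameters α_0 = 0.1, α_1 = 0.1, β_1 = 0.8 (not the estimates obtained in this project):

```python
import numpy as np

def simulate_garch11(n, alpha0=0.1, alpha1=0.1, beta1=0.8, seed=0):
    """Simulate eps_t = sigma_t z_t with
    sigma2_t = alpha0 + alpha1 * eps_{t-1}^2 + beta1 * sigma2_{t-1}."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                        # iid N(0, 1), as in (3.17)
    eps = np.zeros(n)
    sigma2 = np.zeros(n)
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)   # unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * z[t]
    return eps, sigma2

eps, sigma2 = simulate_garch11(20000)
# the sample variance of eps should be near alpha0 / (1 - alpha1 - beta1) = 1.0
```

With α_1 + β_1 = 0.9 the simulated series exhibits the persistent volatility clustering described above: large |ε_t| values arrive in bursts rather than uniformly.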
Chapter 4
Analysis of Exchange Rate Data
We study the daily rates of MUR against the US dollar (MUR/USD), the Euro (MUR/EU)
and the British Pound (MUR/GBP). The data sets were obtained from the Bank of Mau-
ritius for the period of January 2003 to December 2011 for MUR/USD and MUR/EU and
for the period of January 2002 to December 2011 for MUR/GBP.
For our analysis we choose two periods. The first period runs from January 2003 to December 2008
for MUR/USD and MUR/EU and from January 2002 to December 2008 for MUR/GBP,
representing the data up to the financial crisis. For the second period we take the full
data set obtained.
The daily returns are calculated as the log differences of the levels. Let x_t be a given
exchange rate time series; then the exchange rate return series y_t is given by
y_t = ln( x_t / x_{t−1} )    (4.1)
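Equation (4.1) in code; a small sketch with hypothetical levels:

```python
import numpy as np

def log_returns(x):
    """Return series y_t = ln(x_t / x_{t-1}) from a series of levels x."""
    x = np.asarray(x, dtype=float)
    return np.log(x[1:] / x[:-1])

levels = np.array([28.0, 28.2, 28.1, 28.4])   # hypothetical MUR/USD levels
y = log_returns(levels)                       # three daily log returns
```

Note that a series of N levels yields N − 1 returns, and that the log return is equivalently the first difference of the log levels.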
4.1 Analysis of Period I
Figures 4.1, 4.2 and 4.3 show the daily exchange rate data and the corresponding returns
of the MUR against the USD, EU and GBP:
Figure 4.1: Daily MUR/USD data and its corresponding first differences of the logs
Figure 4.2: Daily MUR/EU data and its corresponding first differences of the logs
Figure 4.3: Daily MUR/GBP data and its corresponding first differences of the logs
In each of the three daily log-return series we observe large random fluctuations.
However, all three series appear to be stationary, meaning that the random
variation is constant over time. There is volatility clustering in each log-return series,
since we can see periods of high and of low variation.
          Mean      Median    Max     Min      S.D.   Skewness  Kurtosis
MUR/USD  6.46e-05   0.0000   0.0119  -0.0101  0.0018   1.0889   11.9486
MUR/EU   2.70e-04  1.32e-04  0.0311  -0.0279  0.0061   0.0698    5.1469
MUR/GBP  3.80e-05   0.0000   0.0462  -0.0416  0.0058  -0.1038    9.8907
Table 4.1: Summary statistics for the daily exchange rates: log first difference
The standard deviations indicate that the MUR/EU return data is more volatile than
the MUR/USD and MUR/GBP return series. The skewness coefficients for MUR/USD
and MUR/EU are positive, indicating that the right tail is longer than the left; for
MUR/GBP the coefficient is negative, so the bulk of the values lie to the right of the mean.
Kurtosis measures how peaked or flat the data is relative to a normal distribution. For all three
return series, the kurtosis is larger than that of the normal distribution (which equals 3),
indicating leptokurtosis. The leptokurtosis reflects the volatility clustering of the series:
the volatility changes at a relatively low rate, that is, large changes tend to be followed
by large changes and small changes by small changes.
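The moment-based skewness and (non-excess) kurtosis reported in Table 4.1 can be computed as follows; this is a minimal numpy sketch with function names of our own choosing:

```python
import numpy as np

def skewness(y):
    """Sample skewness: third central moment divided by the cube of the s.d."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5

def kurtosis(y):
    """Sample kurtosis (not excess): equals 3 for a normal distribution,
    so values above 3 indicate leptokurtosis."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return (d ** 4).mean() / (d ** 2).mean() ** 2
```

Applied to a perfectly symmetric sample the skewness is exactly zero; applied to the daily log returns these would reproduce the signs reported in Table 4.1.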
Figure 4.4: Histogram and Kernel Density estimate of daily log return of MUR/USD (N = 1500, bandwidth = 0.0001054)
Figure 4.5: Histogram and Kernel Density estimate of daily log return of MUR/GBP (N = 1750, bandwidth = 0.00095)
Figure 4.6: Histogram and Kernel Density estimate of daily log return of MUR/EU (N = 1500, bandwidth = 0.001075)
The plots indicate that the normality assumption is questionable for all three daily
log-return series.
Figure 4.7: Scatter plot of yt against yt−1 for the MUR/USD log return data.
Figure 4.8: Scatter plot of yt against yt−1 for the MUR/EU log return data.
Figure 4.9: Scatter plot of yt against yt−1 for the MUR/GBP log return data.
In the scatter plots shown above, the data at time t is plotted against the value at time
t − 1; this is one way of showing the degree of correlation in the data.
4.1.1 Analysis of Period II data
          Mean      Median    Max     Min      S.D.   Skewness  Kurtosis
MUR/USD  8.91e-06   0.0000   0.0122  -0.0101  0.0021   0.7574    7.871
MUR/EU   1.07e-04  1.04e-04  0.0311  -0.0279  0.0061   0.0843    5.044
MUR/GBP  2.02e-05   0.0000   0.0462  -0.0416  0.0060  -0.2152    8.379
Table 4.2: Summary statistics of log first difference daily exchange rate
We find that the data sets for January 2003 to December 2011 for MUR/USD and MUR/EU,
and for January 2002 to December 2011 for MUR/GBP, exhibit the same characteristics
as described above for the sample data sets. We observe that the minimum and maximum
values remain the same for the two periods, showing that in the sample data sets the three
series had already attained their extreme values.
Figure 4.10: Daily MUR/USD Jan 03 - Dec 11
Figure 4.11: Daily MUR/EU Jan 03 - Dec 11
Figure 4.12: Daily MUR/GBP Jan 02 - Dec 11
4.2 Forecasting Accuracy
For comparing the forecasts produced by different models, the root mean square error (RMSE),
the mean absolute error (MAE) and the mean absolute percentage error (MAPE) are calculated. Let

    e_t = Y_t − Ŷ_t

denote the forecast error, where Y_t is the observed value and Ŷ_t its forecast.
1. Root Mean Square Error

    RMSE = √( (1/n) Σ_{t=1}^{n} e_t² )
2. Mean Absolute Error

    MAE = (1/n) Σ_{t=1}^{n} |e_t|
3. Mean Absolute Percentage Error

    MAPE = (1/n) Σ_{t=1}^{n} |e_t / Y_t| × 100
The MSE, the most common measure of forecasting accuracy, indicates the degree of
spread; however, large errors are given additional weight. The RMSE is considered instead,
since the forecast error is then expressed in the same units as the actual and forecast
values themselves. It is found to be most informative for errors with a near-normal dis-
tribution. The mean absolute error is a very popular measure of forecast error since it
compares forecasts with their eventual outcomes. However, previous research has emphasized
that the MAPE is the most useful measure for comparing the accuracy of forecasts,
since it measures relative performance.
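The three criteria can be implemented directly; a sketch in Python (the study's own computations were done in MATLAB and R):

```python
import numpy as np

def forecast_errors(actual, forecast):
    """RMSE, MAE and MAPE for paired actual/forecast values."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    e = a - f                                     # e_t = Y_t - Yhat_t
    rmse = float(np.sqrt(np.mean(e ** 2)))
    mae = float(np.mean(np.abs(e)))
    mape = float(np.mean(np.abs(e / a)) * 100)    # assumes no zero actual values
    return rmse, mae, mape
```

The MAPE line shows why it measures relative performance: each error is scaled by the corresponding actual value before averaging.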
4.3 Fitting Neural Network Model to the Exchange Rates Data
The choice of an appropriate architecture is in itself a very difficult task, and in addition
there is a large number of parameters that must be estimated.
For our neural network model we used a feedforward network with a single
hidden layer. The number of neurons in the hidden layer was varied between 1 and 5.
The activation function in the hidden-layer neurons is the tan-sigmoid (tansig) and that in the
output layer is the linear function (purelin). To set the learning rate we ran the net-
work with a large number of different learning rates between 0.05 and 0.5 before settling
on 0.25, which gave us the best results. The momentum value was varied
between 0.1 and 0.9, and we found that 0.4 and 0.8 gave better results. However, we
use 0.8 during the experiments since it is common to choose a momentum value close to
1. The training algorithm traingdm was used since it provides faster convergence in our
feedforward network. The number of epochs used while training was between 30,000 and
50,000, depending on the performance graph, which shows the point where the network
was sufficiently trained.
Having estimated these parameters, we focus our case study on two issues:
firstly, the number of input variables and, secondly, the number of hidden neurons
in the hidden layer. All the computations regarding the neural network were carried
out in MATLAB.
The number of input variables is based on the number of lagged past observations. The
first input set consists of training the data using only the first previous value, Lag(1); that is, Y_t
is the target value and Y_{t-1} the input. When we have 2 input variables we use
Lag(1, 2), that is, Y_{t-1} and Y_{t-2} as inputs and Y_t as target. The experiment is carried
out using Lag(1) to Lag(1-12).
Using the daily MUR/USD, MUR/EU and MUR/GBP data, we divide each series into
a training set (in-sample) and a testing set (out-of-sample). The test set for all the data
consists of the values for the whole of 2008, which we shall try to forecast. For each
input-variable set we experiment with 1-5 hidden nodes only.
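The Lag(1) to Lag(1-12) input sets described above amount to building a lagged design matrix from the series; a minimal numpy sketch (the function name is our own, and the actual networks were trained in MATLAB):

```python
import numpy as np

def lagged_inputs(y, n_lags):
    """Inputs [Y_{t-1}, ..., Y_{t-n_lags}] with targets Y_t, i.e. Lag(1-n_lags)."""
    y = np.asarray(y, dtype=float)
    # Column k holds the series shifted back by k steps
    X = np.column_stack([y[n_lags - k : len(y) - k] for k in range(1, n_lags + 1)])
    targets = y[n_lags:]
    return X, targets

# Lag(1-2): each target is paired with its two previous values
X, t = lagged_inputs([0.0, 1.0, 2.0, 3.0, 4.0], 2)
```

In the example, the first row of X is (Y_1, Y_0) = (1, 0) with target Y_2 = 2, matching the Lag(1, 2) scheme described above.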
4.4 Fitting ARIMA Model to the Foreign Exchange Rates Data
ARIMA modelling consists of three stages: the identification stage, the estimation
and diagnostic-checking stage, and the forecasting stage.
During the identification stage, we convert our data into a time series and find the
ACF and PACF. Stationarity tests can be performed to determine whether differencing is
needed. Analysis of the ACF and PACF graphs usually suggests one or more tentative
models that can be fitted to our data.
In the estimation and diagnostic-checking stage, the diagnostic statistics help us judge
the adequacy of the models found in the identification stage, and the goodness-of-fit statistics
aid in comparing each model to the others. The model with the smallest Akaike Information
Criterion (AIC) and Bayesian Information Criterion (BIC) is retained.
After inspecting the ACF and PACF to identify the best model for ARIMA forecasting,
different values of p, d and q are fitted and compared; if any of these models proves to
be better, we fit that model for the forecasting stage. Moreover, the auto.arima
function in R can help us find a fitted model for the data set.
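The "retain the smallest AIC" rule can be illustrated for the pure AR case with a least-squares fit; this is a simplified sketch with helper names of our own, not the full Box-Jenkins procedure or R's auto.arima:

```python
import numpy as np

def fit_ar_aic(y, p):
    """Least-squares AR(p) fit with intercept; returns a Gaussian AIC
    (up to additive constants): n*ln(RSS/n) + 2*(p + 1)."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([y[p - k : len(y) - k] for k in range(1, p + 1)])
    X = np.column_stack([np.ones(len(X)), X])     # intercept column
    t = y[p:]
    coef, *_ = np.linalg.lstsq(X, t, rcond=None)
    rss = float(np.sum((t - X @ coef) ** 2))
    n = len(t)
    return n * np.log(rss / n) + 2 * (p + 1)

def select_ar_order(y, max_p=5):
    """Retain the order with the smallest AIC, as in the estimation stage."""
    return min(range(1, max_p + 1), key=lambda p: fit_ar_aic(y, p))

# Example: a simulated AR(1) series with coefficient 0.8
rng = np.random.default_rng(0)
y_sim = np.zeros(500)
for i in range(1, 500):
    y_sim[i] = 0.8 * y_sim[i - 1] + rng.standard_normal()
best_p = select_ar_order(y_sim, max_p=5)
```

In practice the candidate models should be fitted on a common sample so the AICs are comparable, and the BIC replaces the penalty 2(p + 1) with (p + 1) ln n.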
4.5 Fitting the GARCH model
The first step in fitting a GARCH model is to select the parameters of the specific model;
however, we first have to check for an ARCH effect in the exchange rate data. From the log-return
plots of MUR/USD, MUR/EU and MUR/GBP we find that there are heavy fluctuations
in the data along with the presence of volatility clustering, which hints that
the data may not be independently and identically distributed (iid). The ACF and PACF
of the log returns, the absolute log returns and the squared log returns are taken; if
they show significant autocorrelation then the data is not iid, and thus we can say that
an ARCH effect exists.
The ARMA-GARCH model is fitted to the data, and the orders of the model equations
are suggested by the ACF, PACF and EACF plots; however, the parameter values are
varied and different models are fitted to the in-sample data. Both the in-sample and out-
of-sample forecast performance are reported.
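The informal ARCH-effect check described above (significant autocorrelation in the squared returns even when the returns themselves show little) can be sketched as follows; the 1.96/√n band is the usual rough white-noise bound, and the function names are our own:

```python
import numpy as np

def acf(y, lag):
    """Sample autocorrelation of y at a positive lag."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return float(np.sum(d[lag:] * d[:-lag]) / np.sum(d ** 2))

def arch_effect_hint(returns, lags=(1, 2, 3)):
    """Lags at which the ACF of the squared returns leaves the rough
    95% white-noise band 1.96/sqrt(n): a hint of an ARCH effect."""
    r2 = np.asarray(returns, dtype=float) ** 2
    band = 1.96 / np.sqrt(len(r2))
    return [k for k in lags if abs(acf(r2, k)) > band]
```

A formal test (e.g. Engle's LM test) would replace this eyeball band, but the sketch captures the diagnostic described above: volatility clustering makes the squared returns strongly autocorrelated.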
4.6 Empirical findings
4.6.1 Forecast performance of Period I
Lags No of epochs No of neurons RMSE MAE MAPE
ARIMA(3,1,0) 0.0273 0.0172 0.0586
ARIMA(2,1,3) 0.0270 0.0172 0.0587
AR(1)-GARCH(1,1) 0.0271 0.0172 0.0587
1 50000 2 0.0384 0.0249 0.1000
1-2 50000 2 0.0321 0.0219 0.0744
1-3 50000 2 0.0308 0.0194 0.0658
1-4 50000 4 0.0303 0.0199 0.0673
1-5 50000 5 0.0322 0.0207 0.0706
1-6 50000 3 0.0327 0.0216 0.0735
1-7 50000 4 0.0322 0.0210 0.0712
1-8 50000 2 0.0338 0.0224 0.0760
1-9 50000 3 0.0310 0.0197 0.0672
1-10 50000 4 0.0306 0.0197 0.0670
1-11 50000 2 0.0310 0.0203 0.0689
1-12 50000 1 0.0354 0.0264 0.0894
Table 4.3: In-sample performance of MUR/USD for data Jan 2003-Dec 2008
Lags No of epochs No of neurons RMSE MAE MAPE
ARIMA(1,0,0) 0.1993 0.1553 0.4219
ARIMA(0,1,0) 0.1993 0.1552 0.4214
AR(1)-GARCH(1,2) 0.1990 0.1551 0.4213
1 40000 1 0.2026 0.1595 0.4336
1-2 50000 2 0.2007 0.1569 0.4260
1-3 50000 4 0.2008 0.1579 0.4282
1-4 50000 3 0.2007 0.1573 0.4273
1-5 50000 2 0.1996 0.1558 0.4230
1-6 50000 4 0.1995 0.1559 0.4233
1-7 50000 4 0.2010 0.1574 0.4269
1-8 40000 1 0.2029 0.1599 0.4343
1-9 50000 4 0.2020 0.1578 0.4278
1-10 50000 5 0.1996 0.1563 0.4235
1-11 50000 4 0.2006 0.1563 0.4320
1-12 50000 5 0.2001 0.1564 0.4243
Table 4.4: In-sample performance of MUR/EU for data Jan 2003-Dec 2008
Lags No of epochs No of neurons RMSE MAE MAPE
ARIMA(1,0,0) 0.2710 0.2032 0.3828
ARIMA(0,1,0) 0.2709 0.2030 0.3823
AR(1)-GARCH(1,1) 0.2700 0.2026 0.3943
1 40000 1 0.2776 0.2097 0.3960
1-2 50000 2 0.2733 0.2050 0.3868
1-3 40000 1 0.2732 0.2052 0.3872
1-4 50000 2 0.2739 0.2052 0.3872
1-5 40000 1 0.2773 0.2096 0.3957
1-6 50000 3 0.2750 0.2067 0.3888
1-7 50000 2 0.2716 0.2057 0.3879
1-8 50000 2 0.2710 0.2041 0.3848
1-9 50000 5 0.2752 0.2095 0.3945
1-10 50000 3 0.2736 0.2058 0.3865
1-11 50000 4 0.2725 0.2056 0.3865
1-12 50000 2 0.2718 0.2057 0.3874
Table 4.5: In-sample performance of MUR/GBP for data Jan 2002-Dec 2008
Lags No of epochs No of neurons RMSE MAE MAPE
ARIMA(3,1,0) 2.011 1.588 5.331
ARIMA(2,1,3) 2.331 1.615 5.250
AR(1)-GARCH(1,1) 2.161 1.562 5.131
1 50000 2 0.0979 0.0682 0.2338
1-2 50000 2 0.0704 0.0493 0.1687
1-3 30000 1 0.0754 0.0548 0.1871
1-4 50000 2 0.0760 0.0537 0.1835
1-5 30000 1 0.0746 0.0554 0.1896
1-6 30000 1 0.0864 0.0639 0.2188
1-7 30000 1 0.0843 0.0621 0.2121
1-8 50000 5 0.0786 0.0576 0.1970
1-9 30000 1 0.0791 0.0589 0.2014
1-10 50000 2 0.0815 0.0570 0.1942
1-11 50000 3 0.0864 0.0638 0.2178
1-12 50000 2 0.0784 0.0574 0.1964
Table 4.6: Out-of-sample performance of MUR/USD for data Jan 2003-Dec 2008
Lags No of epochs No of neurons RMSE MAE MAPE
ARIMA(1,0,0) 1.271 0.8503 1.955
ARIMA(0,1,0) 1.119 0.8037 1.868
AR(1)-GARCH(1,2) 1.265 0.8439 1.940
1 30000 1 0.3646 0.2642 0.6196
1-2 50000 2 0.3640 0.2654 0.6229
1-3 30000 1 0.3647 0.2643 0.6200
1-4 30000 1 0.3637 0.2636 0.6182
1-5 50000 2 0.3634 0.2632 0.6174
1-6 30000 1 0.3642 0.2642 0.6197
1-7 30000 1 0.3650 0.2649 0.6219
1-8 50000 3 0.3631 0.2646 0.6213
1-9 30000 1 0.3650 0.2653 0.6224
1-10 50000 2 0.3638 0.2658 0.6237
1-11 30000 1 0.3638 0.2657 0.6236
1-12 30000 1 0.3631 0.2655 0.6230
Table 4.7: Out-of-sample performance of MUR/EU for data Jan 2003-Dec 2008
Lags No of epochs No of neurons RMSE MAE MAPE
ARIMA(1,0,0) 4.429 3.755 7.244
ARIMA(0,1,0) 4.537 3.853 7.431
ARMA(1,1)-GARCH(1,1) 4.137 3.477 6.708
1 30000 1 0.4796 0.3431 0.6497
1-2 30000 1 0.4772 0.3410 0.6456
1-3 50000 2 0.4772 0.3405 0.6447
1-4 30000 1 0.4794 0.3402 0.6443
1-5 50000 2 0.4776 0.3438 0.6498
1-6 30000 1 0.4845 0.3433 0.6501
1-7 50000 2 0.4838 0.3477 0.6581
1-8 50000 2 0.4825 0.3450 0.6530
1-9 30000 1 0.4841 0.3431 0.6498
1-10 30000 1 0.4848 0.3435 0.6506
1-11 50000 5 0.4807 0.3430 0.6495
1-12 30000 1 0.4850 0.3439 0.6512
Table 4.8: Out-of-sample performance of MUR/GBP for data Jan 2002-Dec 2008
ANN ARIMA GARCH
Year/daily Actual values Forecasts Errors Forecasts Errors Forecast Errors
2008/1 57.1351 57.4654 -0.3303 57.2970 -0.1619 57.3049 -0.1698
2008/2 57.1221 57.2047 -0.0826 57.2961 -0.174 57.3041 -0.182
2008/3 57.1823 57.1134 0.0689 57.2953 -0.113 57.3032 -0.1209
2008/4 57.3500 57.1434 0.2066 57.2944 0.0556 57.3024 0.0476
2008/5 56.9235 57.253 -0.3295 57.2936 -0.3701 57.3106 -0.3871
2008/6 57.2858 56.8477 0.4381 57.2928 -0.007 57.3007 -0.0149
2008/7 57.2978 57.2938 0.004 57.2919 0.0059 57.2999 -0.0021
2008/8 57.2383 57.1821 0.0562 57.2911 -0.0528 57.2991 -0.0608
2008/9 57.3006 57.2182 0.0824 57.2903 0.0103 57.2982 0.0024
2008/10 57.4102 57.235 0.1752 57.2894 0.1208 57.2974 0.1128
Table 4.9: First 10 forecasts values of MUR/GBP for ANN, ARIMA and GARCH models
4.6.2 Results and Discussions for Period I
Tables 4.3 to 4.8 show the minimum error obtained when we vary the number of neurons
(between 1 and 5) in the hidden layer for each of Lag(1) to Lag(1-12). We observe from
our data that as the number of neurons in the hidden layer increases, the number of epochs
needed to train the network must also increase in order to reach the point where the forecast
value no longer changes. Increasing the number of epochs once the network has already
reached the global minimum is therefore useless and very time-consuming. We conclude
that the number of epochs used depends on the number of neurons in the hidden layer.
Moreover, we also notice that the performance of the network changes with the number of
neurons. We tend to achieve the best forecasts with 2 or 3 neurons in the hidden layer; as
the number of neurons increases beyond 5, the forecast values tend to move away from the
target values, since large increases in the RMSE, MAE and MAPE values are observed.
In the tables above the in-sample performance is presented first, followed by the out-of-sample
performance. Here the in-sample data set for MUR/USD and MUR/EU is taken from Jan-
uary 2003 to December 2007, and for MUR/GBP from January 2002 to December 2007.
The out-of-sample data set in all three cases covers January 2008 to December 2008.
As for the number of inputs presented to the network, we cannot find any trend in how
it affects performance. In our experiment, for both the in-sample and out-of-sample
forecasts, we observe that each data set has a different number of inputs at which the neural
network works best.
Considering the in-sample forecasts, we find that for MUR/USD the ARIMA(2, 1, 3) model
outperforms the ANN models; however, the AR(1)-GARCH(1, 1) model gives almost the
same forecast accuracy. For the MUR/EU and MUR/GBP data the GARCH models
perform much better than the other models. Thus we can conclude that for in-sample fore-
casts the GARCH model produces better fits than the ARIMA and ANN models.
For out-of-sample forecasts, the GARCH models used for both MUR/USD and MUR/GBP
give superior results to the ARIMA models when the RMSE, MAE and MAPE are
considered as evaluation criteria. However, when compared with the ANN model, their
forecast performance is found to be far behind.
Table 4.9 shows the first 10 out-of-sample forecasts of MUR/GBP for the ANN, ARIMA
and GARCH models. We notice that the three models perform very accurately for the
first 10 forecasts; since the MAPE values for the 2008 forecasts of the ARIMA and GARCH
models are much larger than the ANN value, we can say that as the ANN continues to
predict more values it outperforms the other two models.
Figures 4.13, 4.14 and 4.15 show the in-sample and out-of-sample forecasts
produced by the ANN models:
Figure 4.13: In-sample and out-of-sample forecast for MUR/USD
Figure 4.14: In-sample and out-of-sample forecast for MUR/EU
Figure 4.15: In-sample and out-of-sample forecast for MUR/GBP
4.6.3 Forecast performance of Period II
Lags No of Epochs No of Neurons RMSE MAE MAPE
ARIMA(5,1,0) 0.0489 0.0313 0.1025
ARIMA(2,1,3) 0.0489 0.0314 0.1026
AR(1)-GARCH(1,1) 0.0499 0.0314 0.1031
1 50000 1 0.0671 0.0454 0.1505
1-2 50000 2 0.0547 0.0368 0.1214
1-3 50000 2 0.0551 0.0364 0.1200
1-4 50000 2 0.0537 0.0363 0.1198
1-5 50000 3 0.0544 0.0374 0.1236
1-6 40000 1 0.0562 0.0396 0.1312
1-7 50000 2 0.0536 0.0363 0.1200
1-8 50000 2 0.0534 0.0359 0.1186
1-9 40000 1 0.0561 0.0392 0.1299
1-10 50000 2 0.0544 0.0362 0.1193
1-11 50000 3 0.0581 0.0389 0.1284
1-12 50000 5 0.0536 0.0357 0.1177
Table 4.10: In-sample performance MUR/USD data
Lags No of epochs No of neurons RMSE MAE MAPE
ARIMA(1,0,0) 0.2445 0.1806 0.4571
ARIMA(0,1,0) 0.2445 0.1805 0.4568
AR(1)-GARCH(1,1) 0.2245 0.1806 0.4571
1 50000 2 0.2442 0.1817 0.4587
1-2 50000 3 0.2445 0.1818 0.4591
1-3 50000 2 0.2453 0.1833 0.4629
1-4 50000 3 0.2461 0.1841 0.4646
1-5 50000 2 0.2436 0.1807 0.4560
1-6 50000 2 0.2448 0.1815 0.4581
1-7 50000 5 0.2437 0.1813 0.4575
1-8 50000 4 0.2431 0.1805 0.4553
1-9 50000 3 0.2445 0.1820 0.4589
1-10 50000 2 0.2439 0.1811 0.4565
1-11 50000 5 0.2442 0.1809 0.4560
1-12 50000 2 0.2452 0.1826 0.4611
Table 4.11: In-sample performance MUR/EU data
Lags No of epochs No of neurons RMSE MAE MAPE
ARIMA(1,0,0) 0.3181 0.2318 0.4443
ARIMA(0,1,0) 0.3181 0.2318 0.4441
AR(1)-GARCH(1,1) 0.2704 0.2026 0.3942
1 50000 2 0.3120 0.2270 0.4383
1-2 50000 2 0.3118 0.2273 0.4358
1-3 30000 1 0.3098 0.2262 0.4263
1-4 50000 3 0.3113 0.2284 0.4405
1-5 50000 4 0.3088 0.2260 0.4363
1-6 50000 2 0.3115 0.2287 0.4410
1-7 50000 4 0.3110 0.2275 0.4391
1-8 50000 2 0.3092 0.2251 0.4392
1-9 50000 2 0.3143 0.2316 0.4477
1-10 50000 3 0.3113 0.2278 0.4396
1-11 50000 4 0.3105 0.2278 0.4398
1-12 50000 2 0.3098 0.2271 0.4381
Table 4.12: In-sample performance MUR/GBP data
Lags No of Epochs No of Neurons RMSE MAE MAPE
ARIMA(5,1,0) 1.234 1.220 4.082
ARIMA(2,1,3) 1.233 1.220 4.080
AR(1)-GARCH(1,2) 1.233 1.221 4.082
1 30000 1 0.0501 0.0356 0.1189
1-2 30000 2 0.0539 0.0422 0.1441
1-3 30000 1 0.0502 0.0404 0.1349
1-4 30000 1 0.0494 0.0394 0.1317
1-5 50000 2 0.0495 0.0386 0.1291
1-6 30000 1 0.0501 0.0387 0.1295
1-7 50000 2 0.0512 0.0410 0.1371
1-8 30000 1 0.0505 0.0396 0.1324
1-9 50000 2 0.0512 0.0406 0.1357
1-10 30000 1 0.0472 0.0368 0.1229
1-11 30000 1 0.0518 0.0405 0.1352
1-12 30000 1 0.0222 0.0166 0.0553
Table 4.13: Out-of-sample performance MUR/USD data
Lags No of Epochs No of Neurons RMSE MAE MAPE
ARIMA(1,0,0) 1.772 1.735 4.408
ARIMA(0,1,0) 1.780 1.742 4.426
AR(1)-GARCH(1,1) 1.776 1.739 4.419
1 30000 1 0.1427 0.1099 0.2783
1-2 30000 2 0.1424 0.1103 0.2791
1-3 50000 4 0.1381 0.1047 0.2645
1-4 30000 1 0.1424 0.1102 0.2790
1-5 30000 1 0.1423 0.1107 0.2802
1-6 30000 1 0.1437 0.1122 0.2841
1-7 50000 2 0.1433 0.1120 0.2835
1-8 30000 1 0.1452 0.1147 0.2904
1-9 50000 2 0.1590 0.1237 0.3133
1-10 50000 2 0.1470 0.1105 0.2798
1-11 30000 1 0.1410 0.1107 0.2803
1-11 50000 2 0.1394 0.1170 0.2961
1-12 30000 1 0.1389 0.1079 0.2730
Table 4.14: Out-of-sample performance of MUR/EU data
Lags No of Epochs No of Neurons RMSE MAE MAPE
ARIMA(1,0,0) 1.993 1.981 4.249
ARIMA(0,1,0) 1.934 1.921 4.119
AR(1)-GARCH(1,2) 1.996 1.985 4.257
1 30000 1 0.2396 0.1807 0.3694
1-2 30000 1 0.2440 0.1831 0.3745
1-3 50000 3 0.1985 0.1624 0.3332
1-4 30000 1 0.2048 0.1648 0.3381
1-5 50000 5 0.1990 0.1463 0.3003
1-6 30000 1 0.1998 0.1470 0.3007
1-7 50000 3 0.1994 0.1506 0.3089
1-8 50000 3 0.2025 0.1593 0.3270
1-9 50000 4 0.1955 0.1535 0.3149
1-10 30000 1 0.2141 0.1689 0.3466
1-11 50000 2 0.2142 0.1711 0.3511
1-12 50000 2 0.2029 0.1506 0.3087
Table 4.15: Out-of-sample performance of MUR/GBP data
4.6.4 Results and Discussions for Period II
In this experiment the in-sample set for MUR/USD and MUR/EU consists of the values from
January 2003 to November 2011, whereas the in-sample set for MUR/GBP consists of the
values from January 2002 to November 2011. The out-of-sample set for each of the three series
consists of the values for December 2011.
Considering the in-sample forecasts of the MUR/USD data, we find that the ARIMA model
outperforms all the other models. For the MUR/EU data the ANN model using
Lag(1-8) produces superior results, since its RMSE, MAE and MAPE values are small
compared to the random walk, ARIMA and GARCH models. For the MUR/GBP forecasts the
GARCH model produces better results.
For the out-of-sample forecasts we find that the ANN models outperform the ARIMA,
GARCH and random walk models for all three series.
                        Period I                  Period II
                   ANN     ARIMA   GARCH     ANN     ARIMA   GARCH
In-sample   RMSE  0.0308  0.0273  0.0271    0.0536  0.0489  0.0499
            MAE   0.0194  0.0172  0.0172    0.0357  0.0313  0.0314
            MAPE  0.0658  0.0586  0.0587    0.1177  0.1025  0.1031
Out-of-     RMSE  0.0704  2.011   2.161     0.0222  1.233   1.233
sample      MAE   0.0493  1.588   1.562     0.0166  1.220   1.221
            MAPE  0.1687  5.331   5.131     0.0553  4.080   4.082
Table 4.16: In-sample and Out-of-sample forecasts of MUR/USD
                        Period I                  Period II
                   ANN     ARIMA   GARCH     ANN     ARIMA   GARCH
In-sample   RMSE  0.1996  0.1993  0.1990    0.2436  0.2445  0.2245
            MAE   0.1563  0.1553  0.1551    0.1807  0.1806  0.1806
            MAPE  0.4235  0.4219  0.4213    0.4560  0.4571  0.4571
Out-of-     RMSE  0.3634  1.271   1.265     0.1389  1.772   1.776
sample      MAE   0.2632  0.8503  0.8439    0.1079  1.735   1.739
            MAPE  0.6174  1.955   1.940     0.2730  4.408   4.419
Table 4.17: In-sample and Out-of-sample forecasts of MUR/EU
                        Period I                  Period II
                   ANN     ARIMA   GARCH     ANN     ARIMA   GARCH
In-sample   RMSE  0.2710  0.2710  0.2700    0.3098  0.3181  0.2704
            MAE   0.2041  0.2032  0.2026    0.2262  0.2318  0.2026
            MAPE  0.3848  0.3828  0.3843    0.4263  0.4443  0.3942
Out-of-     RMSE  0.4794  4.429   4.137     0.1990  1.993   1.996
sample      MAE   0.3402  3.755   3.477     0.1463  1.981   1.985
            MAPE  0.6443  7.244   6.708     0.3003  4.249   4.257
Table 4.18: In-sample and Out-of-sample forecasts of MUR/GBP
Chapter 5
Conclusion
In this project, linear and nonlinear models were used to forecast the exchange rates of the
Mauritius Rupee against three foreign currencies: the US Dollar, the Euro and the British
Pound. The forecasting accuracy of the models was assessed over two periods of data, the
first from January 2002 to December 2008 and the second from January 2002 to December
2011. The reason for this choice is that we wanted to test the ability of our forecasting
procedures to provide accurate out-of-sample forecasts during the financial crisis.
The artificial neural network (ANN), a nonlinear model, was used as an alternative
to the linear ARIMA processes and the nonlinear GARCH models. The empirical
results show that the ANN models have superior out-of-sample forecasting performance
for both periods of data when compared to the other models. For the in-sample forecasts we
observed that the ARIMA and ARMA-GARCH models provided a better goodness-of-fit than
the ANN models.
One of the reasons behind this is that there are still no well-defined guidelines for building an
ANN model to solve a specific problem. Thus, to obtain the best possible ANN forecasting
model, rigorous experiments were carried out to determine the different parameters of
the model. Considering the out-of-sample performance of the ANN models,
we can say that they proved successful when fitted to the foreign exchange rate data,
provided extreme care is taken in designing the network. Thus we can conclude that the
ANN model can be used as a complementary tool to different time-series models, and the
forecast accuracies can be further improved.
Bibliography
Bollerslev, T. (1986), 'Generalized autoregressive conditional heteroscedasticity', Journal of
Econometrics 31, 307–327.
Bollerslev, T. & Ghysels, E. (1996), 'Periodic autoregressive conditional heteroscedasticity',
Journal of Business & Economic Statistics 14, 139–151.
Cao, L. & Tay, F. (2001), 'Financial forecasting using support vector machines', Neural
Computing & Applications 10, 184–192.
Engle, R. F. (1982), 'Autoregressive conditional heteroscedasticity with estimates of the
variance of United Kingdom inflation', Econometrica 50, 987–1007.
Gencay, R. (1999), 'Linear, non-linear and essential foreign exchange rate prediction with
simple technical trading rules', Journal of International Economics 47, 91–107.
Haoffi, Z., X. G., Y. F. & Han, Y. (2007), 'A neural network model based on the multistage
optimization approach for short-term food price forecasting in China', Expert Systems with
Applications 33, 347–356.
Meade, N. (2002), ‘A comparison of the accuracy of short term foreign exchange forecasting
methods’, International Journal of Forecasting 18, 67–83.
Panda, C. & Narasimhan, V. (2007), 'Forecasting exchange rate better with artificial neural
network', Journal of Policy Modeling 29, 227–236.
R. K. Bissoondeeal, J. M. Binner, M. B. A. G. & Mootanah, V. P. (2008), ‘Forecasting exchange
rates with linear and nonlinear models’, Global Business and Economics Review 10, 414–
429.
Racine, J. (2001), 'On the nonlinear predictability of stock returns using financial and eco-
nomic variables', Journal of Business & Economic Statistics 19, 380–382.
Hill, T., O'Connor, M. & Remus, W. (1996), 'Neural network models for time series forecasts',
Management Science 42, 1082–1092.
Tang, Z. & Fishwick, P. A. (1993), 'Backpropagation neural nets as models for time series
forecasting', ORSA Journal on Computing 5, 374–385.
Foster, W. R., Collopy, F. & Ungar, L. H. (1992), 'Neural network forecasting of short, noisy
time series', Computers and Chemical Engineering 16, 293–297.
Wang, J. H. & Leu, J. Y. (1996), 'Stock market trend prediction using ARIMA-based neural
networks', Proc. of IEEE Int. Conf. on Neural Networks 4, 2160–2165.
Yaser, S. & Atiya, A. (1996), 'Introduction to financial forecasting', Applied Intelligence 6, 205–213.