NOVEL HYBRID ELECTRIC LOAD FORECASTING
MODEL USING ARIMA MODEL AND DISCRETE
WAVELET TRANSFORM
THESIS
Submitted
in fulfilment of the requirements of the degree of
DOCTOR OF PHILOSOPHY
By
Harveen Kaur
1410931001
Supervised by
Dr. Sachin Ahuja
Professor & Director | Research
Department of Computer Science and Engineering
CHITKARA UNIVERSITY
CHANDIGARH-PATIALA NATIONAL HIGHWAY
RAJPURA (PATIALA) PUNJAB-140401 (INDIA)
February 2021
DECLARATION BY STUDENT
I hereby certify that the work which is being presented in this thesis entitled
“Novel hybrid electric load forecasting model using ARIMA model and
Discrete Wavelet Transform” is for fulfilment of the requirement for the
award of Degree of Doctor of Philosophy submitted in the Department of
Computer Science and Engineering, Chitkara University, Punjab.
The work has not formed the basis for the award of any other degree or
diploma, in this or any other Institution or University. In keeping with the
ethical practice in reporting scientific information, due acknowledgements have
been made wherever the findings of others have been cited.
Harveen Kaur
CERTIFICATE BY SUPERVISOR
This is to certify that the thesis entitled “Novel hybrid electric load
forecasting model using ARIMA model and Discrete Wavelet Transform”
submitted by Harveen Kaur to the Chitkara University, Punjab in fulfilment
for the award of the degree of Doctor of Philosophy is a bona fide record of
research work carried out by her under my supervision. The contents of this
thesis, in full or in parts, have not been submitted to any other Institution or
University for the award of any degree or diploma.
Dr. Sachin Ahuja
Professor & Director | Research
Chitkara University, Punjab
ACKNOWLEDGEMENT
I wish to thank all of those who made contributions in a variety of forms over the prolonged
period of researching and writing this thesis.
It gives me immense pleasure to express my deepest gratitude to my research supervisor,
Dr. Sachin Ahuja, Professor and Director (Research), Chitkara University, Punjab, India,
for his warm encouragement and thoughtful guidance. I am thankful to my supervisor for all
his contributions of time and ideas, which made my Ph.D. experience productive and
stimulating.
I am highly indebted to Mr. Arun Kumar Gupta, E-in-Chief, Planning, PSPCL, Patiala, and
his associated staff for arranging and providing the real energy consumption data of
Punjab State and for offering valuable ideas during my research work.
My special thanks to the internal and external examiners who provided valuable fresh insight
into the work and helped to make it more conclusive research.
My deepest gratitude goes to my father, Mr. Surinder Pal Singh, Executive Engineer, Office
of Chief Electrical Inspector, Punjab Govt., who has constantly supported me in making my
research work more effective, and to my mother, Mrs. Palvinder Kaur, for her unflagging
love and for being a constant source of inspiration and encouragement throughout my
personal journey that is part of this long research process. The encouragement and
constant support of my sister, Ashmeet Kaur, in reaching this goal has been the greatest
gift.
Above all, I am grateful to God, Almighty who sustains this beautiful world and without
whose grace nobody can ever succeed.
HARVEEN KAUR
ABSTRACT
Energy underpins the socio-economic and political world in which we live, and electricity is
the most important of its various forms. A gap always exists between the supply of and the
demand for electric energy. To meet the ever-increasing demand for electricity, an accurate
prediction model is needed. In the present work, an electricity consumption forecasting model
is designed for the State of Punjab, India, in which the dynamic relationship among the time
series entities is explored. Time series data comprise samples containing both linear and
nonlinear components, so linear or nonlinear models can be applied depending on the type of
data. In this work, the linear modelling technique used is the Auto-Regressive Integrated
Moving Average (ARIMA) model, whereas the nonlinear techniques used are an optimization
algorithm, the Cuckoo Search Algorithm (CSA), and an Artificial Neural Network (ANN).
To achieve optimum accuracy, instead of using these techniques individually, a hybrid model
has been developed and applied to the time series data of electricity consumption in Punjab.
Initially, the data is decomposed into two levels using the Discrete Wavelet Transform (DWT).
DWT decomposes the Punjab State Power Corporation Limited (PSPCL) data into two parts based
on electricity consumption, which helps to determine the highest and the lowest electricity
consumption. ARIMA is then applied to each decomposed component individually to obtain a
time series. Next, the Inverse Discrete Wavelet Transform (IDWT) is applied to combine the
data, which is further optimized using the nature-inspired CSA. The ANN is trained by passing
the optimized data to its input layer, which supports the prediction of future electricity
consumption. With ARIMA alone, the accuracy of the forecasting model is 83.53%; with ARIMA
and DWT, there is an increase of 10.23% in accuracy; and with ARIMA in the hybrid model of
DWT, CSA and ANN, there is an increase of 15.33% in accuracy. The proposed hybrid strategy,
i.e., ARIMA, DWT, CSA and ANN, therefore performs better than both existing forecasting
models, i.e., the ARIMA model and the ARIMA with DWT model.
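The decompose, model and recombine steps described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the MATLAB implementation used in this work: it uses a single-level one-dimensional Haar transform, a least-squares ARIMA(1,1,0) fit as a stand-in for full ARIMA order selection, and hypothetical load values. The names haar_dwt, haar_idwt and arima_110_one_step are introduced here for illustration only, and the CSA and ANN stages are omitted.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(series):
    """Single-level Haar DWT of an even-length series."""
    if len(series) % 2:
        raise ValueError("series length must be even")
    pairs = list(zip(series[0::2], series[1::2]))
    approx = [(a + b) / SQRT2 for a, b in pairs]  # low-pass: scaled pair sums
    detail = [(a - b) / SQRT2 for a, b in pairs]  # high-pass: scaled pair differences
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild and interleave the original samples."""
    series = []
    for s, d in zip(approx, detail):
        series.append((s + d) / SQRT2)
        series.append((s - d) / SQRT2)
    return series

def arima_110_one_step(band):
    """In-sample one-step predictions from an ARIMA(1,1,0) model.

    Assumption: the AR coefficient is estimated by least squares on the
    first differences; a full ARIMA routine would also select the orders.
    """
    diffs = [b - a for a, b in zip(band, band[1:])]
    num = sum(x * y for x, y in zip(diffs, diffs[1:]))
    den = sum(x * x for x in diffs[:-1])
    phi = num / den if den > 0 else 0.0
    preds = list(band[:2])  # the first two points have no prediction
    for t in range(2, len(band)):
        preds.append(band[t - 1] + phi * diffs[t - 2])
    return preds

# Hypothetical monthly load values, for illustration only.
load = [410.0, 395.0, 430.0, 520.0, 610.0, 640.0,
        600.0, 505.0, 450.0, 420.0, 400.0, 415.0]

approx, detail = haar_dwt(load)
reconstructed = haar_idwt(approx, detail)
fitted = haar_idwt(arima_110_one_step(approx), arima_110_one_step(detail))
```

Here reconstructed matches load up to rounding error. This perfect-reconstruction property is what allows the per-band predictions to be recombined by the IDWT into a series on the original scale.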
SUMMARY
This research comprises six chapters, which together explain the research work and its
implementation.
Chapter 1: The introduction chapter presents the basic terminology of the research topic,
highlighting the concepts and methods relevant to the research work on a novel hybrid
electric load forecasting model using the ARIMA model and Discrete Wavelet Transform.
Chapter 2: The literature review chapter provides the knowledge base on the topic, surveys
the work of other authors in the same area and helps to identify gaps in the research.
Chapter 3: The methodology chapter presents the proposed hybrid forecasting model and
describes how the proposed methodology is designed and developed.
Chapter 4: The proposed work chapter comprises the algorithms used to develop the
proposed hybrid forecasting model. The specific objectives are as follows:
1. To model the time series data and forecast future values using ARIMA.
2. To determine how the discrete wavelet transform resolves the difficulties in ARIMA
modeling.
3. To develop a hybrid model of ARIMA and wavelet transform.
4. To analyze the performance and accuracy of the proposed algorithm.
Chapter 5: This chapter discusses the results and the simulation of the research work,
carried out in MATLAB, to forecast electricity consumption using different techniques.
Chapter 6: This chapter presents the conclusion and future scope. The research focuses on
the design and development of a novel intelligent technique that can be used to study the
behaviour of electricity consumption on the basis of time series data.
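The fourth objective concerns performance and accuracy, and the results chapter reports MAPE and accuracy (%) for each model. As a hedged illustration of how such figures are commonly computed, the Python sketch below evaluates the mean absolute percentage error and, as an assumption that may differ from the definition used in this thesis, takes accuracy as 100 minus MAPE. The function names and the sample values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

def accuracy_percent(actual, predicted):
    """Assumed convention: accuracy (%) = 100 - MAPE."""
    return 100.0 - mape(actual, predicted)

# Hypothetical actual and predicted consumption values, for illustration only.
actual = [100.0, 200.0]
predicted = [90.0, 220.0]
error = mape(actual, predicted)
accuracy = accuracy_percent(actual, predicted)
```

Each prediction in this example is off by 10% of the actual value, so the MAPE is 10% and the assumed accuracy is 90%.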
TABLE OF CONTENTS
DECLARATION BY STUDENT
CERTIFICATE BY SUPERVISOR
ACKNOWLEDGEMENT
ABSTRACT
SUMMARY
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS
Chapter 1: INTRODUCTION
1.1 BACKGROUND OF THE RESEARCH
1.2 TIME SERIES DATA
1.2.1 Linear vs. Non-linear
1.2.2 Periodic vs. Non-periodic
1.2.3 Gaussian and Non-Gaussian
1.2.4 Low Volatile and High volatile
1.2.5 Advantages of Electricity prediction
1.3 TIME SERIES DATA PREDICTION MODELS
1.3.1 Auto-Regressive Integrated Moving Average (ARIMA) Model
1.3.2 Cuckoo Search Algorithm
1.3.3 Cuckoos behavior for egg-laying
1.3.4 Artificial Neural Networks (ANNs)
1.4 DECOMPOSITION BASED PREDICTION MODELS
1.4.1 Decomposition based on Moving average (MA)
1.4.2 Discrete Wavelet Transform (DWT)
1.4.3 Trend-ARIMA model
1.4.4 Wavelet-ARIMA model
1.5 CLASSIFICATION TECHNIQUES
1.5.1 Supervised classification approach
1.5.2 Unsupervised learning
1.5.3 Semi-supervised classification approach
1.6 PROBLEM STATEMENT
1.7 RESEARCH OBJECTIVES
1.8 RESEARCH GAPS
Chapter 2: PRESENT STATE OF ART
2.1 HISTORY OF ELECTRICITY SUPPLY
2.2 PUNJAB STATE POWER CORPORATION LIMITED (PSPCL)
2.3 OVERVIEW ON LOAD FORECASTING
2.3.1 Type of Load Forecasting Technique
2.4 CLASSIFICATION OF LOAD FORECASTING
2.4.1 Trending methods
2.5 LOAD FORECASTING METHODS
2.5.1 Literature review of studies using ARIMA or its Hybrid models for forecasting
2.5.2 ANN and ARIMA
2.5.3 Wavelet Decomposition based ARIMA and ANN
Chapter 3: METHODOLOGY
3.1 METHODOLOGY OF THE PROPOSED HYBRID FORECASTING MODEL
3.2 PROGRAMMING LANGUAGE
Chapter 4: PROPOSED WORK
4.1 ALGORITHMS USED TO DEVELOP THE PROPOSED HYBRID FORECASTING MODEL
4.1.1 ARIMA model
4.1.2 DISCRETE WAVELET TRANSFORM (DWT)
4.1.3 CUCKOO SEARCH OPTIMIZATION ALGORITHM
4.1.4 ARTIFICIAL NEURAL NETWORK
4.2 TO MODEL THE TIME SERIES DATA AND FORECAST FUTURE VALUES USING ARIMA
4.3 TO DETERMINE HOW DISCRETE WAVELET TRANSFORM RESOLVES THE DIFFICULTIES IN ARIMA MODELING
4.3.1 Difficulties that were resolved by combining ARIMA model with DWT
4.3.2 Use of DWT in proposed hybrid model
4.3.3 Use of ARIMA model in proposed work
4.3.4 DWT and ARIMA Model
4.3.5 HAAR Wavelet
4.3.6 The requirement of DWT in time series forecasting
4.4 TO DEVELOP A HYBRID MODEL OF ARIMA AND WAVELET TRANSFORM
4.5 TO ANALYZE THE PERFORMANCE AND ACCURACY OF THE PROPOSED ALGORITHM
Chapter 5: RESULTS AND ANALYSIS
5.1 DATASET USED
5.2 RESULTS AND DISCUSSION
5.2.1 Prediction using ARIMA Model
5.2.2 Prediction using ARIMA with DWT
5.2.3 Prediction using the Proposed Hybrid Model
5.2.4 Computed parameters
Chapter 6: CONCLUSION AND FUTURE SCOPE
6.1 CONCLUSION
6.2 FUTURE SCOPE
RESEARCH PUBLICATIONS
REFERENCES
LIST OF FIGURES
Figure 1:1 Child's Height vs. Age (Linear Time Series Data) [5]
Figure 1:2 Sea Temperature vs. year (non-linear)
Figure 1:3 An example of Periodic TSD [7]
Figure 1:4 Stock Market Prediction as non-periodic Time Series Data
Figure 1:5 Gaussian distribution example
Figure 1:6 Non-Gaussian Distribution example
Figure 1:7 Low Volatile Time Series Data
Figure 1:8 Highly volatile Time Series Data
Figure 1:9 Cuckoo bird
Figure 1:10 Representation of a nest solution in the Cuckoo search algorithm
Figure 1:11 Flow chart of Cuckoo search optimization algorithm [21]
Figure 1:12 Model of ANN [25]
Figure 1:13 Layered architecture of Feedforward network [26]
Figure 1:14 Feedback network [26]
Figure 1:15 Time Series decomposition using MA Filter [29]
Figure 1:16 Wavelet decomposition [33]
Figure 1:17 Supervised learning [36]
Figure 1:18 Supervised learning process [37]
Figure 1:19 Unsupervised learning [38]
Figure 2:1 Spatial electric load forecasting methods
Figure 3.3:1 Proposed Work
Figure 4.4:1 Horizontal Wavelet transforms [93]
Figure 4.4:2 Vertical Wavelet transforms for horizontal wavelets [93]
Figure 4.4:3 1, 2 and 3-level Discrete Wavelet Decompositions [93]
Figure 4.4:4 (a) Stationary and (b) Non-stationary series
Figure 4.4:5 DWT and ARIMA Model
Figure 4.4:6 Discrete Wavelet Transform Output
Figure 5:1 Dataset consumed electricity from January 2013 to December 2017
Figure 5:2 Data panel
Figure 5:3 Upload Data
Figure 5:4 Convert to stationary
Figure 5:5 Generated hypothesis
Figure 5:6 Original consumed and predicted electricity using ARIMA
Figure 5:7 Next Predicted electricity using ARIMA
Figure 5:8 Data panel of DWT - ARIMA
Figure 5:9 Data uploading again
Figure 5:10 Decomposition of data using Haar wavelet
Figure 5:11 Decomposed (LL) data converted into stationary
Figure 5:12 Generated hypothesis of LL decomposed datasets
Figure 5:13 Original consumed and predicted electricity using LL ARIMA
Figure 5:14 Decomposed (LH) data converted into stationary
Figure 5:15 Generated hypothesis of LH decomposed datasets
Figure 5:16 Original consumed and predicted electricity using LH ARIMA
Figure 5:17 Decomposed (HL) data converted into stationary
Figure 5:18 Generated hypothesis of HL decomposed datasets
Figure 5:19 Original consumed and predicted electricity using HL ARIMA
Figure 5:20 Decomposed (HH) data converted into stationary
Figure 5:21 Generated hypothesis of HH decomposed datasets
Figure 5:22 Original consumed and predicted electricity using HH ARIMA
Figure 5:23 Inverse-DWT
Figure 5:24 Data Panel for Daubechies Wavelet
Figure 5:25 Segmentation done using Daubechies Wavelet Transform
Figure 5:26 Differencing Applied on non-stationary data using Daubechies for LL
Figure 5:27 Differencing Applied on non-stationary data using Daubechies for LH
Figure 5:28 Differencing Applied on non-stationary data using Daubechies for HL
Figure 5:29 Differencing Applied on non-stationary data using Daubechies for HH
Figure 5:30 LL-Autocorrelation plot for Daubechies
Figure 5:31 LH-Autocorrelation plot for Daubechies
Figure 5:32 HL-Autocorrelation plot for Daubechies
Figure 5:33 HH-Autocorrelation plot for Daubechies
Figure 5:34 ARIMA applied on LL segment of Daubechies for segment-based predictions
Figure 5:35 ARIMA applied on LH segment of Daubechies for segment-based predictions
Figure 5:36 ARIMA applied on HL segment of Daubechies for segment-based predictions
Figure 5:37 ARIMA applied on HH segment of Daubechies for segment-based predictions
Figure 5:38 Segmentation of Time Series Data using HAAR decomposition fed to cuckoo search and further to NN
Figure 5:39 Training Structure of ANN
Figure 5:40 Performance
Figure 5:41 Training State
Figure 5:42 Regression
Figure 5:43 Differencing Applied on non-stationary LL data segment of Cuckoo-NN
Figure 5:44 Differencing Applied on non-stationary LH data segment of Cuckoo-NN
Figure 5:45 Differencing Applied on non-stationary HL data segment of Cuckoo-NN
Figure 5:46 Differencing Applied on non-stationary HH data segment of Cuckoo-NN
Figure 5:47 ARIMA applied on Cuckoo-NN optimized LL segment for segment-based predictions
Figure 5:48 ARIMA applied on Cuckoo-NN optimized LH segment for segment-based predictions
Figure 5:49 ARIMA applied on Cuckoo-NN optimized HL segment for segment-based predictions
Figure 5:50 ARIMA applied on Cuckoo-NN optimized HH segment for segment-based predictions
Figure 5:51 Electricity consumption Actual and Predicted using ARIMA of the Year 2013
Figure 5:52 Electricity consumption original and Predicted using ARIMA of the Year 2014
Figure 5:53 Electricity consumption actual and predicted using ARIMA for Year 2015
Figure 5:54 Electricity consumption Actual and Predicted using ARIMA for year 2016
Figure 5:55 Electricity consumption original and Predicted using ARIMA of the Year 2017
Figure 5:56 Electricity consumption original and Predicted using ARIMA with DWT of the Year 2013
Figure 5:57 Electricity consumption Actual and Predicted using ARIMA with DWT for the Year 2014
Figure 5:58 Electricity consumption Actual and Predicted using ARIMA with DWT for the Year 2015
Figure 5:59 Electricity consumption Actual and Predicted using ARIMA with DWT for the Year 2016
Figure 5:60 Electricity consumption original and Predicted using ARIMA with DWT for the Year 2017
Figure 5:61 Actual and Predicted values of electricity consumption using Proposed Hybrid Model for the Year 2013
Figure 5:62 Actual and Predicted values of electricity consumption using Proposed Hybrid Model for the Year 2014
Figure 5:63 Electricity consumption Actual and Predicted using Proposed Hybrid Model for the Year 2015
Figure 5:64 Electricity consumption Actual and Predicted using Proposed Hybrid Model for the Year 2016
Figure 5:65 Electricity consumption original and Predicted using Proposed Hybrid Model of the Year 2017
Figure 5:66 Overall comparison of electricity consumption prediction of ARIMA + DWT, Hybrid proposed model with the original dataset
Figure 5:67 Computed MAP for ARIMA, ARIMA with DWT and Proposed Hybrid Model
Figure 5:68 Computed MAPE for ARIMA, ARIMA with DWT and Proposed Hybrid Model
Figure 5:69 Computed Accuracy (%) for ARIMA, ARIMA with DWT, ARIMA with HAAR and Proposed Hybrid Model
LIST OF TABLES
Table 5.1 Dataset Used
Table 5.2 Original and predicted electricity consumption using ARIMA
Table 5.3 Actual and predicted electricity consumption using ARIMA
Table 5.4 Original and predicted electricity consumption using ARIMA of the year 2015
Table 5.5 Original and predicted electricity consumption using ARIMA for year 2016
Table 5.6 Actual and predicted electricity consumption using ARIMA of the year 2017
Table 5.7 Original and predicted electricity consumption using ARIMA with DWT for year 2013
Table 5.8 Actual and predicted electricity consumption using ARIMA and DWT for year 2014
Table 5.9 Actual and predicted electricity consumption using ARIMA with DWT of the year 2015
Table 5.10 Actual and predicted electricity consumption using ARIMA with DWT for year 2016
Table 5.11 Actual and Predicted electricity consumption using ARIMA with DWT for the year 2017
Table 5.12 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2013
Table 5.13 Original and predicted electricity consumption using Proposed Hybrid Model for the year 2014
Table 5.14 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2015
Table 5.15 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2016
Table 5.16 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2017
Table 5.17 Computed MAP for ARIMA, ARIMA with DWT and Proposed Hybrid Model
Table 5.18 Computed MAPE for ARIMA, ARIMA with DWT and Proposed Hybrid Model
Table 5.19 Computed Accuracy for Different Combinations
xvii
ABBREVIATIONS
GUI - Graphical User Interface
RAM - Random Access Memory
MATLAB - Matrix Laboratory
ARIMA - Auto Regressive Integrated Moving Average
DWT - Discrete Wavelet Transformation
CSA - Cuckoo Search Algorithm
ANN - Artificial Neural Network
CHAPTER 1: INTRODUCTION
Many organizations and research institutions use forecasting in their daily
routine. Prediction means estimating the future behavior of a variable based on its
past behavior. Because the demand for electricity keeps increasing, forecasting the
demand and supply of electricity has special significance. Forecasts can be made
based on the relationship between electricity consumption and its identified
variables, and various methods have been utilized for this purpose. In the year 2000,
the World Energy Outlook (WEO) predicted that electricity-based projects would
consume half of the total energy requirements in the world and that electricity
demand would grow at a rate of 5.4% per year from 1997 to 2020, which is faster than
the 4.9% growth rate assumed for the GDP of India. Generally, a highly positive
correlation is observed between a country's GDP growth and its consumption of
electricity. In India, this correlation has declined, as shown by the elasticity of
electricity consumption relative to GDP, which has fallen over the years. This is a
good indication that the efficiency of electricity consumption is increasing. During
the first five-year plan period, electricity consumption grew by 3.14% for every 1%
of GDP growth, while in the eighth plan period it grew by only 0.97% [1].
1.1 BACKGROUND OF THE RESEARCH
Energy is a foremost part of the socio-economic and political world in which we
live. In planning the future, we must pay full attention to the energy factor, because
the energy system affects the economic system very closely, as stated by Hutber in
1984. Also, countries with similar economic characteristics show similar energy
scenarios, which indicates that energy plays a crucial role in the economic
development of developing countries, as argued by Ebohon in 1996 [2].
The Indian Government has been making plans every year for its development. In
all earlier plans, the authorities aimed to minimize the gap between electricity
production and consumption. To facilitate the creation and use of each plan, the
competent authorities must provide an energy consumption estimate for the upcoming
years, but none has mentioned any methods for estimation or forecasting purposes.
It turns out that there is always a massive gap between their predictions and actual
consumption. The Time Series Data approach is utilized to forecast an output that can
be either volatile or non-volatile [3]. Examples of volatile and non-volatile
information are wind speed and environmental temperature, respectively. A detailed
description of Time Series Data, along with its different types, is given in the
following section.
1.2 TIME SERIES DATA
Time Series Data is a series of numerical values that change over time. It can be
represented as a table in which each entry gives the Time Series Data value at a
particular time instant. Most of the time, it is plotted on rectangular axes, in
which the X-axis represents time and the Y-axis represents the Time Series Data
values. Some examples of time series data are listed below:
1. Internet traffic, measured as the number of bytes transmitted per second.
2. The gold price as a function of the day, as it is different every day.
3. Commodity cost as a function of the day.
4. Child growth (height versus age).
5. Number of babies born every second.
6. Trees planted per day.
Besides the examples mentioned above, electricity consumption, phone calls,
temperature, and stock market data are also examples of time series data. None of
these time series have the same origin. The Time Series Data also varies as
conditions change [4]. Some distinctive taxonomies of Time Series Data, based upon
the nature of the values, are explained below along with an example of each.
1.2.1 Linear vs. Non-linear
According to the feature values of the tested data, Time Series Data can be
classified as linear or non-linear.
i. Linear
If the value of the variable changes linearly, i.e. the Y-parameter changes at a
constant rate with respect to the X-parameter, it is known as linear Time Series
Data. An example of linear Time Series Data is shown in Figure 1.1.
Figure 1:1 Child's Height vs. Age (Linear Time Series Data) [5]
Figure 1.1 shows the height of a child plotted against age (1 to 10 years). The
estimated and the observed height values are displayed by the dotted and the dark
black line, respectively.
ii. Nonlinear
Not all Time Series are linear. In fact, over a short duration,
many of the available Time Series can be roughly approximated linearly with
significant errors. An example of non-linear Time Series Data is shown in Figure 1.2.
Figure 1:2 Sea Temperature vs. year (non-linear)
Figure source: https://www.epa.gov/climate-indicators/climate-change-indicators-sea-surface-temperature
Figure 1.2 shows an example of non-linear Time Series Data: sea temperature
plotted against the year. The dotted lines represent the average value, whereas the
predicted values are shown by the solid red line [6].
1.2.2 Periodic vs. Non-periodic
Because the values of time series data change continuously as well as at irregular
intervals of time, time series data can be categorized as periodic or
non-periodic.
i. Periodic
An example of periodic Time Series Data is shown in Figure 1.3. The graph plots
sunspot numbers, which recur at regular intervals of time, and shows the seasonal
dependencies in the Time Series Data. Other examples of periodic data are road
traffic at a specific time of day and climate changes that follow the seasons [7].
Figure 1:3 An example of Periodic TSD [7]
ii. Non-periodic
Data that does not repeat at regular intervals of time is known as non-periodic
Time Series Data. The signals plotted in Figures 1.1 and 1.2 are examples of
non-periodic Time Series Data. Financial data, such as stock market data, also comes
under non-periodic Time Series Data [8].
Figure 1:4 Stock Market Prediction as non-periodic Time Series Data
Figure source: https://www.datacamp.com/community/tutorials/lstm-python-stock-market
Figure 1.4 shows how the stock market price changes with the date from the year
1970 to 2017.
1.2.3 Gaussian and Non-Gaussian
Time series data whose values follow the standard (Gaussian) distribution curve are
known as Gaussian Time Series Data; otherwise they are called Non-Gaussian
Time Series Data [9]. Examples of Gaussian and Non-Gaussian distributions are
shown in Figure 1.5 and Figure 1.6, respectively.
Figure 1:5 Gaussian distribution example
Figure 1:6 Non- Gaussian Distribution example
Figure source: https://www.quora.com/What-is-an-example-of-a-dataset-with-a-non-Gaussian-distribtion
1.2.4 Low Volatile and High volatile
If the values change gradually over time, the series is known as “Low
Volatile” Time Series Data; otherwise, it is known as “High Volatile” Time Series
Data. Examples of low and high volatility are shown in Figure 1.7 and Figure 1.8,
respectively.
Figure 1:7 Low Volatile Time Series Data
Figure source: https://www.equitymaster.com/indian-share-markets/11/10/2017/Sensex-Finishes-Marginally-Higher-SBI-Surges-62
Figure 1.7 shows how the crude oil price varies over the months of the years 2016
and 2017.
Figure 1:8 Highly volatile Time Series Data
Figure source: https://economictimes.indiatimes.com/wealth/invest/7-stocks-with-high-1-year-upside-potential-valued-on-basis-of-peg-ratio/articleshow/71551414.cms
Research on the prediction of Time Series Data has become essential in today's era
because reliable forecasts are needed in an enormous range of applications. A few of
them are discussed below:
i. The forecasting mechanism helps internet service providers manage the
available bandwidth and provide more effective solutions to users.
ii. In the agriculture sector, predicting variations in climate is helpful; for
example, a rainfall forecast helps farmers check whether the weather is
suitable for farming.
iii. Based on upcoming market trends, marketers can more easily decide the right
time and amount to invest for a specific purpose.
iv. By predicting natural disasters such as tsunamis, earthquakes, cyclones, and
floods, early preparation can be done to limit the destruction [10].
There can be various prediction classes based on usage in different areas, as
listed below:
1. 1-step ahead prediction: In this type of prediction, only the next single value
is predicted from the presently available data, and the procedure moves on to the
next value after completing one cycle, until the prediction horizon is covered. A
typical example is a weekly prediction horizon. Taking the Time Series Data values
from the 1st day to the 14th day, we can predict the electricity consumed on the
15th day. Then, using the 2nd day to the 15th day, we can predict the load
consumption of the 16th day. Continuing this one-step process over the 7-day
forecasting horizon, the electricity consumption of the 7 days from day 15 to
day 21 can be predicted.
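The sliding-window idea behind 1-step ahead prediction can be sketched in a few lines of Python. This is only an illustration: the function names, the hypothetical `daily_load` values and the mean-of-window predictor are assumptions standing in for whatever fitted model is actually used.

```python
from statistics import mean

def naive_mean_predictor(window):
    """Stand-in model (assumption): predict the next value as the window mean."""
    return mean(window)

def rolling_one_step(series, window_len, horizon, predict=naive_mean_predictor):
    """Predict `horizon` values one at a time, sliding a window of actual data."""
    forecasts = []
    for step in range(horizon):
        # Window covers days 1-14 on the first step, days 2-15 on the second, ...
        window = series[step:step + window_len]
        forecasts.append(predict(window))
    return forecasts

# Hypothetical loads for days 1..21 (the load on day k is simply k).
daily_load = [float(day) for day in range(1, 22)]
weekly_forecast = rolling_one_step(daily_load, window_len=14, horizon=7)
print(weekly_forecast)
```

Each forecast uses only observed values, so an error made for one day does not propagate into the forecast for the next day.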
2. N-step ahead forecast: In this type of prediction, N>1, which means that multiple
values are predicted from the currently available data, unlike the 1-step ahead
prediction discussed above. It is also termed the “multi-step ahead” forecast
method, it can be repeated after the completion of a prediction horizon, and it
can be further categorized into:
(a) Direct forecast: Take N=2 and a prediction horizon of 5. Using the Time Series
Data values of the 1st day to the 5th day, the values for the 6th and 7th day are
predicted directly. Similarly, using the Time Series Data values of the 3rd day to
the 7th day, the values for the 8th and 9th day are predicted. This is also known as
“direct 2-step ahead” forecasting.
(b) Iterative forecasting: Again take N=2 and a prediction horizon of 5. Using the
Time Series Data values of day 1 to day 5, the value for day 6 is obtained first.
After that, the Time Series Data values from the 2nd day to the 5th day, together
with the value predicted for day 6, are used to predict the value of day 7. In
direct forecasting, by contrast, the predicted value of day 6 is not required to get
the value of day 7. This process continues throughout the given prediction horizon
until the forecasted value for the 10th day is generated. It is also called
iterative 2-step ahead forecasting [11].
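The difference between the direct and the iterative strategy can be illustrated with a minimal Python sketch. The one-parameter least-squares model, the function names and the `history` values are assumptions chosen for brevity; they stand in for a fitted ARIMA or ANN model.

```python
def fit_slope(xs, ys):
    """Least-squares slope through the origin, so that ys is roughly slope * xs."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def iterative_two_step(series):
    """Fit one 1-step model and apply it twice, feeding back the first forecast."""
    phi = fit_slope(series[:-1], series[1:])       # y[t+1] ~ phi * y[t]
    next_1 = phi * series[-1]
    next_2 = phi * next_1                           # uses the predicted value
    return [next_1, next_2]

def direct_two_step(series):
    """Fit a separate model per lead time; no forecast is fed back."""
    phi_1 = fit_slope(series[:-1], series[1:])     # 1-step model
    phi_2 = fit_slope(series[:-2], series[2:])     # 2-step model: y[t+2] ~ phi_2 * y[t]
    return [phi_1 * series[-1], phi_2 * series[-1]]

# Hypothetical 5-day window growing by 10% per day.
history = [100.0, 110.0, 121.0, 133.1, 146.41]
print(iterative_two_step(history))
print(direct_two_step(history))
```

On this noise-free example both strategies agree; on real data they generally differ, because the iterative forecast for day 7 inherits any error made for day 6.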
Components of Time Series
Time series analysis provides a set of tools for better understanding a dataset. A
time series can be decomposed into the following four parts.
Level: The baseline value of the series if it were a straight line.
Trend: The optional increasing or decreasing behavior of the series over time.
Seasonality: The optional repeating pattern of behavior over time.
Noise: The optional variability in the observations that cannot be explained by
the model.
Among these components, every time series necessarily has a level, while the other
three, trend, seasonality, and noise, are optional [12].
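As a small illustration of how the four components combine, the sketch below builds a synthetic series. It assumes an additive combination (level + trend + seasonality + noise), a sinusoidal seasonal pattern and Gaussian noise; all parameter names and values are hypothetical.

```python
import math
import random

def build_series(n, level=50.0, slope=0.5, period=7, amplitude=5.0,
                 noise_sd=1.0, seed=0):
    """Compose an additive series: level + trend + seasonality + noise."""
    rng = random.Random(seed)
    series = []
    for t in range(n):
        trend = slope * t                                    # steady increase
        seasonality = amplitude * math.sin(2 * math.pi * t / period)
        noise = rng.gauss(0.0, noise_sd)                     # unexplained variation
        series.append(level + trend + seasonality + noise)
    return series

# Switching the optional components off leaves only the level.
flat = build_series(5, slope=0.0, amplitude=0.0, noise_sd=0.0)
print(flat)
```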
1.2.5 Advantages of Electricity prediction
The present research work focuses on predicting the expected future consumption of
electricity. The research has various benefits; some significant advantages are
given below:
i. Power utility companies gain a better understanding of the future load demand
or electricity consumption, which enables them to plan better for the
future.
ii. The risk for energy companies is reduced. Knowing the likely long-term
load helps the businesses plan and make economically feasible
decisions about meeting future demand as well as transmission in the region.
iii. It helps determine what resources are required, such as the fuels needed
to run the generating plants and other resources needed to ensure
reliable and economical power generation and distribution to customers. This
is critical for short-, medium-, and long-term planning.
iv. Load forecasting also helps in planning the scale, location, and design of
future generating projects. Once utilities identify places or regions with
high or growing demand, they are more likely to generate electricity close to
the load. This minimizes the transmission and distribution infrastructure
and the resulting losses.
v. It helps in decision making and in planning power system maintenance. By
knowing the demand, the company will know when to carry out maintenance so
that it has the least impact on customers. For example, it may choose to do
maintenance in residential areas during the day, when most people are at work
and the demand is low.
1.3 TIME SERIES DATA PREDICTION MODELS
Prediction models are required to perform prediction or forecasting for any
given Time Series Data. Among the various Time Series Data prediction models,
the ones used in the present work are ARIMA and ANN.
1.3.1 Auto-Regressive Integrated Moving Average (ARIMA) Model
Time series forecasting is a scientific tool used to solve prediction
problems. The ARIMA model is easy and flexible to implement, as it needs only past
observations of the required variable. It is a linear modeling scheme and a
combination of three components, AR, I, and MA, which are briefly
described below:
Auto Regression (AR): a model that uses the dependent relationship between an
observation and a number of lagged observations.
Integrated (I): the differencing of raw observations, for example
subtracting the previous observation from the current one, to make the time
series stationary.
Moving Average (MA): a model that uses the dependency between an observation and
the residual errors of a moving average model applied to lagged
observations.
The ARIMA model was first introduced by Box and Jenkins in 1976. Using MATLAB
R2012a, the model proved useful for the preparation of data and the computation of
the autocorrelation function (ACF) and partial autocorrelation function (PACF). The
three main components of the ARIMA model described above correspond to the
parameters of the model. The classical notation is ARIMA (p, d, q), in which the
parameters take integer values, as explained below:
p: the number of lagged observations included in the prediction model.
d: the number of times the raw observations are differenced, also known as the
“degree of differencing”.
q: the size of the moving average window, also known as the “order of the moving
average”.
The ARIMA model is an extended version of the ARMA model (which is used only for
stationary Time Series Data); a non-stationary series is made stationary by
applying finite differencing to the data points. The mathematical formulation of
this model using lag polynomials is given below:
$$\varphi(L)(1-L)^{d}\, y_t = \theta(L)\,\varepsilon_t \tag{1.1}$$
or,
$$\left(1-\sum_{i=1}^{p}\varphi_i L^{i}\right)(1-L)^{d}\, y_t = \left(1+\sum_{j=1}^{q}\theta_j L^{j}\right)\varepsilon_t \tag{1.2}$$
in which the integers $p, d, q \ge 0$ are described below:
The parameter d controls the level of differencing; usually $d = 1$ is sufficient.
If $d = 0$, the model reduces to ARMA (p, q). A model of the form ARIMA (p,0,0)
is nothing but the AR(p) model, and ARIMA (0,0,q) is the MA(q) model.
ARIMA (0,1,0) means $y_t = y_{t-1} + \varepsilon_t$; this model is known as the
Random Walk model, which is useful in the case of non-stationary data [13].
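The role of the differencing parameter d and the random-walk special case can be illustrated with a short Python sketch. The function names and the `load` values are hypothetical, and the sketch covers only the differencing step and the ARIMA(0,1,0) point forecast, not the fitting of the AR or MA coefficients.

```python
def difference(series, d=1):
    """Apply the operator (1 - L)^d, i.e. difference the series d times."""
    for _ in range(d):
        series = [curr - prev for prev, curr in zip(series, series[1:])]
    return series

def random_walk_forecast(series, steps):
    """ARIMA(0,1,0) point forecast: every future value equals the last observation."""
    return [series[-1]] * steps

load = [10.0, 12.0, 15.0, 19.0, 24.0]    # hypothetical values with a growing trend
first_diff = difference(load, d=1)        # increments between consecutive values
second_diff = difference(load, d=2)       # increments of the increments
print(first_diff, second_diff, random_walk_forecast(load, 3))
```

In this example the second difference is constant, which is the kind of behavior differencing is meant to expose before an ARMA model is fitted.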
1.3.2 Cuckoo Search Algorithm
Cuckoo Search (CS) is a nature-inspired optimization algorithm, used to find either
the minimum or the maximum value of a problem through an appropriately chosen
function known as the objective function. In this research, CS is used to minimize
the irregularities in the data obtained after DWT. In other words, CS is used to
further refine the data so that the accuracy of the forecasting method for
electricity use can be improved. The algorithm is based on the breeding strategy
of cuckoo birds [14].
Cuckoo search was initially designed by Yang and Deb in 2009. The concepts of CS
are based on the bird named the cuckoo (figure 1.9). Cuckoos are fascinating birds
because of their beautiful sound as well as their aggressive reproduction scheme,
in which adult cuckoos lay their eggs in the nests of other species, the host
birds.
Figure 1:9 Cuckoo bird
Image source: https://www.slideshare.net/AnujaJoshi6/cuckoo-o-ptimization-ppt
Cuckoos exist mainly in two forms, i.e. adult cuckoos and eggs.
After the fertilization of eggs, a group is formed. Environmental characteristics,
together with the migration of groups or families of cuckoos, help them to converge
on the best environment for reproduction as well as for survival.
CS is based on three essential idealized rules:
1. Each cuckoo lays only one egg at a time.
2. The nests that contain better-quality eggs are carried over to produce the next
generation.
3. The number of host nests is fixed, and an egg laid by a cuckoo is discovered by
the host bird with probability Pa ∈ [0, 1].
Based upon these three rules, the host bird either throws the discovered egg out of
the nest or abandons the nest and builds a new one [15].
1.3.3 Cuckoos behavior for egg-laying
Each cuckoo is initialized to lay its eggs randomly in other host birds' nests
within the given Egg Laying Radius (ELR). After all the eggs have been laid in the
host birds' nests, the eggs that look less like the host bird's own eggs are
detected by the host birds and thrown out of the nest. As a result, about 10% of
the eggs are killed after each egg-laying cycle. These eggs are gone, and
there is no way they will grow [16]. When the remaining eggs hatch, the cuckoo
chicks eat the major portion of the food the host bird brings to the nest. After a
while the host bird's own chicks starve to death, leaving only the cuckoo chicks in
the nest.
Figure 1:10 Representation of a nest solution in the Cuckoo search algorithm
Image source: https://www.scielo.org.za/scielo.php?script=sci_arttext&pid=S2224-78902013000300017
Cuckoo Search has attracted a lot of attention due to its simplicity and its ability
to solve optimization problems in numerous applications around the world.
Cuckoo Search is mainly inspired by the breeding behavior of cuckoos, i.e. laying
eggs whose colors and patterns mimic those in other birds' nests. Every egg in a
nest represents a solution, and a cuckoo egg represents a new solution. The main
emphasis of the algorithm is on replacing the worse solutions in the nests with new
and better solutions (i.e. cuckoos) [17].
Based on the above three rules of cuckoo search, the host bird may, with some
probability, either throw the eggs out of the nest or simply abandon the nest and
build a completely new one. An important aspect of the CS algorithm is the use of
Levy flights to create new solutions:
$$x_{t+1} = x_t + s\,E_t \tag{1.3}$$
Here $E_t$ is drawn either from the standard normal distribution, with zero mean
and unit standard deviation, for random walks, or from the Levy distribution for
Levy flights.
Figure 1:11 Flow chart of Cuckoo search optimization algorithm [21]
[Flow chart steps: generate N host nests and assign a position to each nest;
evaluate the fitness function for each nest; carry out a Levy flight to get new
nest positions and evaluate their fitness; compare the fitness of the new and old
nests; replace a fraction of the worst nests with new random nests; compare the
newly searched nest with the worst discovered nest and save the best nest; if the
maximum iteration is reached, take the best nest as the outcome, otherwise repeat.]
The random steps can also be associated with the similarity between the eggs of
the cuckoo and the eggs of the host. The step size ‘s’ then determines the distance
a random walker can cover for a given number of iterations [18]. The CS algorithm
consists of two components, i.e. local and global search. The former is designed to
improve the best solution through a guided random walk, while the latter is
intended to maintain population diversity through Levy flights. The switching
probability ‘Pa’ regulates the balance between the two stochastic search parts.
Pseudocode of Cuckoo Search Algorithm:
Begin
    Initialize N nests as the population.
    Evaluate the nests.
    While (the termination criterion is not achieved)
        Randomly generate a new solution i from among the best nests
        Select a nest j from the population
        If the quality of i is superior to that of j, replace j with solution i
        Replace abandoned nests with randomly generated nests
        Evaluate the nests.
The CS algorithm begins by initializing a random population of randomly generated
solutions. Each solution or nest is evaluated with the fitness function to find the
best solution or nest so far [19].
For the iterative process, the termination criteria consist primarily of a
predefined number of generations, the maximum time allowed, or stagnation (no
further improvement) at the end of the process. In every iteration, new solutions
are generated by random walks or Levy flights around the best solution. If these
solutions are superior to solutions chosen at random from the population, they
replace them in the population. The worse solutions or nests are dumped and replaced
with randomly generated solutions in proportion to the given probability ‘Pa’. To
find the best solution, the entire population is re-evaluated, and this procedure
continues until the termination criteria are met. CS is applied in the designed
electricity consumption prediction model for Punjab because of its high convergence
speed in reaching the optimal solution [20].
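The procedure described above can be sketched as a compact Python routine. This is a minimal illustration, not the implementation used in this work: the function names, the default parameter values, the use of Mantegna's algorithm to draw Levy steps, the scaling of each step by the distance to the best nest, and the sphere test function are all assumptions.

```python
import math
import random

def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step length with Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0) or 1e-12          # guard against division by zero
    return u / abs(v) ** (1 / beta)

def cuckoo_search(objective, dim, bounds, n_nests=15, pa=0.25, step=0.01,
                  max_iter=200, seed=0):
    """Minimise `objective`; returns (best_nest, best_fitness, initial_best_fitness)."""
    rng = random.Random(seed)
    low, high = bounds

    def random_nest():
        return [rng.uniform(low, high) for _ in range(dim)]

    nests = [random_nest() for _ in range(n_nests)]
    fitness = [objective(nest) for nest in nests]
    best_fit, best = min(zip(fitness, nests))
    best = best[:]
    initial_best = best_fit

    for _ in range(max_iter):
        for i in range(n_nests):
            # Levy flight from nest i, scaled by its distance to the best nest.
            candidate = [min(max(x + step * levy_step(rng) * (x - b), low), high)
                         for x, b in zip(nests[i], best)]
            cand_fit = objective(candidate)
            j = rng.randrange(n_nests)         # random nest to compare against
            if cand_fit < fitness[j]:
                nests[j], fitness[j] = candidate, cand_fit
        # Abandon the worst fraction pa of the nests and build new random ones.
        ranked = sorted(range(n_nests), key=lambda k: fitness[k], reverse=True)
        for k in ranked[:int(pa * n_nests)]:
            nests[k] = random_nest()
            fitness[k] = objective(nests[k])
        # Remember the best nest seen so far.
        current_fit, current = min(zip(fitness, nests))
        if current_fit < best_fit:
            best_fit, best = current_fit, current[:]
    return best, best_fit, initial_best

def sphere(v):
    """Simple test objective (assumption): sum of squares, minimum at the origin."""
    return sum(x * x for x in v)

best, best_fit, initial_best = cuckoo_search(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, best_fit)
```

Because the best nest is stored separately from the population, the returned fitness can never be worse than the best fitness of the initial random nests.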
1.3.4 Artificial Neural Networks (ANNs)
Artificial neural networks (ANNs) are computationally flexible and among the most
widely applied predictors; they are applicable to various time series forecasting
problems with improved performance. ANNs are used for forecasting in various
application areas related to engineering, social sciences, foreign exchange, stock
problems, economics, and so on. Some characteristics of ANNs that make them
attractive and valuable in forecasting tasks are as follows:
(a) The model is not based on a traditional method.
(b) The network generalizes easily, meaning that an ANN can accurately
infer the unseen part of a population even if noisy information is present in the
data.
(c) The forecasting model can approximate any continuous function to a suitable
accuracy, and ANNs are non-linear, unlike the ARIMA model [21].
1.3.4.1 Time Series Data modeling using ANN
The ANN model is highly influenced by the features of the data. The ANN most
widely used for modeling and forecasting is the single hidden layer feed-forward
network. The model is usually characterized by a network of three layers that are
linked via acyclic connections [22]. The mathematical relation between the inputs,
i.e. $y_{t-1}, \ldots, y_{t-p}$, and the output ($y_t$) of this network is
given as:
$$y_t = w_0 + \sum_{j=1}^{q} w_j\, g\!\left(w_{0,j} + \sum_{i=1}^{p} w_{i,j}\, y_{t-i}\right) + e_t \tag{1.7}$$
Here, $w_{i,j}$ (i = 0, 1, ..., p; j = 1, 2, ..., q) and $w_j$ (j = 0, 1, ..., q)
are the parameters of the model, also known as connection weights, and $e_t$ is the
error term. The terms p and q denote the number of input and hidden nodes,
respectively. The activation function $g$ can take different forms, and the form
used is determined by the position of the neuron within the network. Input-layer
neurons usually have no activation function, as their job is only to pass the
information on to the hidden layer. The most suitable activation function in the
output layer is the linear function, since a non-linear activation induces
distortion in the forecast data [23]. The activation functions used in the hidden
layer are the logistic and hyperbolic tangent functions, which serve as the
transfer function as shown in the equations below:
$$g(x) = \frac{1}{1 + e^{-x}} \tag{1.8}$$
$$g(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{1.9}$$
So the ANN performs a non-linear functional mapping from the previous data to
the future value (output):
$$y_t = f(y_{t-1}, \ldots, y_{t-p}, \mathbf{w}) + e_t \tag{1.10}$$
where $\mathbf{w}$ is the vector of all connection weights and $f$ is the function
determined by the network structure.
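The mapping from lagged values to a forecast in a single hidden layer feed-forward network can be sketched as follows. The function and argument names are hypothetical, the weights in the usage lines are arbitrary illustrative numbers rather than trained values, and a linear output node is assumed, as recommended in the text.

```python
import math

def logistic(x):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(lagged, hidden_weights, hidden_biases, output_weights, output_bias,
            activation=logistic):
    """One-step forecast of a single-hidden-layer feed-forward network.

    lagged          : the p most recent values [y(t-1), ..., y(t-p)]
    hidden_weights  : q rows, each holding p input-to-hidden weights
    hidden_biases   : q hidden-node biases
    output_weights  : q hidden-to-output weights
    output_bias     : bias of the linear output node
    """
    hidden = [activation(bias + sum(w * y for w, y in zip(row, lagged)))
              for row, bias in zip(hidden_weights, hidden_biases)]
    # Linear output node: weighted sum of the hidden activations plus a bias.
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))

# p = 2 lagged inputs, q = 2 hidden nodes, all hidden weights set to zero.
zero_net = forward([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [1.0, 1.0], 0.5)
print(zero_net)
```

Passing `activation=math.tanh` switches the hidden layer to the hyperbolic tangent transfer function without changing anything else.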
An ANN is built from hundreds of units, called artificial neurons or processing
elements, which are connected by weights; together they form the neural structure
and are arranged in layers [24]. The mathematical model of an ANN is shown below in
figure 1.12.
Figure 1:12 Model of ANN [25]
The processing unit processes the input data according to a weighted function. It
forms a mathematical equation that relates the input data to the output data. As
shown in figure 1.13, each input signal is multiplied by its weight value, the
products are added together, and the sum gives the output for that specific neuron.
The sigmoid function is the most used activation function, and it is applied to the
weighted sum of the input neurons [25]. The obtained result is passed to the
transfer function, and the layer structure of the ANN is shown in figure 1.13.
Figure 1:13 layered architecture of Feedforward network [26]
i. Input layer: The input data, such as the optimized Time Series Data, is passed
to this layer. The nodes of this layer do not modify the signal; they only
forward the data to the next (hidden) layer.
ii. Hidden layer: The hidden layer contains multiple neurons. The nodes of this
layer modify the signal, so they are called active nodes. Every active node
takes an active part in the modeling.
iii. Output layer: The modified signals arrive at this layer, which represents
the achieved output.
The way neurons are linked with each other has an impact on the operation of the
ANN. Whether real or artificial, neurons receive either excitatory or inhibitory
inputs. Excitatory inputs contribute to the addition operation, and inhibitory
inputs to the subtraction operation [26]. In a feedback mechanism, there is a
connection path from the output layer going back to the input layer; the
Feedback architecture is shown in figure 1.14.
Figure 1:14 Feedback network [26]
1.3.4.2 Estimation of the model parameter
The parameters of the prediction model cannot practically be computed and optimized
by hand, because an ANN involves nonlinear equations. Therefore, parameter
estimation is done through training algorithms, which take a sequence of training
data fed as input to the ANN. The ANN is trained on the data until the generated
error is minimum, after which it can predict what the output will be in the future.
The error is minimized in terms of the Mean Squared Error (MSE). The weights and
bias values continue to change during the training period until the error on the
Time Series Data values stops improving.
1.3.4.3 Model validation and prediction
ANN training is performed in an iterative manner, which means that as the number
of iterations increases, the MSE value is reduced. Once the validation error has
been calculated and validation of the model is completed, the model
is used for Time Series Data prediction [27].
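The iterative reduction of the MSE during training can be illustrated with a deliberately small Python sketch. It trains a single linear neuron by plain gradient descent, which is an assumption made for brevity (a real ANN has a hidden layer and is typically trained by backpropagation); the learning rate, epoch count and data values are hypothetical.

```python
def mse(weight, bias, inputs, targets):
    """Mean squared error of the neuron's predictions."""
    return sum((weight * x + bias - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)

def train(inputs, targets, learning_rate=0.05, epochs=200):
    """Gradient descent on the MSE of one linear neuron: y ~ weight * x + bias."""
    weight, bias = 0.0, 0.0
    history = [mse(weight, bias, inputs, targets)]
    n = len(inputs)
    for _ in range(epochs):
        errors = [weight * x + bias - y for x, y in zip(inputs, targets)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, inputs))
        grad_b = 2.0 / n * sum(errors)
        weight -= learning_rate * grad_w     # move against the gradient
        bias -= learning_rate * grad_b
        history.append(mse(weight, bias, inputs, targets))
    return weight, bias, history

# Hypothetical scaled training pairs: previous value -> next value.
inputs = [0.1, 0.2, 0.3, 0.4, 0.5]
targets = [0.2, 0.3, 0.4, 0.5, 0.6]
weight, bias, history = train(inputs, targets)
print(history[0], history[-1])
```

The recorded `history` shows the MSE falling as the number of iterations grows, which is the behavior described above.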
1.4 DECOMPOSITION BASED PREDICTION MODELS
In Time Series Data prediction, pre-processing techniques help improve the
efficiency of forecasting. Several kinds of pre-processing methods are available;
among them, the most suitable technique, decomposition, has been applied here. The
term decomposition means the process of splitting the original Time Series Data
into multiple components. The splitting can be done based on some behavior of each
component, for which filtering is the most used approach [28]. In this research
work, MA filter based decomposition along with wavelet-based decomposition have
been applied, which are described below.
1.4.1 Decomposition based on Moving average (MA)
When a non-seasonal time series is decomposed using a Moving Average (MA) filter,
it is separated into an averaged or smoothed component known as the trend, and a
noise or residual component known as the detrended component.
Figure 1:15 Time Series decomposition using MA Filter [29]
[Flow chart blocks: Time Series Data; Fix MA Filter of length m; Trend Component;
Subtractor; Residual Component; Decomposition criterion met? (Yes/no);
Decomposition using MA filter.]
This technique is represented in figure 1.15. The trend component $s_t$ and the
noise component $r_t$ are expressed in equations (1.11) and (1.12), respectively.
The length m of the MA filter can be selected to satisfy a certain decomposition
criterion for a specific task [31].
$$s_t = \frac{1}{m}\sum_{i=t-m+1}^{t} y_i \tag{1.11}$$
$$r_t = y_t - s_t \tag{1.12}$$
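A minimal Python sketch of MA-filter decomposition is given below. It assumes a trailing window of length m (the average of the current and the m-1 preceding values), which is one common choice rather than a confirmed detail of this work; the function name and the `load` values are hypothetical.

```python
def ma_decompose(series, m):
    """Split a series into a trailing moving-average trend and the residual."""
    trend, residual = [], []
    for t in range(m - 1, len(series)):
        window = series[t - m + 1:t + 1]     # the m most recent values up to t
        s_t = sum(window) / m                # smoothed (trend) component
        trend.append(s_t)
        residual.append(series[t] - s_t)     # detrended (noise) component
    return trend, residual

load = [4.0, 6.0, 8.0, 6.0, 4.0, 6.0]        # hypothetical values
trend, residual = ma_decompose(load, m=3)
print(trend, residual)
```

The first m-1 points have no full window, so the decomposed components are m-1 values shorter than the input; adding trend and residual recovers the remaining original values.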
1.4.2 Discrete Wavelet Transform (DWT)
DWT is the most commonly used decomposition tool that helps researchers to get
meaningful information in terms of time and frequency about the deep signal. Using
this approach, the issues of localization, which exist in the Fourier transform has
been resolved. It is a scientific tool that transmits the input to a different field for
signal processing and analysis. The data in terms of time and frequency domain is
appropriate for predicting time-unstable autocorrelation function and unstable
processes. The data related to weather or finance are both unstable as their values
changes over time [30]. To predict such kind of data DWT is an appropriate tool and
the wavelet function considered for a single entity is written in equation 1.13.
(1.13)
In Equation 1.13, ‘c’ denotes the scaling factor, which sets the width of the
wavelet, and ‘d’ is the translation parameter, which sets the location of the
wavelet along the time series. When |c| < 1, the wavelet is a compressed version
of ψ(t) and corresponds to higher frequencies (more cycles per unit time). On the
other hand, when |c| > 1, the width of ψ_(c,d)(t) is larger than that of ψ(t),
which corresponds to lower frequencies. DWT is therefore an essential tool for
evaluating a time series from wavelet information sampled at discrete scales and
translations. It relies on sub-band coding and computes quickly, and the technique
is simple to apply. The scaling parameter ‘c’ is expressed as 2^-p and the
translation parameter ‘d’ as l2^-p, where l ∈ I, for the original dataset (Ap) [31].
The DWT function is given by equation (1.14).
(1.14)
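The effect of the scaling and translation parameters can be illustrated numerically. The sketch below assumes the standard wavelet family ψ_(c,d)(t) = |c|^(-1/2) ψ((t - d)/c), which may differ in detail from equations (1.13) and (1.14), and uses the Haar function as the mother wavelet purely for illustration. The function names haar, psi and dyadic are hypothetical.

```python
import math


def haar(t):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def psi(t, c, d, mother=haar):
    """Scaled and translated wavelet (assumed standard form)."""
    return mother((t - d) / c) / math.sqrt(abs(c))


def dyadic(p, l):
    """Dyadic sampling as described in the text: c = 2**-p, d = l * 2**-p."""
    c = 2.0 ** -p
    return c, l * c


# With p = 2 and l = 3 the wavelet is compressed (|c| < 1) onto [0.75, 1.0).
c, d = dyadic(p=2, l=3)
inside_first_half = psi(0.8, c, d)   # positive lobe, amplitude raised by 1/sqrt(c)
inside_second_half = psi(0.9, c, d)  # negative lobe
outside = psi(0.5, c, d)             # outside the support
```

A smaller |c| narrows the support and raises the amplitude, which is why compressed wavelets respond to higher frequencies.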
1.4.3 Trend-ARIMA model
A composite method that applies MA-filter-based decomposition is used to pre-process
the data and remove artifacts so that the ARIMA model can fit the data more easily.
The main steps of this model are given below:
Decomposition: The MA-filter-based decomposition technique is applied to the
dataset. The trend component (st) and the noise component (rt) are obtained, where
yt = st + rt is the original Time Series Data.
ARIMA modeling: After the trend and noise components are obtained, the ARIMA model
is applied to each of them. The model need not have the same parameter values for
the trend component and the noise component.
Predictions: Fitting the ARIMA model to the trend component yields the trend
prediction, denoted st,pre. Fitting it to the noise component yields the noise
prediction, denoted rt,pre. The final prediction is obtained by adding the trend
and noise predictions. This prediction performs better than that of the basic
ARIMA model [32].
$y_{t,\mathrm{pre}} = s_{t,\mathrm{pre}} + r_{t,\mathrm{pre}}$ (1.15)
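The three steps above (decompose, model each component, add the predictions) can be sketched as follows. To keep the example self-contained, a least-squares AR(1) fit is used as a simple stand-in for a full ARIMA model, and a trailing moving average is assumed so that the most recent point has a trend value. All function names and the sample series are hypothetical.

```python
def ma_trend(y, m):
    """Trailing moving average of length m (an assumption of this sketch)."""
    return [sum(y[t - m + 1:t + 1]) / m for t in range(m - 1, len(y))]


def fit_ar1(x):
    """Least-squares fit of x[t] = a + b * x[t-1]; a stand-in for ARIMA."""
    prev, curr = x[:-1], x[1:]
    n = len(prev)
    mean_prev, mean_curr = sum(prev) / n, sum(curr) / n
    var = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    b = cov / var if var > 0 else 0.0   # a constant series gets slope 0
    return mean_curr - b * mean_prev, b


def predict_next(x):
    a, b = fit_ar1(x)
    return a + b * x[-1]


def trend_ar_forecast(y, m=3):
    s = ma_trend(y, m)                                 # trend component
    r = [y_t - s_t for y_t, s_t in zip(y[m - 1:], s)]  # noise component
    s_pre = predict_next(s)                            # trend prediction
    r_pre = predict_next(r)                            # noise prediction
    return s_pre + r_pre, s_pre, r_pre                 # sum of both predictions


# Synthetic, perfectly linear series: the next value should be 21.
y = [2.0 * t + 1.0 for t in range(10)]
y_pre, s_pre, r_pre = trend_ar_forecast(y, m=3)
```

In practice each component would be fitted with its own ARIMA order, as the text notes; the AR(1) fit here only shows how the two predictions are combined.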
1.4.4 Wavelet-ARIMA model
This model is also a composite prediction model, in which the decomposition
technique is wavelet-based. The decomposition serves as pre-processing, so that the
components fit the ARIMA model better and the forecasts become more accurate. In
place of the moving average (MA) filter, the available wavelet filters are HAAR,
db1, db2, db3, db4 and db5. The data is then decomposed into approximation and
detail components. More than one level of decomposition is available, as shown in
Figure 1.16; at the initial level, the Time Series Data is filtered by applying any
one of the wavelet filters listed above [33].
Figure 1:16 Wavelet decomposition [33]
As shown in Figure 1.16, the first-level low- and high-frequency components are
represented by ya1 and yd1, respectively. The low-frequency component ya1 is again
split into low- and high-frequency components, represented by ya2 and yd2,
respectively. After these two levels of decomposition, the final parts are ya2, yd1
and yd2, where ya2 is the final approximation component and yd1 and yd2 are the
detail components. Further levels of decomposition can be applied depending on the
type of Time Series Data [34]. The essential steps of the wavelet-based ARIMA model
are discussed below:
Decomposition: Using wavelet-based decomposition, the dataset is decomposed into
the required number of levels, depending on the input Time Series Data.
ARIMA modeling: ARIMA modeling is applied to every decomposed output, i.e. ya2, yd1
and yd2, so that the best-fit model is obtained for each.
Prediction: The prediction of the original data is calculated by adding the
predictions of all the decomposed components, as given in the following equation,
and the performance improves relative to the basic ARIMA model.
$y_{t,\mathrm{pre}} = y_{a2,\mathrm{pre}} + y_{d1,\mathrm{pre}} + y_{d2,\mathrm{pre}}$ (1.16)
This form of decomposition extends the trend-based ARIMA model: the trend
information collected during decomposition represents the low-frequency portion of
the dataset, while the noise corresponds to the high-frequency components. The trend
and residual data are comparable to the first level of decomposition, and further
decomposition may be applied to improve the accuracy of the method [35].
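The two-level decomposition into ya2, yd2 and yd1 can be illustrated with the Haar filter, the simplest of the wavelet filters listed above. The sketch below is a hand-written illustration under that assumption, not the implementation used in this work. It also checks that the three components, once mapped back to the time domain, add up to the original series. Function names and sample values are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(x):
    """One Haar level: scaled pairwise sums (approximation) and differences (detail)."""
    if len(x) % 2:
        raise ValueError("length must be even")
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert one Haar level."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x


def component(ya2, yd2, yd1, keep):
    """Time-domain signal of one component, with the other coefficients zeroed."""
    zero2, zero1 = [0.0] * len(ya2), [0.0] * len(yd1)
    a2 = ya2 if keep == "ya2" else zero2
    d2 = yd2 if keep == "yd2" else zero2
    d1 = yd1 if keep == "yd1" else zero1
    return haar_inverse(haar_inverse(a2, d2), d1)


# Made-up sample values; the length must be divisible by 4 for two levels.
y = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
ya1, yd1 = haar_step(y)    # level 1: low and high frequency
ya2, yd2 = haar_step(ya1)  # level 2: split the low-frequency part again

parts = [component(ya2, yd2, yd1, k) for k in ("ya2", "yd2", "yd1")]
rebuilt = [sum(values) for values in zip(*parts)]
```

Because the transform is linear, forecasts made separately on the three component signals can be added to give a forecast of the original series.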
1.5 CLASSIFICATION TECHNIQUES
Vast amounts of data are being collected and stored in databases across the globe.
Classification is the technique used for finding the classes of unknown data.
Classification algorithms follow three different learning approaches, namely the
supervised, unsupervised and semi-supervised classification approaches.
1.5.1 Supervised classification approach
Supervised learning is the machine learning task of inferring a function from
labeled training data, which consist of a set of training examples. In this method
of learning, every example is a pair of an input object and the desired output
value. The algorithm analyzes the training data and generates an inferred function
that can be used to map new examples. The main aim of supervised machine learning
is to construct a model that makes predictions based on evidence in the presence of
uncertainty. Computers learn from observations, as adaptive algorithms identify
patterns in the data [36]. As more observations are provided, the predictive
performance improves.
Figure 1:17 Supervised learning [36]
[Figure 1.17 panels: (a) Known data and Known responses, Model; (b) New data, Model,
Predicted Responses]
In this type of learning mechanism, the algorithm works from examples whose class
labels are already known. The desired output is supplied to the system at the moment
the data is applied. Each time the system makes a prediction, the outcome is compared
with the known case. The system compares its forecasts with the known outcomes and
learns from its errors, and the weights are adjusted so that the gap between the
desired output and the actual output is reduced. For regression tasks, the values
used can be categorical or numerical.
Figure 1:18 Supervised learning process [37]
In this learning strategy, a mapping is learned from input to output, and the
correct output values, i.e., the known labels, are provided by a supervisor [36-37].
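The error-driven weight adjustment described above can be sketched with the least-mean-squares (delta) rule on a single linear unit. This is a generic textbook illustration, not the network used in this work. The function name train_lms, the learning rate and the sample pairs are assumptions.

```python
def train_lms(samples, lr=0.05, epochs=500):
    """Fit actual = w * x + b by correcting the error on each labeled example."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, desired in samples:
            actual = w * x + b
            error = desired - actual  # error signal: desired minus actual output
            w += lr * error * x       # move the weights to shrink the error
            b += lr * error
    return w, b


# Labeled examples generated from y = 2x + 1 (synthetic data).
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_lms(samples)
prediction = w * 4.0 + b  # response for a new, unseen input
```

After training, the weights approach the values that generated the labels, so the unit can predict the response for new data.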
1.5.2 Unsupervised learning
In this process the desired output is unknown; therefore, unlike in the supervised
learning mechanism, an error signal cannot be used to improve the behavior of the
network. The model learns from observations of the actual inputs alone, as nothing
is known about the correctness or incorrectness of the response.
Figure 1:19 Unsupervised learning [38]
[Figure 1.19 labels: X (Input), Neural Network, Y (Actual output)]
[Figure 1.18 labels: Input, Training data, Learning system, Actual Output, Desired
Output, Σ (+ / -), Error signal]
As shown in Figure 1.19, no feedback is used in this learning process. The neurons
themselves extract patterns and features from the input data and find the
relationships among the data points. Learning is carried out on unlabeled instances
of data. Here no supervisor is provided; only input data is given. The main aim of
this learning scheme is to find the regularities among the inputs and determine how
the data is organized. One example of unsupervised learning is density estimation.
A primary method of density estimation is clustering, which is used to compute the
clusters or groupings of inputs [38].
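Clustering without labels can be illustrated with a basic k-means loop on scalar values. This is a generic sketch, not a method used in this work. The function name kmeans_1d, the starting centers and the sample values are assumptions.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on scalars: assign each point to its nearest center,
    then move every center to the mean of its group."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            groups[nearest].append(p)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Made-up values forming two visible groups; no labels are supplied.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, groups = kmeans_1d(values, centers=[0.0, 6.0])
```

The algorithm receives only the inputs, yet it recovers the two groupings from the regularities in the data.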
1.5.3 Semi-supervised classification approach
The semi-supervised learning approach uses both labeled (annotated) and unlabeled
data, unlike supervised learning (in which all data is labeled) and unsupervised
learning (in which all data is unlabeled). The algorithm is given a small amount of
labeled data and a considerable amount of unlabeled data. In supervised machine
learning, obtaining labeled data is a challenging task because the method uses only
labeled data (feature and label pairs). Obtaining labeled data is time-consuming
and expensive, since it requires experienced human annotators. Unlabeled data, in
contrast, is easy to obtain, but there are only a few ways to use it. The
semi-supervised learning mechanism addresses this problem. Combining a considerable
amount of unlabeled data with labeled data can substantially improve learning
accuracy and yield better classifiers. These learning mechanisms are widely useful
because they offer high accuracy with reduced human labor, and they come in two
forms: inductive and transductive. In transductive learning, the algorithm works
only on the labeled and unlabeled training data and cannot handle unseen data
[39-41].
1.6 PROBLEM STATEMENT
Punjab has been a state of capitalization and growth over the last decade.
Considerable technical growth, capitalization and industrialization have taken
place. With increasing technical and commercial growth, electricity consumption has
increased dramatically. The Punjab government has established a unit, termed
“Punjab State Power Corporation Limited” (PSPCL), to assess power demand and to
monitor consumption in Punjab. PSPCL has many cities, and the tehsils under them,
to monitor. Some cities consume more electricity, which makes it difficult to manage
the supply and to supply electricity to villages at the same time. Nowadays, the
proper utilization and operation of electricity consumption has become a major
research challenge. The economics of electricity production have changed because of
new electricity market strategies. The planning, operational strategy and
management of interconnected load systems pose a number of challenging issues. The
electric power industry is changing rapidly, and determining an operating strategy
that satisfies electricity consumption is the most essential concern. Satisfying
consumer demand is a great challenge. Much research has previously been done on
forecasting future electricity consumption. In this research work, the ARIMA model
is used to predict future electricity consumption. The decomposition mechanism,
i.e., DWT, is applied to determine the minimum and maximum utilization of
electricity from the previous years' electricity data. Cuckoo Search (CS)
optimization along with a Neural Network is used to obtain optimized data and
classify it. To analyze the performance of the proposed work, the MAP, MAPE and
accuracy parameters are computed, and the existing algorithms, i.e., ARIMA and ARIMA
with DWT, are compared with the proposed hybrid technique using ARIMA, DWT, CS and
ANN.
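For reference, the mean absolute percentage error (MAPE) mentioned above is commonly computed as shown below. The sketch follows the usual textbook definition, which is assumed here to match the one used in this work. The function name mape and the sample values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (usual textbook definition).

    Every actual value must be non-zero, since it appears in the denominator.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up loads: relative errors of 10%, 5% and 0% average to about 5%.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
error = mape(actual, predicted)
```

A lower MAPE indicates a closer fit, which is how the competing models are ranked.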
1.7 RESEARCH OBJECTIVES
The broad objective of this research is to develop a novel hybrid forecasting model
or algorithm to forecast the electricity consumption of Punjab, India with high
accuracy. More than 50 algorithms exist in the Swarm Intelligence category.
Developing a new algorithm under this architecture is very complicated; hence, in
most standard papers, a new behavior is developed or evaluated under the given
category. In the present study, a new fitness function is designed and developed
for the Cuckoo Search algorithm, and the Feed Forward Back Propagation algorithm is
combined with DWT and Cuckoo Search, a combination that has not been observed
before. In addition, the way in which the proposed architecture is applied differs
from what has been seen in previous models. The specific objectives of this research
are as follows:
1. To model the time series data and forecast future values using ARIMA.
2. To determine how the discrete wavelet transform resolves the difficulties in
ARIMA modeling.
3. To develop a hybrid model of ARIMA and wavelet transform.
4. To analyze the performance and accuracy of the proposed algorithm.
1.8 RESEARCH GAPS
1. The accuracy of an electric load forecasting model can be improved by
hybridizing a neural network with the wavelet transform [116].
2. The accuracy of an electric load forecasting model can be improved by adding
more than one factor, such as temperature and humidity [117].
3. For forecasting electricity consumption, parallel random forests can be used
instead of simple random forests [118].
4. Weekly clustering can be performed to improve the forecasting accuracy of the
model [119].
5. The performance of Multi-Linear Regression (MLR) for forecasting electricity
consumption can be improved by incorporating high spatial resolution [120].
6. The accuracy of an electric load forecasting model can be increased by
developing homogeneous ensemble models based on support vector regression and
evaluating them with different [121].
7. To improve the accuracy of the ARIMA model, it can be combined with the wavelet
transform [122].
CHAPTER 2: PRESENT STATE OF ART
Electric power plays a critical role in economic and social development and in the
improvement of the community, and thus helps to improve people's living standards.
Energy consumption is rising in all areas of the world. At present, urban areas
account for 67% of the world's overall energy consumption. Given urban complexity
and future requirements, accurate information based on the existing patterns of
power consumption is needed. This chapter reviews the relevant work on electricity
consumption and load prediction strategies [42].
2.1 HISTORY OF ELECTRICITY SUPPLY
The first power station was built by Thomas Edison in New York City and began
operating in 1882. In India, the first hydroelectric installation was set up near a
tea estate at Sidrapong in Darjeeling Municipality in 1897. Nowadays, the electrical
utility company Power Grid Corporation of India Limited (PGCIL) is responsible for
transmitting almost half of the total energy produced, over its transmission
network at different voltage levels. For operating and planning purposes, the
Indian power system is divided into five regional power grids, i.e., south (S),
north (N), east (E), west (W), and northeast [43].
In 2006, the regional power grids other than the southern grid were interconnected.
At the end of 2013, the southern region was interconnected as well, forming a
central power grid in synchronous mode and realizing the concept of one country, one
power grid, one frequency. The Central Electricity Authority (CEA) of India is a
national statutory body that advises the Government of India on issues relating to
the national electricity policy and formulates short-term and long-term plans,
including for renewable energy, for the development of the power system [44].
2.2 PUNJAB STATE POWER CORPORATION LIMITED
Punjab State Power Corporation Limited (PSPCL) is the electricity generating and
distributing company of the Government of Punjab in India. PSPCL was incorporated
as a company in 2010 and was given responsibility for operating and maintaining the
State's own generating projects and distribution system. The power generation
business of the erstwhile PSEB was transferred to PSPCL. PSPCL has been developing
models for electricity prediction since its inception, and various factors affect
the consumption of electricity [45]. Numerous studies highlight electricity
consumption models and the needs of electricity consumption. A few of these studies
are reviewed below:
2.3 OVERVIEW ON LOAD FORECASTING
Numerous researchers have been working to develop accurate prediction models for
consumption. Some of them are discussed below:
Willis, H., & Aanstoos, J. (1979) introduced a variety of forecasting methods that
emerged from continuous development and improvement, and whose prediction accuracy
improved accordingly. Spatial load forecasting received attention during the 1970s
and many methods were suggested, but the value of land-use types for improving
predictability was not considered. The method has since been improved with elements
of fuzzy logic; various strategies are available to forecast the consumption of
electricity, such as fuzzy multi-objective decision making, cloud theory, and, at
present, mathematical approaches [46].
Fan, S., & Hyndman, R. J. (2010) presented a comprehensive review of the state of
the art in the prediction of electricity consumption, and unique perspectives were
identified. Statistical analysis is concerned with estimating system demand and
peak demand on an hourly, daily, weekly and annual basis. Electricity production is
subject to several uncertainties, including weather conditions (temperature,
rainfall, humidity, and so on), population growth, technology, economic conditions,
and irregularities in individual usage [47].
A variety of transmission demand modeling approaches are available, from
regression-based load forecasting to land-use-based load forecasting, incorporating
good practices such as INSITE (a long-term spatial load forecasting tool) and the
advanced distribution load forecasting scheme Load SEER (Load Spatial Electric
Expansion and Risk), which prepare and document consumption forecasts and support
managerial, public and regulatory analysis [48].
2.3.1 Type of Load Forecasting Technique
Topalli, A. K., & Erkmen, I. (2003) proposed hybrid learning neural networks for
short-term load forecasting (STLF), whose horizon ranges from an hour to a week.
The work predicts the total electricity consumption of Turkey for the next day. A
network initialized with random weights produced predictions that were not
acceptable for real-time operation. Therefore, the available past data is used
first, real load data collected by the Turkish Electricity Authority is then used
online, and the model is designed for online forecasting.
Additionally, a method for clustering the input data based on hourly electrical
consumption is presented. Several alternative models were built to understand the
model's performance. With clustering, the proposed model reduced the average error
to 9.5%, and the error in the hybrid case was 2.4%. Without clustering, using
off-line learning on the same datasets, the average error was 10.6%, so the proposed
online forecasting gives better results than off-line learning without clustering
[49].
Ghiassi et al. (2006) presented a dynamic artificial neural network model (DAN2)
for medium-term electrical load forecasting (MTLF). Accurate MTLF gives utilities
the information needed to plan power generation expansion (or purchase), schedule
maintenance operations, carry out system improvements, negotiate future contracts,
and build cost-effective fuel procurement strategies. They introduced an annual
model that predicts future electrical load from previous monthly system loads. The
authors also showed that including weather data improves the accuracy of load
forecasting. Nevertheless, such models need precise weather forecasts, which are
often difficult to obtain. The models were tested on the Taiwan Power Company's
actual system load data. All annual and seasonal models yield mean absolute percent
error (MAPE) values below 1%, demonstrating the effectiveness of DAN2 in medium-term
load forecasting. The results were compared with those of multiple linear regression
(MLR), ARIMA and a conventional neural network model [50].
Carpinteiro et al. (2007) addressed long-term load forecasting. Load forecasting is
defined as the prediction of future load behavior. According to the time horizon, it
may be categorized as short-, medium- or long-term. Short-term predictions
typically range from an hour to a week, medium-term predictions from a week to a
year, and long-term forecasts exceed a year. Electricity demand forecasting plays a
significant role in the short term as well as in long-term planning of future
electricity generation. Long-term predictions can be used for system planning,
scheduling new generation capacity, and the purchase of generating units [51].
Amjady, N., & Keynia, F. (2008) proposed a Mid-Term Load Forecasting (MTLF) model.
Among the several kinds of MTLF, this work focuses on predicting the daily peak
electricity load for a month ahead. This kind of load forecasting has several uses,
such as operational scheduling, medium-term hydrothermal coordination, adequacy
assessment, management of limited energy units, pre-contracting, and development of
efficient fuel procurement strategies. The daily peak load is a nonlinear, volatile
and non-stationary signal. The problem is usually further complicated by the lack of
sufficient data. To overcome this issue, a new scheme composed of a data analysis
structure, an ANN-based prediction mechanism and an evolutionary optimization
approach was used. The scheme was compared with other MTLF methods, showing its
ability to handle the load forecasting problem [52].
Soares, L. J., & Medeiros, M. C. (2008) deployed an improved Seasonal Integrated
ARIMA (SARIMA) model for short-term (hourly) load prediction in the area of
southeast Brazil covered by an electric utility. A separate model is built for each
hour of the day, based on a decomposition of the daily series of each hour into two
components. The first component is purely deterministic and relates to trends,
seasonality, and the effect of special days. The second component is stochastic and
follows a linear autoregressive (AR) model; in this second step, non-linear
alternatives are also considered. The multi-step forecasting performance of the
proposed approach is compared with that of existing methods, and the outcomes show
that the approach is useful for forecasting electricity load in tropical conditions
[53].
Pedregal, D. J., & Trapero, J. R. (2010) presented a general approach to forecasting
load optimally at a mid-term horizon with hourly data. The scheme extends a
previously defined short-term method, based on unobserved components, for
forecasting load and prices. The approach estimates different models at different
sampling rates, mainly monthly and hourly, for the same data. Each model captures
the data characteristics appropriate to its sampling interval, and both types of
predictions are combined into a single forecast through effective time aggregation
techniques that are implemented naturally within the modeling framework [54].
Darbellay, G. A., & Slama, M. (2000) compared different models for short-term
forecasting, a problem faced by electricity suppliers. They examined whether the
problem is non-linear and whether non-linear schemes are therefore needed. First,
they introduced a non-linear measure of statistical dependence. Next, the authors
examined the linear and non-linear autocorrelation functions of Czech electricity
consumption. They then compared the forecasting accuracy of a non-linear model,
namely an Artificial Neural Network (ANN), with that of a linear model, i.e., ARMA.
The comparison showed that short-term forecasting of the Czech electric load can be
treated as a linear problem [55].
El-Telbany, M., & El-Karmi, F. (2008) presented forecasting results for a
three-layer feed-forward neural network that predicts the daily consumption of
electricity from several factors: past data, time influences, and temperature data.
The neural network (NN) was trained using particle swarm optimization (PSO) and
backpropagation (BP). PSO is a novel optimization strategy based on a
social-psychological model. The results of the PSO-trained network were compared
with those of the backpropagation neural network (BPNN) and the autoregressive
moving average (ARMA) model. On comparable test results, the PSO algorithm performed
better than the BP algorithm. Unlike BP, which is gradient-based, particle swarm
optimization is a population-based algorithm that iteratively evaluates an objective
function to approach an optimal outcome. Because of its usefulness in searching
large spaces and its ability to carry out a global search for the best forecasting
model, it is a successful method [56].
Kandil et al. (2006) proposed an approach to short-term load forecasting based on
Artificial Neural Networks (ANNs). They examined the ability of the model to predict
electricity consumption without load history as an input variable. Of the various
weather variables used in earlier work, only temperature is considered here. They
also found that omitting other variables, such as sky condition and wind velocity,
has no negative impact. The variables most commonly used are hour and day
indicators, weather-related inputs, and previous consumption. Weekly data covering
one month were used for training and testing. The network must be trained on these
data before it can generalize. The generalized delta rule (GDR), also known as the
error backpropagation algorithm, was used to train the layered ANN. The improved
outcomes are attributed mainly to the use of an advanced type of ANN, a better ANN
architecture, better-selected input variables, and the selected training set [57].
Xiao et al. (2009) introduced a rough set backpropagation (RSBP) neural network
(NN) for short-term load forecasting (STLF) with many non-linear factors, to improve
predictive accuracy. STLF plays an important role in managing electricity
consumption in any state where the supply is insufficient for growing requirements.
Attribute reduction based on variable precision with rough sets avoids the effect of
noisy and weakly interdependent data on BP, thereby reducing the time required for
learning. They analyzed RSBP's efficiency by comparing its forecasts with those of
the BP network on load time series from a practical power system [58].
Catalão et al. (2007) presented a neural network approach to predicting short-term
(ST) electricity prices. In the past, electricity supply was regarded as a public
utility, and any price forecasting tended to be long-term, concerning possible fuel
prices and technological upgrades. These days, with the increasing consumption of
electricity, short-term forecasts have become increasingly important. In the current
environment, producers and investors rely on short-term forecasts to plan their
procurement strategies in the electricity market. Accurate forecasting tools are
required for producers to increase their income, hedge against possible losses from
future volatility, and optimize their services. A three-layer feed-forward neural
network trained with the Levenberg-Marquardt algorithm was used to predict
next-week electricity prices. The authors examined the forecasting performance of
the proposed neural network (NN) method [59].
Goude et al. (2013) presented a semi-parametric approach based on generalized
additive models to model electricity consumption at more than 2200 French
distribution substations, over both the short and the medium term. These models
estimate the relationship between the load and explanatory variables such as
temperature, calendar variables, and so on. The technique was applied to the French
grid and gave improved results. The authors illustrated that the estimated functions
describing the relationships between demand and its drivers are interpretable, and
that temperature forecasts are essential. The strength of this scheme is that the
different consumption series (approximately 2000) on the French grid can be analyzed
simultaneously, without any human intervention [60].
Minaye, E., & Matewose, M. (2013) presented a practical approach that can be used
as a reference for building models of the Jimma City electric power load. Trend-based
statistical methods, namely linear regression, the compound growth model and
quadratic regression, were used to study load characteristics and predictive
accuracy. Monthly and annual demand data from the transformers of the Jimma
distribution system were analyzed. Based on the optimized regression coefficients
and the mean absolute percentage error, the compound growth method was used to
predict demand for the upcoming five years. Performance was analyzed using two
parameters: the rank correlation coefficient and the mean absolute percentage error
(MAPE) [61].
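A compound growth model of the kind named above can be fitted by regressing the logarithm of demand on time. The sketch below is a generic illustration, not the procedure of the cited study: it assumes the form y = y0 (1 + g)^t, and the function names and synthetic demand values are hypothetical.

```python
import math


def fit_line(t, y):
    """Ordinary least-squares line y = a + b * t."""
    n = len(t)
    mean_t, mean_y = sum(t) / n, sum(y) / n
    slope = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
             / sum((ti - mean_t) ** 2 for ti in t))
    return mean_y - slope * mean_t, slope


def fit_compound(t, y):
    """Fit y = y0 * (1 + g)**t by a linear fit on log(y); y must be positive."""
    a, b = fit_line(t, [math.log(v) for v in y])
    return math.exp(a), math.exp(b) - 1.0


# Synthetic demand growing at 8% per period, for illustration only.
years = [0, 1, 2, 3, 4]
demand = [100.0 * 1.08 ** k for k in years]
y0, g = fit_compound(years, demand)
forecast = y0 * (1.0 + g) ** 5  # one period beyond the data
```

The linear and quadratic trend models can be fitted in the same way with fit_line or a polynomial fit, and the candidates compared by their MAPE.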
2.4 CLASSIFICATION OF LOAD FORECASTING
Willis, H. L., & Romero, J. (2007) described how spatial load forecasting approaches
have evolved through the years, as electricity delivery utilities incorporated
statistical tools and strategies from other fields of expertise into their
development efforts. The methods are grouped into three categories that reflect the
differences in the databases used by the different methodologies. The three
categories are non-analytic, trending, and simulation, as shown in Figure 2.1.
Non-analytic forecasts are made without considering historical data.
[Tree diagram, labels by level. Root: Spatial Load Forecasting Methods. Level 1:
Non-Analytic, Analytic. Level 2: Manual, Computerised, Trending, Multivariate.
Level 3: Single Area, Multiple Area, Landuse, Extrapolation. Level 4: Regression,
Decomposition, Clustered, Load Transfer, Vacant Area]
Figure 2:1 Spatial electric load forecasting methods
The other two approaches use historical data to predict future outcomes. The second
approach (trending) analyzes historical data and uses it to produce predictions
[62].
2.4.1 Trending methods
Trending methods forecast all sub-areas through an extrapolation scheme, predicting
the peak load in every sub-area from the available data. In general, the input data
for these approaches comprise each small area's historical demand, and different
methods are used to approximate the loads in vacant areas. The database may also
include weather information for generating the electricity consumption curve. These
strategies have appealed to electrical distributors because they need only limited
databases to forecast transmission feeders; however, because they do not capture
inter-area relationships, these techniques provide little information for utility
substation expansion planning [63].
Salvó, G., & Piacquadio, M. N. (2017) presented a novel approach for estimating the
spatial growth in demand for electricity. The new approach adds value to, but does
not replace, the traditional tools used by utilities. As a result of the approach,
the consumer demand analysis was split into two multifractals, one of which is
surrounded by a suburban area of lower demand, with a border between them. This
demonstrates that the seemingly random distribution of demand has an internal
structure that can only be seen through multifractal analysis. The results show
properties such as stability and constant frontier dimensionality, and the approach
provides natural measures in terms of multifractal dimensions. The method could also
be strengthened from a geographical and demographic point of view by evaluating
urban models and land-use models adjusted for the area served by the utility [64].
2.5 LOAD FORECASTING METHODS
Various forecasting methods for the prediction of power consumption have been
developed and discussed in the literature to varying degrees, including multiple
linear regression, non-linear regression, and multivariate regression models. The
approaches used for medium- and long-term forecasting are (a) end-use models,
(b) econometric models, and (c) statistical model-based learning. The models used
for short-term forecasting are (a) the similar-day approach, (b) regression methods,
and (c) time-series-based methods. Artificial neural network schemes such as the
backpropagation neural network (BPNN), the particle swarm optimization (PSO)
dynamic artificial neural network, the Elman artificial neural network, and the
Jordan recurrent neural network have also been applied. Simple autoregressive (AR),
autoregressive moving average (ARMA), and autoregressive integrated moving
average (ARIMA) models have been presented earlier. The present work focuses on
the methods best suited to forecasting load consumption, i.e., ARIMA and ANN.
Because seasonal elements are present in the time series data, an ARIMA model has
been utilized for prediction [65].
Al-Hamadi, H. M. (2011) presented long-term load forecasting using fuzzy logic as
a classification approach. A fuzzy linear regression model was developed using the
factors that affect load, i.e., loads from previous years, population, and annual
growth factors. In this fuzzy approach, a linear optimization problem is formulated
to minimize the spread of the fuzzy regression parameters. Annual growth was
calculated using cubic polynomials for each of the long-term forecast factors. The
results showed that the absolute error of the projected average daily load did not
exceed 3.68% of the actual load during the whole year. According to these results,
the proposed model and forecasting technique offer a significant advantage over
existing models in reducing the average absolute error between actual and
predicted loads over a given period of time [66].
AlRashidi, M. R., & El-Naggar, K. M. (2010) introduced a novel approach to annual
peak load prediction in electrical power systems. Particle swarm optimization (PSO)
is proposed for long-term forecasting in order to reduce the error associated with
the estimated model parameters. The work uses recorded data from the Kuwaiti and
Egyptian networks and covers three tasks: model parameter estimation for (a) the
Egyptian system and (b) the Kuwaiti network, and (c) maximum load demand
forecasting for the Kuwaiti network. The average error of the resulting model was
analyzed, and the model was used to predict the load. Predictions obtained with
PSO were compared with those obtained with the least error squares (LES)
technique, and PSO predicted the data more accurately than the LES approach [67].
Al-Saba, T., & El-Amin, I. (1999) presented an Artificial Neural Network (ANN)
with a multilayer perceptron for long-term forecasting of energy requirements in an
electric utility. Time series models, namely AR, ARMA, and ARIMA, were compared
with the performance of the ANN. In this work the ANN has a single neuron in the
output layer in all cases, while the number of neurons in the input layer varies
with the neural network (NN) model. The ANN is trained with the backpropagation
(BP) algorithm, which has features that make the ANN implementation reliable; in
particular, BP training is designed to reduce the mean squared error (MSE) between
the required output and the output produced during training [68].
Chatfield, C. (2001) observed that forecasting the events of a nonlinear system
opens up the opportunity for constructive preparation and for effectively
accomplishing the business objectives set. However, modelling and predicting the
future events of nonlinear processes is difficult because of noise and
non-stationarity. Time-series data, a collection of observations of a numerical
attribute recorded at regular intervals, are therefore used. Even for a domain
analyst, it is often infeasible to understand the historical structure well enough
to make a proper decision; for example, it is difficult for a stock market expert
to recognize the rise and fall of a stock price reliably and correctly. Time-series
or historical data are therefore used to build a model of the system, and how best
to evaluate non-linear systems is still being researched [69].
2.5.1 Literature review of studies using ARIMA or its Hybrid models for
forecasting
The ARIMA model has become very popular because of its ability to deal with
several kinds of data, both stationary and non-stationary. However, the ARIMA
model presumes a linear relationship between historical and predicted data; it
works well for linear time-series data but does not give good results for
non-linear data. A single model is therefore not sufficient to produce better
results in time series forecasting, and for this reason a hybrid of ARIMA and ANN
is applied here [70].
Dong et al. (2016) proposed a hybrid approach to forecasting future residential
energy consumption that works in two steps. First, non-AC electricity consumption
is predicted from past data; since internal heat gain is highly correlated with
non-AC electricity consumption, heat convection and conduction can be calculated
directly from the non-AC prediction. Second, the weather forecast is fed into the
thermal network differential-algebraic equations (DAEs) together with the internal
heat gain forecast to simulate the zone temperature. The simulated temperature
change is then passed to an AC regression model with a set-point schedule, which
predicts the AC cooling power consumption. Total electricity consumption is
obtained by summing the AC and non-AC forecasts. This work was verified using one
month of data from four residential buildings. Five other approaches (ANN, SVM,
LSSVM, GPM, and GMM) were compared on the same inputs, and the hybrid approach
performed well [71].
David et al. (2016) evaluated a combination of two linear models widely used in
econometrics, ARMA and GARCH (Generalized Auto-Regressive Conditional
Heteroskedasticity), for providing probabilistic solar irradiance forecasts. A
recursive estimation of the model parameters was developed to give a structure that
can be implemented easily in an operational context. Like other models based on
machine learning techniques, the proposed model can reliably produce point
predictions using only solar radiation records. In addition, the recursive
ARMA-GARCH model is more convenient to build and provides additional information
about the uncertainty of the forecasts. The ARMA-GARCH combination is effective
for preparing very short-term solar radiation forecasts with confidence intervals,
and the recursive ARMA model provides a simple and practical method for point
prediction. Using only the measured values of solar radiation, the approach was
reported to be superior to the other statistical models considered in that
study [72].
Alsharif et al. (2019) proposed a seasonal ARIMA model to forecast daily and
monthly solar radiation in Seoul, South Korea. The data, covering the 37 years
from 1981 to 2017, were obtained from the Korean Meteorological Administration.
The performance of the model was tested using the autocorrelation and partial
autocorrelation functions of the residuals; the root mean square error was then
computed, and the results were compared with Monte Carlo simulations. An
ARIMA (1,1,2) model represents the daily solar radiation. For the monthly solar
radiation, an ARIMA (4,1,1) model with a lag of 12 in both the AR and MA parts was
used, giving monthly values of 176 to 377 Wh/m² [73].
Al-Musaylh et al. (2018) focused on data-driven techniques for short-term
forecasting of electricity demand (G) data at horizons of 0.5, 1, and 24 hours.
The proposed approach is based on three models: Multivariate Adaptive Regression
Splines (MARS), the Support Vector Machine (SVM), and ARIMA. The work focuses on
Queensland (Australia), where end-user load demand is increasing. For the short
horizons (0.5 and 1 hour), the MARS model performed better than the SVM and ARIMA
models, with the largest Willmott's Index (WI) and the lowest MAE. Accuracy was
measured using RMSE, MAE, and relative RMSE [74].
2.5.2 ANN and ARIMA
Prediction of time series data is an active research area because it plays a vital
role in forecasting and decision-making in many application areas. In this work,
the task is to improve forecasting accuracy. Two algorithms, ARIMA and ANN, have
long been applied extensively to the prediction of time series data [75].
Bedi, J., & Toshniwal, D. (2019) proposed a deep learning-based forecasting model
for the prediction of electricity demand that captures long-term historical
dependencies. Initially, cluster analysis is carried out on the monthly electricity
consumption data to produce segments based on season. Subsequently, analysis of
the load pattern offers a more in-depth insight into the data that fall into each
cluster. Additionally, the performance of the proposed approach was compared with
a Recurrent Neural Network, SVM, and ANN regression models. It was concluded that
the proposed approach outperforms the SVM, ANN, and Recurrent Neural Network
regression models and can be used to predict the demand for electricity
effectively. The presented model is scalable and supports incremental learning:
the moving-window-based MIMO approach integrates historical data with new
real-time observations to predict electricity consumption accurately [76].
Khashei, M., & Bijari, M. (2010) presented a new hybrid ANN approach that uses an
ARIMA model to achieve higher accuracy than an ANN alone. The scheme builds on
Box-Jenkins linear modeling; the time series is regarded as a non-linear function
of several previous observations and random errors. In the first stage, ARIMA is
used to pre-process the data and generate the required inputs; an ANN is then used
to capture the underlying data-generating process and to forecast the future from
the pre-processed data. The proposed model produced more accurate results than
Zhang's hybrid model and than the ARIMA and ANN models used separately, over three
forecast horizons (1 month, 6 months, and 12 months) and with both error
measures [77].
Babu, C. N., & Reddy, B. E. (2014) developed a novel hybrid ARIMA-ANN model for
time series prediction. Several hybrid ARIMA-ANN models in the literature apply an
ARIMA model to the time series, treat the error between the actual data and the
ARIMA output as the non-linear component, and model it with an ANN in various
ways. While these models give more accurate predictions than the individual
models, there is room for further improvement if the nature of the given time
series is taken into account before the models are applied. That work explores the
nature of volatility using a moving average filter and then applies an ARIMA and
an ANN model accordingly. The proposed hybrid ARIMA-ANN model was applied to a
simulated dataset and to empirical datasets: sunspot data, electricity price data,
and stock market data. The results on these datasets indicated that the proposed
hybrid method has higher prediction accuracy for both one-step and multi-step
forecasts [78].
Khandelwal et al. (2015) presented the benefits of DWT in improving the accuracy
of time series forecasting. They suggested a novel forecasting technique that uses
DWT to separate a time series dataset into linear and nonlinear components. DWT is
first used to break the in-sample training dataset into two parts: linear
(detailed) and non-linear (approximate). The ARIMA and ANN models are then used to
recognize and predict the separately reconstructed detailed and approximate
components, respectively. In this way, the proposed approach uses the unique
strengths of DWT, ARIMA, and ANN to improve forecasting accuracy. The approach was
evaluated on four real-world time series, and its predictions were compared with
those of ARIMA, ANN, and Zhang's hybrid model. The results indicated that the
proposed approach achieves the best forecasting accuracy on each series [79].
Lee, W. J., & Hong, J. (2015) established a flexible and fluid hybrid time series
model for mid-term load prediction and tested it on real load data from the
metropolitan area of Seoul, South Korea, alongside a standard dynamic model, the
Koyck model, and the ARIMA model. Both the proposed hybrid model and the Koyck
model introduce a quadratic dependence on air temperature. The hybrid model gave
more accurate predictions than the Koyck and ARIMA models. It significantly
reduces the forecasting error and its periodic variance and may be a powerful tool
for mid-term forecasting with measured air temperature data. The main benefits of
this hybrid model are that (a) it removes the need for statistical modelling of
non-weather determinants and (b) it could be expanded into a more complex model by
incorporating quantitative analysis of other independent factors [80].
Rana, M., & Koprinska, I. (2016) presented advanced wavelet neural networks
(AWNN) for very short-term load forecasting (VSTLF), which predicts electricity
consumption from minutes to hours ahead. They investigated wavelet decomposition
together with feature selection and non-linear prediction algorithms in AWNN to
obtain a more accurate VSTLF model. A wavelet decomposition technique is applied
to obtain a set of frequency components that represent the data well, and each
component is predicted separately. AWNN's accuracy was tested using two years of
Australian electricity load data sampled every five minutes and two years of
Spanish data sampled every 60 minutes [81].
Dudek, G. (2016) implemented a univariate model for short-term load forecasting
(STLF) based on linear regression and regular trends in the load time series. The
number of predictors is reduced to one, which simplifies the regression function,
using principal component regression or partial least-squares regression. The
models estimate two quantities, the mean and the variance, by the least-squares
method. This is a significant benefit compared with STLF models based on ARIMA,
exponential smoothing, neural and neuro-fuzzy networks, or SVM, which include
dozens or hundreds of parameters whose estimation involves advanced optimization
methods. Some earlier STLF models, MLP and N-WE, used a similar pattern-based
approach and economic modeling. Although these models are nonlinear, they have
better extrapolation properties for the proposed linear models [82].
Barak, S., & Sadegh, S. S. (2016) estimated annual energy consumption in Iran
using three ARIMA-ANFIS patterns. In the first pattern, the ARIMA model is used
with four input data features along with six Adaptive Neuro-Fuzzy Inference System
(ANFIS) structures. The data are clustered using C-means and then used to train
the system, which is then used for forecasting. In the second pattern, ARIMA's
prediction is used as an input variable for ANFIS prediction in addition to the
four input features; energy is therefore predicted with six different ANFIS
structures using the four described inputs plus the output of ARIMA. In the third
pattern, the second pattern is implemented with the AdaBoost (Adaptive Boosting)
data diversification model, and an innovative ensemble technique is introduced to
cope with data deficiency [83].
2.5.3 Wavelet Decomposition based ARIMA and ANN
Wavelet transformation is a decomposition-based approach that splits data into
low- and high-frequency components. Here, the electricity time series is
decomposed using the WT method. Wavelet transforms have improved the forecasting
accuracy of both ARIMA and ANN in many areas, such as forecasting electricity
prices, stock prices, and short-term load consumption [84].
Sun et al. (2019) presented a seasonal and trend decomposition (STL) approach to
design a model for monthly electricity consumption prediction. The method combines
the STL and ARIMA models into a single model, and the performance of this
integrated model is compared with three models: ARIMA, SARIMA, and a model formed
as the product of three components, (i) trend, (ii) seasonal, and (iii) random
factors. The STL model is first used to decompose the monthly electricity
consumption time series, according to its characteristics, into seasonal, trend,
and random components. The way the characteristics of the three components change
over time is then considered. Finally, a suitable model is selected to estimate
each component, and the component forecasts are recombined into the monthly
electricity consumption forecast [85].
Nury et al. (2017) presented an alternative temperature prediction approach that
combines the wavelet technique with the ARIMA model and the ANN, applied to
maximum and minimum monthly temperature data. Model calibration and validation
performance are analyzed systematically, and relative performance is evaluated on
the predictive ability of out-of-sample forecasts. Predictive ability and
reliability on in-sample and out-of-sample data were evaluated for the two models
using RMSE, the percentage of bias (PBIAS), and the consensus coefficient. The ANN
model was trained with the Levenberg-Marquardt (LM) algorithm in MATLAB, because
this algorithm is efficient, fast, and accurate [86].
Pannakkong, W., & Huynh, V. N. (2017) developed an integrated model of ARIMA and
ANN with DWT. DWT is used to decompose the time series into detail and
approximation components. Each component is then modelled with Zhang's hybrid
scheme, in which ARIMA captures the linear part and the ANN captures the
non-linear part. The component forecasts are combined to give the final forecast.
The proposed model was tested on three well-known time-series datasets: Canadian
lynx, the British pound/US dollar exchange rate, and Wolf's sunspots. The results
indicate that the proposed model is more accurate than the ANN, ARIMA, and Zhang's
hybrid model. The parameters used to measure quality are MSE, MAE, and MAPE [87].
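The three error measures named above recur throughout the studies reviewed in this chapter. As a minimal sketch, assuming the usual textbook definitions (the function names and the sample values are illustrative and are not taken from any of the cited papers), they can be computed as follows:

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Made-up load values, used only to exercise the three functions.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
errors = {"MSE": mse(actual, predicted),
          "MAE": mae(actual, predicted),
          "MAPE": mape(actual, predicted)}
```

MAPE is scale-free, which makes it convenient for comparing series of different magnitude, but it is undefined when an actual value is zero.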
CHAPTER 3: METHODOLOGY
This chapter focuses on the methodology proposed for forecasting the electricity
consumption based on hybridization of ARIMA, DWT, Cuckoo search and ANN.
The chapter will discuss the proposed methodology in detail.
3.1 METHODOLOGY OF THE PROPOSED HYBRID FORECASTING
MODEL
In this research work, the main objective is to estimate future electricity
consumption in Punjab state. The required datasets have been taken from PSPCL,
India. The steps followed to achieve the objectives of the proposed work are as
follows:
Step 1: The present study utilizes original data from Punjab State Power
Corporation Limited (PSPCL).
Step 2: The Discrete Wavelet Transform (DWT), a discretized form of the continuous
wavelet transform (CWT), is applied to break the time series data into an integer
number of data samples. The number of coefficients obtained equals the number of
original data points before decomposition, which supports more accurate outcomes.
Step 3: Further DWT decomposition splits the complete range of electricity
consumption into four distinct categories with different ranges,
i.e., (LL), (LH), (HL), (HH).
These decomposed components help to identify the highest and lowest electricity
consumption in Punjab state in the PSPCL datasets. The decomposition uses a Low
Pass Filter (LPF) and a High Pass Filter (HPF), as explained below:
Low Pass Filter (LPF): extracts the lower-frequency part of the entire range
of frequencies. It attenuates signals with a frequency above the cut-off
frequency and allows lower frequencies to pass through the filter.
High Pass Filter (HPF): attenuates signals with a frequency lower than the
cut-off frequency and allows higher frequencies to pass through the filter.
Step 4: The ARIMA model is applied individually to each decomposed frequency
component, low as well as high, to model its time series. The components fall into
four categories, i.e., LL, LH, HL, and HH, so an ARIMA model is applied to each of
the four. This is depicted in figure 3.1. The ARIMA1 model is applied to the first
category (LL) of electricity consumption, ARIMA2 to the second decomposed
component (LH), ARIMA3 to the third (HL), and ARIMA4 to the last (HH).
Figure 3.1 Proposed work: hybrid ARIMA model
(Flowchart: Original data from PSPCL -> Apply DWT -> LL, LH, HL, HH -> ARIMA1,
ARIMA2, ARIMA3, ARIMA4 -> Apply IDWT -> Optimize data using CS -> Model
forecasting using ANN)
Step 5: After the ARIMA models have been applied individually, the results are
combined; the moving average and autoregressive analysis is performed together
with the integrated (I) element of ARIMA to obtain the time series data.
Step 6: The inverse DWT (IDWT) is then applied to the outputs of the four ARIMA
models to combine them. This builds a well-structured record that is used to train
the proposed model.
Step 7: At the classification phase, the ANN classifier requires distinctive and
accurate data for training. To obtain the best record of the previous months'
electricity consumption, the CS optimization strategy with a novel fitness
function is applied.
Step 8: The more distinctive the data record, the easier and faster the training
of the system becomes. The record obtained through CS optimization is therefore
passed to the classification phase as the training set for the ANN. Finally, the
ANN is trained so that the proposed model can forecast the consumption for the
upcoming month from the previous time series values of electricity consumption.
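The filter-bank decomposition of Steps 2 to 4 can be illustrated with a small sketch. It assumes the Haar wavelet and assumes that the four categories of a one-dimensional series are obtained by applying the low-pass/high-pass pair twice (a wavelet-packet style split); neither choice is stated in the text, and the function names and load values are illustrative only.

```python
import math

def haar_step(x):
    """Low-pass and high-pass filter x, then keep every second output (downsample by 2)."""
    if len(x) % 2:
        raise ValueError("series length must be even")
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]   # LPF: scaled pairwise sums
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]  # HPF: scaled pairwise differences
    return low, high

def four_bands(x):
    """Two filtering stages giving four sub-bands; the labels are an assumption."""
    low, high = haar_step(x)
    ll, lh = haar_step(low)    # split the low band again
    hl, hh = haar_step(high)   # split the high band again
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}

# Made-up consumption values; the length must be a multiple of 4 for two stages.
load = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
bands = four_bands(load)
```

The four sub-bands together hold exactly as many coefficients as the original series, which matches the remark in Step 2, and each sub-band can then be handed to its own ARIMA model.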
3.2 PROGRAMMING LANGUAGE
The choice of implementation platform was between two classes: established and
widely used programming languages such as C++, MATLAB, and R, and newer Java-based
tools such as WEKA and RapidMiner. Although MATLAB is not as fast as C++ and is
not open source like some of the other candidates, it was the best choice. It is
widely used and offers many ready-made functions and readily available code.
Another advantage is its balance between language complexity and ease of use: the
user can still see low-level details, yet does not need to spend much effort
developing frequently used code snippets and data structures. In addition,
parallelization and the compactness of vectorized code help to decrease the
runtime [88].
CHAPTER 4: PROPOSED WORK
This chapter provides an overview of the forecasting algorithms used to develop the
novel hybrid forecasting model and discusses the research activities involved in
achieving each objective. Section 4.1 discusses the forecasting algorithms used to
develop the proposed hybrid model. The sections from Section 4.2 onwards discuss in
detail the research activities involved in achieving objectives 1 to 4.
4.1 ALGORITHMS USED TO DEVELOP THE PROPOSED HYBRID
FORECASTING MODEL
This section briefly discusses the algorithms used to develop the proposed hybrid
forecasting model. Time series forecasting has been a popular area of research for
the past few years. A wide range of time-series forecasting algorithms is
available, and numerous researchers have tried to increase prediction accuracy by
developing new and hybrid algorithms. In the present research, a novel hybrid
forecasting model has been developed using well-known algorithms of different
classes, i.e., Auto Regressive Integrated Moving Average (ARIMA), Artificial
Neural Network (ANN), Discrete Wavelet Transform (DWT), and Cuckoo Search (CS)
[89]. These algorithms are explained in detail in the following sections.
4.1.1 ARIMA model
The ARIMA model is a statistical tool used for prediction problems in time series
forecasting. It is easy and flexible to implement because it needs only past
observations of the variable of interest. It is a linear modeling scheme that
combines three components, AR, I, and MA, which are briefly described below:
Auto Regression (AR): a model that uses the dependent relationship between
an observation and a number of lagged observations.
Integrated (I): differencing of the raw observations (for example,
subtracting the previous observation from the current one) to make the
time series stationary.
Moving Average (MA): a model that uses the dependency between an
observation and the residual errors of a moving average model applied to
lagged observations.
The ARIMA model was introduced by Box and Jenkins (1976). In this work it was
implemented in MATLAB R2012a, which proved useful for preparing the data and
computing the autocorrelation function (ACF) and partial autocorrelation function
(PACF). The three main components of the ARIMA model correspond to its parameters.
The classical notation is ARIMA (p, d, q), in which the parameters take integer
values:
p: the number of lagged observations included in the prediction model (the
lag order).
d: the number of times the raw observations are differenced, also known as
the “degree of differencing”.
q: the size of the moving average window, also known as the “order of the
moving average”.
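The role of the parameter d can be shown with a short sketch of differencing. The function name and the two sample series are illustrative assumptions; the operation itself is the plain subtraction of consecutive observations described for the Integrated (I) component.

```python
def difference(series, d=1):
    """Apply first-order differencing d times; each pass shortens the series by one."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out

linear_trend = [10.0, 12.0, 14.0, 16.0, 18.0]
quadratic_trend = [1.0, 4.0, 9.0, 16.0, 25.0]

print(difference(linear_trend, d=1))     # [2.0, 2.0, 2.0, 2.0]
print(difference(quadratic_trend, d=2))  # [2.0, 2.0, 2.0]
```

A linear trend becomes constant after one difference and a quadratic trend after two, which is why d is chosen as the smallest number of differences that leaves a stationary series.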
The ARIMA model combines the AR and MA models, following Box and Jenkins.
Mathematically, the AR model can be written as equation (4.1).
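$$z_t = K + \vartheta_1 z_{t-1} + \vartheta_2 z_{t-2} + \cdots + \vartheta_m z_{t-m} + n_t \qquad (4.1)$$
Where $\vartheta_1, \vartheta_2, \ldots, \vartheta_m$ are the coefficients of the first, second, and last lagged terms
respectively,
$K$ is a constant, and
$n_t$ is white Gaussian noise.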
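As an illustration of equation (4.1), the following sketch fits the simplest case, m = 1, by ordinary least squares and produces forecasts from it. The function names, the noise-free test series, and the use of least squares as the estimation method are assumptions made for this example; the thesis does not state how the coefficients are estimated.

```python
def fit_ar1(z):
    """Least-squares estimates of K and the lag-1 coefficient in z[t] = K + coef * z[t-1] + n[t]."""
    x, y = z[:-1], z[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    coef = sxy / sxx
    K = mean_y - coef * mean_x
    return K, coef

def forecast_ar1(z, K, coef, steps=1):
    """Iterate the fitted recursion forward from the last observation."""
    out, last = [], z[-1]
    for _ in range(steps):
        last = K + coef * last
        out.append(last)
    return out

# Noise-free series generated by z[t] = 2 + 0.5 * z[t-1], starting from 10.
series = [10.0]
for _ in range(9):
    series.append(2.0 + 0.5 * series[-1])

K, coef = fit_ar1(series)
next_values = forecast_ar1(series, K, coef, steps=3)
```

Because the test series contains no noise, the fit recovers the generating constant and coefficient up to rounding; with real consumption data the estimates would only approximate the underlying relationship.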
The values fitted with equation (4.1) are used to predict the next data point. The
MA model can be presented mathematically by equation (4.2).
(4.2)
The MA model is a regression on lagged error values. An MA filter also helps with
data decomposition; the MA filter of order f is given by equation (4.3).
(4.3)
Where f = 2i + 1. The filter averages the time series around each time t over i
points on either side to obtain the trend. The values obtained are close to the
series with its random variation excluded. M2 is the component that remains once
M1 has been removed from M. In equation (4.3), M2 represents the residual part, M1
signifies the trend part, and M represents the original series [90]. The result
from the MA filter can be combined with an AR output to form the ARIMA model.
Mathematically, it is represented as follows:
(4.4)
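The moving average filter of order f = 2i + 1 described above can be sketched as a centred average. This example assumes that the residual part is the original series minus the trend part and that the i points at each end, where the window is incomplete, are dropped; both are assumptions of the sketch, and the names and values are illustrative.

```python
def moving_average_trend(series, i):
    """Centred moving average of order f = 2*i + 1 (trend part)."""
    f = 2 * i + 1
    return [sum(series[t - i:t + i + 1]) / f for t in range(i, len(series) - i)]

def split_trend_residual(series, i):
    """Return (trend, residual) on the points where the full window fits."""
    trend = moving_average_trend(series, i)
    core = series[i:len(series) - i]
    residual = [m - m1 for m, m1 in zip(core, trend)]  # assumed: residual = original - trend
    return trend, residual

series = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0]  # made-up values
trend, residual = split_trend_residual(series, i=1)
```

With i = 1 the window covers three points, so the seven-point series yields five trend values and five residual values.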
4.1.2 DISCRETE WAVELET TRANSFORM (DWT)
The second algorithm used in the proposed hybrid model is the Discrete Wavelet
Transform (DWT). DWT is a type of wavelet transformation that decomposes a signal
into a set of orthogonal basis functions of different frequencies. The main
feature of DWT is that it is a lossless transformation: the original signal can be
recovered by applying the inverse DWT. The orthogonal basis functions are
localized in space, each covering only a fraction of the total signal length. Each
DWT basis function is a dilated, translated, and scaled version of a common
function known as the mother wavelet [91].
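The lossless property can be checked with a one-level sketch. It assumes the Haar wavelet, which the text does not name, and the function names and signal values are illustrative.

```python
import math

def haar_forward(x):
    """One-level Haar DWT: approximation (low-pass) and detail (high-pass) coefficients."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward: interleave the reconstructed even and odd samples."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x

signal = [2.0, 4.0, 8.0, 6.0, 5.0, 5.0]  # made-up values, even length
approx, detail = haar_forward(signal)
restored = haar_inverse(approx, detail)
```

The restored values agree with the original signal up to floating-point rounding, and the two coefficient lists together have the same length as the signal.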
The discrete wavelet transform is used to decompose any non-stationary signal,
such as an image, audio, or video signal. The transform represents the signal in
terms of small waves (wavelets). DWT separates the signal into two major frequency
sub-bands, namely a high-frequency band and a low-frequency band. The
high-frequency sections carry the information about edges, whereas the
low-frequency section is decomposed further into higher and lower frequencies.
Watermarking is generally done in the high-frequency band [92].
To achieve decomposition at every point in two-dimensional systems, DWT is
applied in both the vertical and horizontal direction, as shown in Figure 4.1.
Figure 4.1 Horizontal wavelet transform, giving sub-bands L and H [93]
As shown in figure 4.1, the signal is decomposed horizontally into two sub-bands,
namely L and H. Vertical decomposition is then applied to these sub-bands, as
shown in figure 4.2.
Figure 4.2 Vertical wavelet transform of the horizontal sub-bands L and H, giving
LL1, LH1, HL1, and HH1 [93]
Four sub-bands, such as LL, HL, LH, and HH exist at the starting level of
decomposition. For each next level of decomposition, the LL sub-band of the
previous level is used as the input. In the second level of decomposition, the LL1
band splits into further four sub-bands named as LL2, HL2, LH2, and HH2.
Figure 4.3 1-, 2- and 3-level discrete wavelet decompositions [93]
(Panels: original signal; 1-level DWT with LL1, LH1, HL1, HH1; 2-level DWT with
LL2, LH2, HL2, HH2, LH1, HL1, HH1; 3-level DWT with LL3, LH3, HL3, HH3, LH2, HL2,
HH2, LH1, HL1, HH1.)
As shown in Figure 4.3, at the third level of decomposition LL2 is split further into four
sub-bands: LL3, HL3, LH3, and HH3. Thus a total of ten sub-bands is obtained.
LL3 is the lowest-frequency sub-band, while LH1, HL1, and HH1 are the
highest-frequency sub-bands. In image processing, the concept of
multi-resolution underlies compression with the wavelet transform, which is used
for de-noising and compressing images. The wavelet transform is an
efficient tool for representing the required data, and it allows
multi-resolution analysis to extract relevant data from a large number of datasets [93].
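The sub-band bookkeeping described above can be checked with a small sketch. It assumes the orthonormal HAAR filter pair and an 8 x 8 test image, neither of which is prescribed by the text at this point; the helper names (`haar_step`, `haar_2d`, `decompose`) are hypothetical, and the LH/HL naming follows one common convention (first letter for the horizontal split, second for the vertical split).

```python
import math

def haar_step(values):
    """One-level orthonormal Haar split of an even-length list into (low, high)."""
    s = math.sqrt(2.0)
    low = [(values[k] + values[k + 1]) / s for k in range(0, len(values), 2)]
    high = [(values[k] - values[k + 1]) / s for k in range(0, len(values), 2)]
    return low, high

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def split_rows(matrix):
    """Apply haar_step to every row; return the low and high halves as matrices."""
    pairs = [haar_step(row) for row in matrix]
    return [low for low, _ in pairs], [high for _, high in pairs]

def haar_2d(image):
    """One 2-D level: horizontal split into L and H, then a vertical split of each."""
    L, H = split_rows(image)
    LL, LH = (transpose(m) for m in split_rows(transpose(L)))
    HL, HH = (transpose(m) for m in split_rows(transpose(H)))
    return LL, LH, HL, HH

def decompose(image, levels):
    """Repeatedly split the LL band; keep the detail bands of every level."""
    bands, current = {}, image
    for level in range(1, levels + 1):
        current, lh, hl, hh = haar_2d(current)
        bands.update({f"LH{level}": lh, f"HL{level}": hl, f"HH{level}": hh})
    bands[f"LL{levels}"] = current
    return bands

# Assumed toy input: an 8 x 8 ramp image.
image = [[float(8 * r + c) for c in range(8)] for r in range(8)]
bands = decompose(image, levels=3)
print(sorted(bands))  # three detail bands per level plus LL3
```

Three levels leave three detail bands per level plus the final LL3 band, which gives the ten sub-bands mentioned in the text.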
4.1.3 CUCKOO SEARCH OPTIMIZATION ALGORITHM
The Cuckoo Search (CS) algorithm is the third algorithm used in the proposed hybrid
algorithm. It is a nature-inspired algorithm motivated by the reproduction and
egg-laying behavior of cuckoo birds. Cuckoo Search is mainly inspired by the
brood behavior of cuckoos, i.e., laying eggs that imitate the color and pattern of
the host's eggs in other birds' nests. An egg in a nest represents a solution, whereas a cuckoo egg
stands for a new solution. The new and better solutions (i.e.,
cuckoos) are used to replace the worst solutions in the nests, which is the
primary aim of the algorithm. The CS process consists of two components, i.e., local and
global. CS implements two key features of modern meta-heuristic computation:
intensification and diversification. The former improves the best solution
through a guided random walk, while the latter maintains population
diversity through Levy flights. The switching probability (Pa) regulates the balance
between the two stochastic search parts. The algorithm for the Cuckoo Search is as
follows:
Algorithm: Cuckoo Search (CS)
Initialize the Population of 'N' Nests.
Evaluate Nests.
While (Termination Criteria Not Met)
Randomly Generate New Solution from Best Nest.
Randomly Choose Nest from Population.
If New Solution is better than Chosen Nest, Replace Chosen Nest with New Solution.
Abandon Worse Nests, Replace with Randomly Generated Nests.
This algorithm serves as the basis of the desired strategy for solving the
global optimization problem by balancing the two random walks, i.e., local and
global. This balance is governed by the switching probability $P_a$.
$X_i^{t+1} = X_i^t + \alpha R \times J(P_a - \epsilon) \times (X_j^t - X_k^t)$ (4.5)
$X_i^{t+1} = X_i^t + \alpha L(R)$ (4.6)
Equations (4.5) and (4.6) give the mathematical representation of the local and
global random walks, respectively, and Table 4.1 lists the parameters of the
local and global walks [94].
Table 4.1 Parameters of local and global random walk
Parameters    Detail
$X_j^t$ and $X_k^t$    Present locations chosen by random permutation
$\alpha > 0$    Step size scaling factor
$X_i^{t+1}$    Subsequent location
$R$    Step size
$\times$    Entry-wise product of two vectors
$J$    Heaviside function
$P_a$    Employed to switch between local and global walk
$\epsilon$    A random number drawn from a uniform distribution
$L$    Levy distribution used to select the step size of the random walk
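The following is a minimal sketch of how the two random walks could be combined in code. It is not the thesis implementation: the sphere test function, the parameter defaults, the function names, and the use of Mantegna's method to draw Levy steps are all assumptions made for illustration. Candidates are accepted greedily, so the best nest never gets worse.

```python
import math
import random

def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step with Mantegna's method (an assumed choice)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    return u / max(abs(v), 1e-12) ** (1 / beta)

def cuckoo_search(objective, dim, n_nests=15, pa=0.25, alpha=0.5,
                  iterations=200, bounds=(-5.0, 5.0), seed=0):
    """Minimise `objective`; returns (best_nest, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    clip = lambda x: min(max(x, lo), hi)
    nests = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_nests)]
    fitness = [objective(nest) for nest in nests]
    for _ in range(iterations):
        # Global walk: a Levy flight starting from the current best nest.
        best = min(range(n_nests), key=fitness.__getitem__)
        candidate = [clip(x + alpha * levy_step(rng)) for x in nests[best]]
        j = rng.randrange(n_nests)            # randomly chosen nest to challenge
        value = objective(candidate)
        if value < fitness[j]:
            nests[j], fitness[j] = candidate, value
        # Local walk: difference of two random nests, switched on with probability pa.
        for i in range(n_nests):
            a, b = rng.sample(range(n_nests), 2)
            r = rng.random()
            candidate = [clip(x + alpha * r * (1.0 if rng.random() < pa else 0.0) * (xa - xb))
                         for x, xa, xb in zip(nests[i], nests[a], nests[b])]
            value = objective(candidate)
            if value < fitness[i]:
                nests[i], fitness[i] = candidate, value
    best = min(range(n_nests), key=fitness.__getitem__)
    return nests[best], fitness[best]

def sphere(x):
    """Assumed test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_nest, best_value = cuckoo_search(sphere, dim=2)
print(best_nest, best_value)
```

In the hybrid model the objective would be the fitness function defined by the author; the sphere function here only stands in for it.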
4.1.4 ARTIFICIAL NEURAL NETWORK
The last technique used in the proposed hybrid algorithm is the Artificial
Neural Network (ANN). ANN is an efficient and successful alternative to ARIMA
models for predicting time series with distinctive features, and it
helps to increase the prediction accuracy of the designed model [95]. In
this research, an ANN with a single hidden layer and a single output is used. The output of
the ANN can be defined as
$y_t = w_0 + \sum_{j=1}^{n} w_j \, h\left(w_{0j} + \sum_{i=1}^{m} w_{ij} \, y_{t-i}\right) + \epsilon_t$ (4.7)
Where,
$w_j$ (j = 0, 1, 2, 3, 4, ..., n) and (4.8)
$w_{ij}$ (i = 0, 1, 2, 3, 4, ..., m) are both weights,
$w_0$ and $w_{0j}$ are the bias values,
$\epsilon_t$ is white noise, and
h is the activation function of the hidden layer.
The algorithm for ANN is shown below. The important steps to train the network are
defined below:
Algorithm: Artificial Neural Network (ANN)
1 Start
2 Initialize ANN and define the basic features as input/training data (T-Data),
Target (TR), and Neurons (N)
3 Set Model-Net = Newff (T-Data, TR, N)
4 Model-Net.TrainParam.Epoch = 1000
5 Model-Net.Ratio.Training = 70%
6 Model-Net.Ratio.Testing = 15%
7 Model-Net.Ratio.Validation = 15%
8 Model-Net = Train (Model-Net, T-Data, TR)
9 Current Data = Feature of real-time data
10 Prediction = simulate (Model-Net, Current Data)
11 If Prediction = True
12 Results = Show predicted data
13 End
14 Return: Results in terms of prediction
15 End
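As an illustration only, the forward pass of a single-hidden-layer, single-output network and the 70/15/15 data split from the algorithm can be sketched as follows. The logistic activation, the weight layout, and the function names are assumptions; the training step (step 8) is not reproduced here.

```python
import math

def logistic(x):
    """Assumed hidden-layer activation function."""
    return 1.0 / (1.0 + math.exp(-x))

def ann_output(lags, hidden_weights, hidden_bias, output_weights, output_bias):
    """Output of a single-hidden-layer network for one vector of lagged inputs.

    hidden_weights[j][i] connects input i to hidden neuron j.
    """
    hidden = [logistic(bias + sum(w * y for w, y in zip(weights, lags)))
              for weights, bias in zip(hidden_weights, hidden_bias)]
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))

def split_70_15_15(samples):
    """Chronological split into training, testing, and validation parts."""
    n_train = len(samples) * 70 // 100
    n_test = len(samples) * 15 // 100
    return (samples[:n_train],
            samples[n_train:n_train + n_test],
            samples[n_train + n_test:])

# Hypothetical weights for a network with two lagged inputs and one hidden neuron.
prediction = ann_output([0.4, 0.6], [[0.0, 0.0]], [0.0], [2.0], 1.0)
train, test, validation = split_70_15_15(list(range(60)))
print(prediction, len(train), len(test), len(validation))
```

With 60 monthly observations this split yields 42 training, 9 testing, and 9 validation samples.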
Hence, the proposed selection of forecasting algorithms for the hybrid forecasting
model consists of the following:
1. Auto Regressive Integrated Moving Average (ARIMA) model
2. Discrete Wavelet Transform (DWT)
3. Cuckoo Search (CS) Optimization algorithm
4. Artificial Neural Network (ANN)
The remainder of the chapter is divided into sections, each of which addresses
one of the objectives of the present study. The specific objectives of this
research are as follows:
1. To model the time series data and forecast future values using ARIMA.
2. To determine how discrete wavelet transform resolves the difficulties in
ARIMA modeling.
3. To develop a hybrid model of ARIMA and wavelet transform.
4. To analyze the performance and accuracy of the proposed algorithm.
The above mentioned objectives are discussed in detail in the following sections.
4.2 TO MODEL THE TIME SERIES DATA AND FORECAST FUTURE
VALUES USING ARIMA
To achieve the first objective, time series data of electricity consumption
collected from PSPCL, Punjab, India is passed to the ARIMA model
to generate forecasts of electricity consumption five years ahead, from 2018
to 2022. The main steps involved in the prediction are discussed below:
1. Stationarity check: First, the given data is checked for stationarity.
If it is stationary, the data is passed directly to the next step. If it is not
stationary, a differencing operation must be performed, after which the data is
checked for stationarity again. If the data obtained is still non-stationary, differencing is
repeated until the data becomes stationary. If differencing is performed d
times, the order of integration of the ARIMA model is d [96].
[Figure: two panels, (a) and (b)]
Figure 4.4 (a) Stationary and (b) Non-stationary series
Fig source: https://www.analyticsvidhya.com/blog/2018/09/non-stationary-time-series-python/
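2. ARMA modeling: The stationary data is passed to the ARMA time series
model as follows. Assume that the value of the data at any time t is $y_t$, the
previous p data values are $y_{t-1}, y_{t-2}, \ldots, y_{t-p}$, and the errors in the given time
periods are assumed to be $\epsilon_t, \epsilon_{t-1}, \ldots, \epsilon_{t-q}$.
The corresponding ARMA equation is given in (4.9) below:
$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \epsilon_t - \theta_1 \epsilon_{t-1} - \theta_2 \epsilon_{t-2} - \cdots - \theta_q \epsilon_{t-q}$ (4.9)
In equation (4.9), $\phi_i$ and $\theta_j$ denote the coefficients of the
autoregressive (AR) and moving average (MA) parts, so the time series model is
expressed as ARMA(p, q). The procedure of ARIMA modeling is as follows: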
(a) Identify the order of the model: The order is identified through correlation
analysis, based on the behavior of the autocorrelation function (ACF) and the partial
autocorrelation function (PACF) discussed in equations (4.2) and (4.3), both of which
are functions of the lag or delay. Three cases are distinguished:
i. If the ACF shows a sinusoidal decay and the PACF drops to
zero after lag p, the model is a pure AR process of order p.
ii. If the ACF drops to zero after lag q and the PACF shows a sinusoidal decay, the model is
an MA model of order q.
iii. If both the ACF and the PACF show a sinusoidal decay and the values of both
approach zero after a lag, the model is an ARMA process of order (p, q), and the
corresponding ARIMA model has order (p, d, q).
(b) Estimation of model coefficients: The coefficients of the model are
calculated through the Box-Jenkins scheme. Gaussian maximum likelihood
estimation (GMLE) is usually used to estimate the parameters of the
ARIMA model. After the parameters have been estimated from the time series, the model must be
validated. This diagnostic test is based on analysis of the error (residual) series. The
model can be tested by evaluating the ACF of the residual series and
checking whether the values lie within the 99% confidence interval. Other tests that
do not rely on the residual ACF can also be performed to validate the model. One such
test is the Ljung-Box test, and there are several criteria for assessing the
quality of the fit, including the Akaike Information Criterion (AIC) and the
Bayesian Information Criterion (BIC) [96].
3. Data forecasting: Forecasting is performed on the time series data after the reliability
of the model has been confirmed. The subsequent time series values are
calculated using all estimated model parameters and the available time series
values. The differenced data must then be integrated to return forecasts of the raw data. Thus, this
model is called the "Auto-Regressive Integrated Moving Average" (ARIMA) model and is used for linear
time series forecasting with improved accuracy [97].
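The differencing, integration, and residual-ACF steps described above can be sketched with a few helper functions. This is a minimal illustration on an assumed toy series, not the procedure used in the experiments; the function names are hypothetical.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def undifference(last_value, forecast_diffs):
    """Integrate first-order differenced forecasts back to the original scale."""
    level, forecasts = last_value, []
    for step in forecast_diffs:
        level += step
        forecasts.append(level)
    return forecasts

def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series)
    return [sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

# Assumed toy series with a linear trend: one round of differencing removes the trend.
series = [2 * t + 1 for t in range(10)]
print(difference(series))                       # constant differences
print(undifference(series[-1], [2, 2, 2]))      # forecasts back on the raw scale
print(acf(series, 3))
```

Here d = 1 is sufficient because the first differences are already constant; a real consumption series with seasonality would need a proper stationarity test before d is fixed.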
4.3 TO DETERMINE HOW DISCRETE WAVELET TRANSFORM
RESOLVES THE DIFFICULTIES IN ARIMA MODELING
In the previous section, the working and overall architecture of the time series and
ARIMA model were discussed. This section discusses how DWT resolves the
difficulties of the ARIMA model in forecasting. A forecasting model
is developed by combining the two techniques, ARIMA and DWT, and its accuracy is
compared with that of a forecasting model developed using the ARIMA model alone. The difficulties
of the ARIMA model in forecasting, and how DWT resolves them, are
discussed in detail as follows:
4.3.1 Difficulties that were resolved by combining ARIMA model with DWT
Difficulty 1: ARIMA does not involve a decomposition technique
Explanation: The main aim of analyzing time series data is to establish a forecast
model that can predict future values based on past observations. Because
the particular nature of a time series is difficult to evaluate, generating
adequate forecasts is often considered challenging. Various predictive models based on
ARIMA and ANN have been introduced in existing works. The ARIMA forecasting
model is well known for its predictive accuracy and its flexibility in
representing various types of time series. However, a significant limitation is that
it assumes a linear form for the data, which makes ARIMA
models unsuitable for complex non-linear time series modeling. To overcome this
difficulty, DWT is used to divide a large time series dataset into two
sub-parts: detailed (linear) and approximate (non-linear) [79].
Difficulty 2: Noise in datasets cannot be removed by the ARIMA model
Explanation: Wavelet analysis can filter noisy signals, i.e., identify the trend of
variation and the fluctuations of the time series data. Wavelet decomposition
and reconstruction reduce the non-stationarity of the time series data and thus improve the
prediction accuracy. Wavelet decomposition is therefore applied as a de-noising
technique [98].
Difficulty 3: The cyclicality of the time series data cannot be reduced by ARIMA
Explanation: The cyclicality of the time series data can be reduced by
decomposing the data into a high-frequency segment and a low-frequency
segment [99], because frequency is the rate at which the data changes.
The high-frequency segment changes rapidly over time,
whereas the low-frequency segment changes slowly.
Difficulty 4: The ARIMA model produces limited accuracy on small datasets.
Explanation: ARIMA is limited in producing accurate forecasts from a short
time series. It needs a minimum
of 50, and preferably 100 or more, observations [97]. DWT, by contrast, does
not need a large dataset; it can model the individual stationary processes as
well as the components [100].
Difficulty 5: There is no automatic updating feature. As new data become available,
the entire modeling procedure must be repeated, and a new ARIMA model has to be
built from scratch for each new dataset.
Explanation: Unlike ARIMA modeling, DWT does not require the forecasting model
to be rebuilt from scratch when the dataset is updated [101].
Difficulty 6: Estimating the parameters p, d, and q is a time-consuming process in
ARIMA (p, d, q).
Explanation: An ARIMA (p, d, q) forecasting model requires the
parameters (p, d, q) to be estimated; DWT, in contrast, does not require any parameter
estimation [102].
Difficulty 7: Extreme variations and fluctuations that occur with high frequency
affect the ARIMA model.
Explanation: Extreme variations and fluctuations that occur with high
frequency increase the risk of error and information loss when
forecasting future data. Decomposing the time series data into low- and
high-frequency components with DWT makes it possible to recover the
original time-domain signal without losing information [103].
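The lossless recovery claimed for Difficulty 7 can be illustrated with a one-level orthonormal HAAR transform and its inverse. The sketch assumes an even-length series; the sample values are the first four months of Table 5.1, and the function names are hypothetical.

```python
import math

def haar_forward(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    approx = [(signal[k] + signal[k + 1]) / s for k in range(0, len(signal), 2)]
    detail = [(signal[k] - signal[k + 1]) / s for k in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward: interleaves the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal

consumption = [2402929222.0, 2302632737.0, 2302632737.0, 2315442012.0]
approx, detail = haar_forward(consumption)
restored = haar_inverse(approx, detail)
print(restored)
```

The approximation coefficients carry the low-frequency content and the detail coefficients the high-frequency content; together they reproduce the input up to floating-point rounding.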
[Figure: Time Series Data -> Discrete Wavelet Transform -> Lower limit (LL) and Upper limit (UL) -> Auto Regressive and Moving Average -> Forecasted Value]
Figure 4.5 Forecasting model developed by combining ARIMA and DWT
This forecasting model combines wavelets such as HAAR
and Daubechies with the ARIMA model. Initially, the wavelet model analyzes the time
series data, and the output of this phase passes to the second phase, i.e., the ARIMA
model, to convert nonlinear data into linear data as shown in Figure 4.5.
4.3.2 Use of DWT in proposed hybrid model
Joint time and frequency domain representations are well suited to forecasting
non-stationary processes, whose mean and autocorrelation are not stable over time. The
electricity data is non-stationary, as consumption varies continuously over time.
Therefore, DWT is well suited to expressing this type of data. A wavelet function
created from a single mother wavelet $\psi(t)$ is given by equation (4.10).
$\psi_{c,d}(t) = \frac{1}{\sqrt{|c|}} \, \psi\left(\frac{t - d}{c}\right)$ (4.10)
c Scaling factor used to calculate the compression value
d Translation parameter used to compute the location of the wavelet in time
One of the following two cases typically applies. If |c| < 1, the
wavelet is a compressed version of the mother wavelet and corresponds to higher
frequencies (multiple time cycles). On the other hand, if |c| > 1, the time width of
$\psi_{c,d}(t)$ is greater than that of $\psi(t)$, which corresponds to lower frequencies.
Thus, DWT is considered a time series analysis tool that analyzes time series
data using wavelets and generates a discrete signal. It is based on sub-band
coding and yields a fast computation of the wavelet transform. The discrete wavelet
function can be expressed by equation (4.11).
(4.11)
4.3.3 Use of ARIMA model in proposed work
The ARIMA model combines the AR and MA models, following the approach of
Box and Jenkins. Mathematically, the AR model can be written as equation (4.12).
$z_t = K + \vartheta_1 z_{t-1} + \vartheta_2 z_{t-2} + \cdots + \vartheta_m z_{t-m} + n_t$ (4.12)
Where $\vartheta_1, \vartheta_2, \ldots, \vartheta_m$ are the autoregressive coefficients at the
first, second, and last lag, respectively.
K Constant
$n_t$ White Gaussian noise
The values obtained from equation (4.12) are used for the prediction of the
next data value. The MA model, in turn, can be represented mathematically by equation
(4.13).
(4.13)
The MA model is a regression on lagged error values. An MA filter is also used for
data decomposition, and the MA filter of order f is written
as equation (4.14).
$M_1(t) = \frac{1}{f} \sum_{k=-i}^{i} M(t + k)$ (4.14)
Where f = 2i + 1. The time series M(t) is thus averaged over a window of f
periods to obtain the trend component M1. The remaining component M2 is
obtained as $M_2 = M - M_1$, and its values appear to be close to
random. In equation (4.15), M2 represents the residual part, M1 signifies the
trend part, and M represents the original series [104]. The result from the MA
filter was combined with the AR output to form the ARIMA model. Mathematically,
it is represented as follows:
(4.15)
The performance parameters of the ARIMA model are represented as
.
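A centered moving-average filter of order f = 2i + 1, as used for the trend/residual decomposition in this section, can be sketched as follows. The handling of the series edges (left undefined) and the function names are assumptions made for illustration.

```python
def moving_average_trend(series, i):
    """Centered moving average of order f = 2*i + 1; edge values stay None."""
    f = 2 * i + 1
    trend = [None] * len(series)
    for t in range(i, len(series) - i):
        trend[t] = sum(series[t - i:t + i + 1]) / f
    return trend

def residual(series, trend):
    """Original series minus trend, wherever the trend is defined."""
    return [None if m1 is None else m - m1 for m, m1 in zip(series, trend)]

# Assumed toy series: a pure linear trend leaves a zero residual.
series = [1, 2, 3, 4, 5]
trend = moving_average_trend(series, i=1)
print(trend)
print(residual(series, trend))
```

A larger i gives a smoother trend at the cost of more undefined values at both ends of the series.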
4.3.4 DWT and ARIMA Model
[Figure: Discrete Wavelet Transform -> Lower limit (LL) (a1, a2, a3, ..., an) and Upper limit (UL) (b1, b2, b3, ..., bn) -> ARIMA]
Figure 4.6 DWT and ARIMA model
Figure 4.7 shows the wavelet analysis of the time series data, in which the previous
years' data from the Punjab electricity board is taken as the input signal 'Si'. The
low-pass filter passes the low-frequency signal values and blocks the high-frequency signals.
Autoregression is represented by the AR model and the moving average by the MA model,
and 'd' signifies the lag value from previous years.
To predict future values based on past observations, three types of processes are
followed:
(a) AR process
(b) Differencing in value
(c) MA part
This ARIMA model has been used to forecast energy consumption. However, the
developed model can also be used in other applications, such as forecasting weather
data, climatic conditions, and so on.
[Figure: Input (Si) -> Low-pass filter -> Down-sampling of lower values -> Lower range (LL); Input (Si) -> High-pass filter -> Down-sampling of higher values -> Higher range (UL)]
Figure 4.7 Discrete Wavelet Transform output
The outputs of the filters are down-sampled to reduce the data to a smaller size.
The upper block converts the lower values to attributes such as a1, a2,
a3, ..., an. Similarly, the lower block converts the higher values to attributes such as
b1, b2, b3, ..., bn. In this way, the lower and upper ranges of the linear time series data are
determined. The linear time series data obtained as the DWT output is fed to
the ARIMA model. The ARIMA model is applied in the following steps:
1. The raw time series data is applied to the DWT unit. DWT processes the
input signal to set the range of the predicted values.
2. The coefficients b1, b2, b3, ..., bn represent the upper limit of the time
series, and the coefficients a1, a2, a3, ..., an represent the lower limit of the
TSD.
3. The DWT output is fed directly to the ARIMA model.
4. The regression value is calculated from equation 4.12 and the MA value is
calculated from equation 4.13. These equations calculate AR (m) and MA
(r). In addition, the MA filter extracts the trend part and removes it
from the original TSD information. The filtered values are determined
using equation 4.14.
5. The MA and generated regression values are analyzed using the ARIMA
model.
6. Finally, the forecasted values are computed.
7. The performance parameters are computed.
4.3.5 HAAR Wavelet
The main property of a wavelet transformation is that it converts non-stationary
time series data into a stationary time series relative to the original sequence. It is formed
from mutually orthogonal signals. The HAAR wavelet is derived from a group of
functions and is a single rectangular wave supported on the domain
A ∈ [0, 1].
Because $\int \psi(A)\,dA = 0$, its Fourier transform $\Psi(\omega)$ vanishes at the point ω = 0,
where it has only a first-order zero. $\psi(A)$ generates the simplest orthonormal
wavelet family in the multi-resolution system indexed by i, i.e., $\psi(A)$ is not only
orthogonal to $\psi(2^i A)$ but also orthogonal to its own integer
translates. Mathematically, the HAAR wavelet
function can be described by equation (4.16) below:
$\psi(A) = \begin{cases} 1, & 0 \le A < 0.5 \\ -1, & 0.5 \le A < 1 \\ 0, & \text{otherwise} \end{cases}$ (4.16)
The HAAR wavelet is equal to 1 on the interval [0, 0.5), equal to -1 on the interval
[0.5, 1), and 0 elsewhere. When the HAAR wavelet is applied to the original data, its
positive and negative function values have the same absolute value, so
frequent fluctuations of the original data over
very short periods cancel out and the resulting data no longer shows frequent shocks [105]. This
is the principle of using the HAAR wavelet to smooth data in a time series.
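The zero-mean and orthogonality properties of the HAAR wavelet can be checked numerically. The sketch below uses the standard definition of the HAAR mother wavelet (+1 on the first half of [0, 1), -1 on the second half) together with the scaling factor c and translation parameter d of Section 4.3.2; the midpoint-rule integration and the function names are assumptions made for illustration.

```python
def haar(a):
    """Standard Haar mother wavelet on [0, 1)."""
    if 0.0 <= a < 0.5:
        return 1.0
    if 0.5 <= a < 1.0:
        return -1.0
    return 0.0

def scaled_haar(t, c, d):
    """Haar wavelet dilated by the scaling factor c and shifted by the translation d."""
    return haar((t - d) / c) / abs(c) ** 0.5

def inner_product(f, g, lo=0.0, hi=1.0, n=1024):
    """Midpoint-rule approximation of the integral of f*g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * h) * g(lo + (k + 0.5) * h) for k in range(n)) * h

mean_value = inner_product(haar, lambda a: 1.0)                      # zero mean
norm = inner_product(haar, haar)                                     # unit energy
cross = inner_product(haar, lambda a: scaled_haar(a, 0.5, 0.0))      # compressed copy
shifted = inner_product(haar, lambda a: scaled_haar(a, 1.0, 1.0), 0.0, 2.0, 2048)
print(mean_value, norm, cross, shifted)
```

The compressed copy (c = 0.5) occupies half the time width of the mother wavelet, which is the higher-frequency case described for |c| < 1.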
4.3.6 The requirement of DWT in time series forecasting
By using DWT, the boundaries of the datasets, i.e., the upper and lower bounds, can be
easily determined. Wavelet decomposition is combined with time series models as a
pre-processing technique to decompose the datasets. Wavelet decomposition breaks
down time series data into approximation and detail components so that different
forecasting models can be applied to each component, and it also separates the data
into various series. Forecasting performance improves
after decomposition: the data is broken down through wavelet decomposition into
four decomposed series, each of which is predicted using ARIMA. After
the ARIMA model has been applied, the inverse wavelet transform combines the
decomposed values. According to the literature, the DWT decomposition
mechanism is helpful in various areas of forecasting, such as oil price prediction, stock
price estimation, wind speed prediction, and load price forecasting. Wavelet
decomposition extracts low- and high-frequency components from the
original data and allows each component to be analyzed easily. ARIMA, GARCH,
and wavelet decomposition are linear models, and ANN is a non-linear model.
Combining these models therefore increases forecasting power [106]. For this reason, the outcomes in this research work have
been analyzed in three phases: first by applying only the ARIMA forecasting
model, then by integrating ARIMA with DWT, and finally with the proposed hybrid model,
which is the combination of ARIMA, DWT, CS, and ANN. Applying DWT with
ARIMA enhances forecasting performance compared to the
ARIMA model alone. The decomposition property of DWT can enhance the performance
of forecasting; the validity of this approach is supported by the related existing
work listed in Table 4.2.
Table 4.2 Existing DWT based work
References    Proposed Work
[107]    Future market prediction has been performed using a wavelet decomposition approach, based on the prior values of the West Texas Intermediate spot market. The test results show a better relationship between the forecasted price and the actual price.
[108]    The researchers have designed a forecasting model for predicting electricity prices using ARIMA and GARCH. All three models are linear models, and the results depict better relationships among the forecasting values.
[109]    Wavelet decomposition along with the ANN approach has been used for predicting solar radiation from the year 1981 to 2001. The proposed work performs better than the existing state-of-the-art approaches.
[110]    Wavelet decomposition, wavelet packet decomposition, and ANN have been used to predict wind speed. The designed model has been compared with the existing ARIMA, ARIMA with ANN, and Neuro-Fuzzy models, and the wavelet packet Broyden-Fletcher-Goldfarb-Shanno model provided better forecasting results.
[111]    Wavelet decomposition, wavelet packet decomposition, and a neural network have been applied to predict wind speed. The outcomes have been compared with the existing ARIMA, ARIMA with ANN, and Neuro-Fuzzy models. The presented wavelet packet Broyden-Fletcher-Goldfarb-Shanno model produces enhanced forecasting outcomes.
[112]    Wavelet decomposition has been integrated with a neural network to estimate the Mackey-Glass time series and sunspot data. The prediction outcomes show enhanced accuracy in contrast to previous models.
[113]    Fuzzy wavelet decomposition has been implemented for the prediction of IBM daily prices and daily index values. In this work, wavelet decomposition produces better results when de-noising has been applied.
[114]    The authors used the wavelet transform along with the ARIMA forecasting model and a radial basis function neural network (RBFN) for the prediction of electricity price. The price behavior of electricity has been treated as a non-linear function that needs a non-linear model to capture it. The price data has been decomposed into four sub-parts using wavelet decomposition, and the decomposed series have been modeled with different ARIMA models. After that, the inverse wavelet transform has been applied, and the RBFN network has been used to correct the errors of the wavelet-ARIMA predictor. Compared with the latest price prediction approaches, this work provided a significant improvement.
[115]    An adaptive wavelet neural network has been used for short-term (ST) price forecasting in the electricity market. The proposed model has been introduced as an alternative to a traditional FFBPNN for estimating arbitrary non-linear functions. In this work, the wavelet-based neural network enhanced the performance.
As shown in the table, performance has been enhanced by combining a
decomposition approach with different forecasting models. Therefore, on the basis
of the above discussion, it can be concluded that the DWT technique
reduces the error and increases the performance. After DWT is
applied, the lower and upper bounds of the dataset are obtained,
which makes it easier to estimate future electricity
consumption. DWT also helps to pre-process large datasets without additional
filtering approaches.
4.4 TO DEVELOP A HYBRID MODEL OF ARIMA AND WAVELET
TRANSFORM
To estimate future electricity consumption, the present work is
performed using the following steps. The datasets used for this purpose are obtained
from PSPCL, India. The proposed methodology is
described in the steps below:
Step 1: This research utilizes data obtained from Punjab State Power Corporation
Limited (PSPCL).
Step 2: In the second step, the Discrete Wavelet Transform (DWT) is applied. DWT is a
discretized continuous wavelet transform (CWT) that breaks down the time series
data into an integer number of data samples.
Step 3: DWT decomposes the complete range of electricity consumption into four
distinct categories with different ranges, i.e., Lmin to Lmax (LL), Lmax to Hmin
(LH), Hmin to Hmax (HH), and Lmax to Hmax (HL).
Step 4: The ARIMA model is applied individually to each of the decomposed low- and
high-frequency series to obtain the time series data.
Step 5: After the ARIMA models have been applied individually, the Moving
Average and Autoregressive analyses are combined through the integrated (I) element of ARIMA for the
time series data.
Step 6: The inverse DWT (IDWT) is applied to the outcomes of these four distinct ARIMA
models to combine them. This builds a well-recognized record that is helpful in
training the proposed model.
Step 7: At the classification phase, the ANN classifier requires highly unique and
accurate data for its training. To get the best record of the previous
months' electricity consumption, the CS optimization strategy with a novel fitness
function is applied.
Step 8: The more unique the data record, the easier and faster the training of the system
becomes. The unique data record obtained through CS
optimization is therefore passed to the classification phase, i.e., the ANN, as the
training set. Finally, the ANN is used to train the proposed model so that it
can forecast the upcoming month's electricity
consumption.
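The decompose, forecast, and recombine idea behind Steps 2 to 6 can be sketched in a heavily simplified form. The sketch below is not the proposed model: it assumes a one-level HAAR transform with two sub-series instead of four, uses a least-squares AR(1) fit as a stand-in for the per-band ARIMA models, and omits the CS and ANN stages entirely. All function names are hypothetical.

```python
import math

def haar_forward(signal):
    """One-level Haar DWT of an even-length series: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[k] + signal[k + 1]) / s for k in range(0, len(signal), 2)]
    detail = [(signal[k] - signal[k + 1]) / s for k in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse Haar step: rebuilds two time-domain samples per coefficient pair."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal

def ar1_forecast(series):
    """One-step forecast from a least-squares AR(1) fit (stand-in for ARIMA)."""
    x, y = series[:-1], series[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    var_x = sum((v - mean_x) ** 2 for v in x)
    if var_x == 0.0:                      # constant sub-series: repeat the last value
        return series[-1]
    phi = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / var_x
    return mean_y + phi * (series[-1] - mean_x)

def hybrid_forecast(series):
    """Forecast the next two samples: DWT -> per-band forecast -> inverse DWT."""
    approx, detail = haar_forward(series)
    return haar_inverse([ar1_forecast(approx)], [ar1_forecast(detail)])

# Assumed toy series with a linear trend.
print(hybrid_forecast([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]))
```

Because each HAAR coefficient pair covers two time steps, forecasting one coefficient per band and inverting yields the next two samples of the series.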
4.5 TO ANALYZE THE PERFORMANCE AND ACCURACY OF THE
PROPOSED ALGORITHM
To achieve the last objective, i.e., analysis of the performance of the proposed
work, all experiments have been conducted in the MATLAB simulator using the
real-time data obtained from PSPCL. The actual datasets have been taken from the
previous years' electricity consumption. These real datasets have been decomposed
into four ranges of frequencies (Lmin to Lmax, Lmax to Hmin, Hmin to Hmax,
Lmax to Hmax) using DWT. To forecast the future consumption of electricity,
the actual datasets need to be converted into time series data (TSD). For this
purpose, the ARIMA model is applied individually to each of these
decomposed datasets. The main aim of decomposing the datasets is to obtain
information about the maximum and minimum utilization of electricity in previous
years. The time series data obtained by applying the ARIMA model contains
both linear and non-linear
information about the samples. Previous research has proposed a variety of approaches, such as
Artificial Intelligence (AI) based methods like SVM and ANN.
The forecasting models designed using these techniques to estimate optimal
electricity demand have been
discussed in Chapter 2. In the
existing work, they have been utilized for the prediction of future electricity
consumption. In the present work, instead of utilizing these techniques separately, a
hybrid model has been proposed for the prediction of electricity consumption in
upcoming days. From the experiments discussed in Chapter 5, the
accuracy obtained after the implementation of the different
approaches is as shown below.
i. Forecasting accuracy of the ARIMA model is 83.53%
ii. Forecasting accuracy of the hybrid model of ARIMA and Daubechies Wavelet
transform is 92.67%
iii. Forecasting accuracy of the hybrid model of ARIMA and HAAR Wavelet transform is
93.76%
iv. Forecasting accuracy of the proposed hybrid model is 98.86%.
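This section does not state how the accuracy percentages are computed. One common convention, assumed here purely for illustration, is accuracy = 100% minus the mean absolute percentage error (MAPE); the function names and sample values below are hypothetical and are not results from the thesis.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

def accuracy_percent(actual, forecast):
    """Assumed accuracy convention: 100 minus MAPE."""
    return 100.0 - mape(actual, forecast)

# Hypothetical values chosen only to make the arithmetic easy to follow.
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
print(mape(actual, forecast), accuracy_percent(actual, forecast))
```

Each forecast in this toy example is off by 10% of the actual value, so the MAPE is 10% and the accuracy under this convention is 90%.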
CHAPTER 5: RESULTS AND ANALYSIS
This research work focuses on the prediction of electricity consumption in a specific
industrial region of Punjab, India. To forecast the amount of electricity consumption
in the future, the actual electricity consumed by consumers in previous years has
been utilized. This dataset of electricity consumption is taken from the power system
of the public utility company Punjab State Power Corporation Limited (PSPCL)
and covers January 2013 to December 2017. A detailed explanation of the dataset is
provided below in tabular form as well as in the form of a graph.
Figure 5.1 Electricity consumption dataset from January 2013 to December 2017
As per Table 5.1 and Figure 5.1, electricity consumption increases
from year to year. According to Table 5.1, the lowest electricity
consumption occurs in April 2014 and the highest consumption of
electricity in July 2017. In the figure, the Y-axis represents the electricity consumed
by consumers, and the X-axis depicts the months
from January 2013 to December 2017. The average consumption of electricity from
January 2013 to December 2017 is 3464988567 kWh.
5.1 DATASET USED
This research work focuses on one specific industrial region of India, namely
Punjab, for the prediction of future electricity consumption. To forecast the
amount of electricity consumption in the future, the actual electricity consumed by
consumers in previous years has been utilized. This real dataset of electricity
consumption is taken from the power system of the public utility company Punjab
State Power Corporation Limited (PSPCL) and covers January 2013 to December
2017. A detailed explanation of the dataset is provided below in tabular form.
Table 5.1 Dataset Used
Month (January 2013 – December 2017)    Original Electricity Consumption (kWh)
Jan-13 2402929222
Feb-13 2302632737
Mar-13 2302632737
Apr-13 2315442012
May-13 2763422169
Jun-13 3409077307
Jul-13 4283260815
Aug-13 4203919034
Sep-13 4320843017
Oct-13 3284669791
Nov-13 2660633086
Dec-13 2510323489
Jan-14 2406670480
Feb-14 2543659240
Mar-14 2262755935
Apr-14 2229496325
May-14 2663094360
Jun-14 4135881561
Jul-14 5048078068
Aug-14 5150618207
Sep-14 4184285964
Oct-14 3525215760
Nov-14 2914127320
Dec-14 2691580750
Jan-15 2425671683
Feb-15 2705001417
Mar-15 2387522949
Apr-15 2253062900
May-15 2884345926
Jun-15 3461754270
Jul-15 4895975328
Aug-15 5076894389
Sep-15 4867004448
Oct-15 3787014752
Nov-15 2804034438
Dec-15 2719630627
Jan-16 2567913778
Feb-16 2740311321
Mar-16 2942162298
Apr-16 2522244918
May-16 3253239971
Jun-16 4500446696
Jul-16 5018252044
Aug-16 4999539492
Sep-16 5404195697
Oct-16 3958725442
Nov-16 3130248433
Dec-16 2921627225
Jan-17 2902848056
Feb-17 2958390111
Mar-17 3209855845
Apr-17 2782373785
May-17 3705371880
Jun-17 4450434098
Jul-17 5810138389
Aug-17 5645879830
Sep-17 5273395962
Oct-17 4196118613
Nov-17 3191950404
Dec-17 3030487191
Table 5.1 gives the actual consumption of electricity from January 2013 to
December 2017. The average consumption of electricity from January
2013 to December 2017 is 3464988567 kWh.
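The summary statistics of Table 5.1 can be recomputed with a short script. The values below are transcribed from the table; the variable names are hypothetical.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Monthly consumption in kWh, transcribed from Table 5.1.
CONSUMPTION_KWH = {
    2013: [2402929222, 2302632737, 2302632737, 2315442012, 2763422169, 3409077307,
           4283260815, 4203919034, 4320843017, 3284669791, 2660633086, 2510323489],
    2014: [2406670480, 2543659240, 2262755935, 2229496325, 2663094360, 4135881561,
           5048078068, 5150618207, 4184285964, 3525215760, 2914127320, 2691580750],
    2015: [2425671683, 2705001417, 2387522949, 2253062900, 2884345926, 3461754270,
           4895975328, 5076894389, 4867004448, 3787014752, 2804034438, 2719630627],
    2016: [2567913778, 2740311321, 2942162298, 2522244918, 3253239971, 4500446696,
           5018252044, 4999539492, 5404195697, 3958725442, 3130248433, 2921627225],
    2017: [2902848056, 2958390111, 3209855845, 2782373785, 3705371880, 4450434098,
           5810138389, 5645879830, 5273395962, 4196118613, 3191950404, 3030487191],
}

records = [(f"{MONTHS[m]}-{year % 100}", value)
           for year, values in CONSUMPTION_KWH.items()
           for m, value in enumerate(values)]

mean_kwh = sum(value for _, value in records) / len(records)
lowest = min(records, key=lambda record: record[1])
highest = max(records, key=lambda record: record[1])
print(round(mean_kwh), lowest, highest)
```

Rounded to the nearest kWh, the mean agrees with the average reported above, and the script also identifies the months with the lowest and highest consumption in the table.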
5.2 RESULTS AND DISCUSSION
This section discusses the simulation work carried out in the MATLAB simulator to forecast electricity consumption using different techniques. ARIMA is one of the most widely used forecasting models in previous studies. Along with ARIMA, the other methods utilized are presented together with their simulation diagrams.
Figure 5:2 Data panel
Figure 5.2 shows the data panel of the MATLAB simulator, which uses a single ARIMA model. This interface is divided mainly into two regions; on the upper level of the data panel, there are five buttons that provide the functionality for the simulation work. The names of the buttons are self-descriptive, and the buttons are numbered in the sequence in which the work is done: (a) Select the dataset, (b) Convert to stationary, (c) Generate the hypothesis, (d) Model the ARIMA, and (e) Predict Next. The working of each button is discussed below in detail.
Figure 5:3 Upload Data
Figure 5.3 depicts the uploaded dataset in graphical form. The data has been taken from previous years of electricity consumption. The X-axis and Y-axis of the graph represent the number of days and the consumption of electricity in KWh (kilowatt-hours), respectively. The peak points of the waves in the graph represent the average consumed electricity for the corresponding number of days. As per the figure, the consumption of electricity rises as the months progress.
After the dataset has been uploaded into the panel, it has to be converted into stationary form for prediction purposes. Datasets are of two types, stationary and non-stationary; these concepts are essential in time series forecasting. Stationary datasets are best suited for forecasting because this kind of data has fixed characteristics over a specified time interval.
Stationary data: This type of data does not include any upward or downward trend or seasonal effects. The mean and variance of the data must remain constant over time.
Non-stationary data: The data show trends, seasonal effects, and other time-based structures. The efficiency of forecasting depends on the observation time: as time increases, the mean and variance change, and trends are captured in the model. In this forecasting-based research work, the data therefore has to be converted into stationary form.
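One common way to convert a series into stationary form is first-order differencing. The block below is a minimal Python sketch of that idea and is not the MATLAB panel used in this work; the helper name difference is hypothetical, and the input is the 2013 portion of Table 5.1.

```python
# Illustrative sketch only: first-order differencing of a monthly series.
# `difference` is a hypothetical helper, not part of the MATLAB tool.

def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Monthly consumption for 2013 (KWh), copied from Table 5.1.
consumption_2013 = [
    2402929222, 2302632737, 2302632737, 2315442012,
    2763422169, 3409077307, 4283260815, 4203919034,
    4320843017, 3284669791, 2660633086, 2510323489,
]

differenced = difference(consumption_2013)
print(differenced)
```

The differenced series has one entry fewer than the input. Whether one round of differencing is enough for this dataset would have to be checked separately, for example with the autocorrelation plot.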
Figure 5:4 Convert to stationary
Figure 5.4 shows the dataset after conversion into stationary form. In this graph, three waves are drawn in different colors: black, green, and red. The black wave depicts the actual, previously consumed electricity. The red wave depicts the difference between the original data and the data converted into stationary form.
The term autocorrelation refers to the correlation of a time-based process with its own past values. Inference drawn from the autocorrelation function is typically known as analysis in the time domain. The autocorrelation plot depicts features of the data used for time series analysis. In particular, it shows whether the elements of a time series are positively correlated, negatively correlated, or independent of each other. The autocorrelation against lag obtained using the different algorithms, namely ARIMA, CS and ANN, and the proposed hybrid model, is depicted below.
Figure 5:5 Generated hypothesis
Figure 5.5 shows the generated hypothesis of the data based on autocorrelation. The vertical axis shows the value of the autocorrelation function, which can range from -1 to 1 but in our case varies from -0.4 to 1. The horizontal axis shows the size of the lag between elements of the time series; here the lag varies from 0 to 20. For example, at a lag of 2, the correlation is computed between each element of the time series and the element observed two time periods earlier.
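The lag-based correlation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the MATLAB routine behind Figure 5.5: the helper name sample_acf is hypothetical, the usual biased estimator is assumed, and only the 2013 values from Table 5.1 are used.

```python
# Illustrative sketch only: sample autocorrelation for lags 0..max_lag.
# `sample_acf` is a hypothetical helper name.

def sample_acf(series, max_lag):
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    acf = []
    for k in range(max_lag + 1):
        # Pair each element with the element observed k periods earlier.
        num = sum(centered[t] * centered[t - k] for t in range(k, n))
        acf.append(num / denom)
    return acf

# Monthly consumption for 2013 (KWh), copied from Table 5.1.
consumption_2013 = [
    2402929222, 2302632737, 2302632737, 2315442012,
    2763422169, 3409077307, 4283260815, 4203919034,
    4320843017, 3284669791, 2660633086, 2510323489,
]

acf = sample_acf(consumption_2013, max_lag=5)
for lag, value in enumerate(acf):
    print(lag, round(value, 3))
```

With this estimator the value at lag 0 is 1 by construction, and every other value lies between -1 and 1.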
Figure 5:6 Original consumed and predicted electricity using ARIMA
Figure 5.6 shows the electricity consumption predicted using ARIMA along with the original values. The number of readings is represented along the X-axis, ranging from 0 to 140, and the consumed as well as predicted electricity is depicted along the Y-axis. The red-colored waves depict the original consumed electricity, and the black-colored waves represent the predicted electricity over the same horizontal and vertical ranges. Comparing the two curves at the same time intervals and number of entries allows the performance of the presented model to be analyzed. The predicted electricity is not much different from the originally consumed electricity, so the accuracy in this case is high. In the next phase, the ARIMA forecasting model is used to predict future consumption.
Figure 5.7 gives the forecast value of consumed electricity obtained using the ARIMA model only.
Figure 5:7 Next Predicted electricity using ARIMA
After the ARIMA forecasting model had been applied to forecast future electricity consumption, optimization and validation techniques were applied in the next step.
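To make the forecasting step concrete, the following Python sketch fits a very small ARIMA model and produces one forecast. It is an assumption-laden illustration, not the model built in MATLAB: the order (1,1,0) without a constant is chosen only for this example, the fit uses conditional least squares, the helper names fit_ar1 and forecast_next are hypothetical, and only the 2013 values from Table 5.1 are used.

```python
# Illustrative sketch only: ARIMA(1,1,0) without a constant, fitted by
# conditional least squares. The order is an assumption for this example.

def fit_ar1(diffs):
    """Least-squares estimate of phi in d[t] = phi * d[t-1] + e[t]."""
    num = sum(diffs[t] * diffs[t - 1] for t in range(1, len(diffs)))
    den = sum(diffs[t - 1] ** 2 for t in range(1, len(diffs)))
    return num / den

def forecast_next(series):
    """One-step-ahead forecast: last value plus the predicted next difference."""
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1], phi

# Monthly consumption for 2013 (KWh), copied from Table 5.1.
consumption_2013 = [
    2402929222, 2302632737, 2302632737, 2315442012,
    2763422169, 3409077307, 4283260815, 4203919034,
    4320843017, 3284669791, 2660633086, 2510323489,
]

next_value, phi = forecast_next(consumption_2013)
print(f"phi = {phi:.3f}, forecast for the next month = {next_value:.0f} KWh")
```

A model this small ignores the yearly seasonality visible in Table 5.1, so it only shows the mechanics of differencing, fitting, and undoing the difference.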
Figure 5:8 Data panel of DWT - ARIMA
The data panel for applying the said algorithms with the ARIMA model is shown in Figure 5.8. After the data has been selected, it is decomposed by applying a Haar wavelet transformation. The further steps, namely converting the data into stationary form, generating the hypothesis, and modelling with the ARIMA forecasting model, are discussed below one after another.
Figure 5:9 Data uploading again
Figure 5.9 shows the result obtained after the data is uploaded again so that the next operations, such as decomposition of the dataset and obtaining the optimized values, can be performed.
Figure 5:10 Decomposition of data using Haar wavelet
Figure 5.10 shows the result of the data decomposition using the Haar wavelet transformation. DWT breaks the signal down into sub-bands of lower and higher frequencies using a low-pass filter (LPF) and a high-pass filter (HPF). The low-frequency and high-frequency components are each decomposed further into lower and higher frequencies, giving four sub-bands: LL, HL, LH, and HH. These four decomposed sub-bands are represented in the graph by different colors: red, blue, black, and green. The four sub-bands are discussed below, along with their generated hypotheses and the electricity consumption predicted using the ARIMA forecasting model.
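The two-stage filtering described above can be sketched as follows. This is a minimal Python illustration, not the MATLAB implementation: it assumes the orthonormal Haar filters, applies them twice in the manner of a wavelet packet, and assumes that the first letter of each label refers to the first stage and the second letter to the second stage. The helper names haar_step and four_subbands are hypothetical, and the input is the 2013 portion of Table 5.1.

```python
import math

# Illustrative sketch only: two-stage Haar filtering of a 1-D series into
# four sub-bands. The label convention (first letter = first stage) is assumed.

def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (low-pass, high-pass)."""
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return low, high

def four_subbands(signal):
    """Split the series into LL, LH, HL, HH; the length must be a multiple of 4."""
    low, high = haar_step(signal)
    ll, lh = haar_step(low)
    hl, hh = haar_step(high)
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}

# Monthly consumption for 2013 (KWh), copied from Table 5.1.
consumption_2013 = [
    2402929222, 2302632737, 2302632737, 2315442012,
    2763422169, 3409077307, 4283260815, 4203919034,
    4320843017, 3284669791, 2660633086, 2510323489,
]

bands = four_subbands(consumption_2013)
for name, coefficients in bands.items():
    print(name, [round(c) for c in coefficients])
```

Because the Haar filters are orthonormal, the sum of squared coefficients over the four sub-bands equals that of the input, which gives a simple sanity check on the decomposition.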
Figure 5:11 Decomposed (LL) data converted into stationary
Figure 5.11 shows the decomposed Lmin to Lmax (LL) dataset converted into stationary form. The original as well as the predicted consumption of electricity is plotted along the Y-axis of the graph. The number of entries, in the range of 0 to 70, is represented along the X-axis. Three waves are plotted in the graph in three different colors: black, green, and red. The black-colored wave shows the original consumption data, the green-colored wave represents the forecast electricity consumption, and the red-colored wave shows the rendered value obtained between the original and the corresponding predicted values. As depicted in Figure 5.11, the predicted value is far from the original consumption of electricity. Therefore, the accuracy obtained on the LL decomposed dataset is low, and the results predicted from the other decomposed datasets are utilized for prediction.
Figure 5:12 Generated hypothesis of LL decomposed datasets
The generated hypothesis for the LL decomposed dataset, based on sample autocorrelation and lag, is shown in Figure 5.12. The lag and the sample autocorrelation are represented along the X-axis and Y-axis, respectively. The relation between the originally consumed electricity and the hypothetical value of electricity is represented through autocorrelation. The black horizontal line represents the originally consumed electricity. The upper and lower colored lines represent the upper and lower limits of the hypothetical value. The red vertical lines represent the sample autocorrelation obtained for each lag, which can be negative as well as positive. More than half of the obtained autocorrelation values are far from the original electricity.
The electricity predicted using LL ARIMA is shown in Figure 5.13, in which the number of readings is represented along the X-axis and the original as well as the predicted value of electricity is plotted vertically. The two waves in the graph, red and black, depict the original and the predicted electricity, respectively.
Figure 5:13 Original consumed and predicted electricity using LL ARIMA
The electricity values range from (-6 to 8) × 10^9, and the number of readings varies from 0 to 30. The predicted value of electricity is close to the original consumption. The highest and lowest variations between the original and predicted electricity correspond to the 15th and 19th readings, respectively. The lower the variation between the original and predicted electricity, the better the performance.
Figure 5:14 Decomposed (LH) data converted into stationary
Figure 5.14 shows the decomposed Lmax to Hmin (LH) dataset converted into stationary form. The original as well as the predicted consumption of electricity is represented vertically, ranging from (-1.5 to 1) × 10^5, and is plotted against the number of entries. The original consumed, predicted, and rendered electricity are represented by three differently colored waves: black, green, and red, respectively. As the number of entries increases, the consumed, predicted, and rendered values of electricity become approximately equal.
Figure 5:15 Generated hypothesis of LH decomposed datasets
Figure 5.15 depicts the generated hypothesis of the LH decomposed dataset based on the obtained autocorrelation. The black horizontal line depicts the originally consumed electricity. The blue line represents the correlation value corresponding to each value, and a red vertical line is drawn for every lag. In the figure, some of the autocorrelation values are the same as the original consumed electricity value, whereas others are positive, meaning higher than the original value, or negative.
Figure 5:16 Original consumed and predicted electricity using LH ARIMA
Figure 5.16 shows the original as well as the predicted electricity using LH ARIMA. The highest variation between the predicted and the original values corresponds to the 16th and 14th readings.
Figure 5:17 Decomposed (HL) data converted into stationary
The decomposed HL dataset converted into stationary form is shown in Figure 5.17. The consumed, predicted, and rendered datasets are represented through black, green, and red colored waves, respectively. The difference between these three values of electricity is highest at the 120th entry, at which the value of the predicted electricity is negative and the rendered electricity is negative as well.
Figure 5:18 Generated hypothesis of HL decomposed datasets
Figure 5.18 shows the generated hypothesis of the HL decomposed dataset based on the lag and the corresponding autocorrelation value. Almost all of the obtained autocorrelation values are the same as the originally consumed electricity; some are negative and others positive.
Figure 5:19 Original consumed and predicted electricity using HL ARIMA
The originally consumed electricity as well as the predicted value of electricity is shown in Figure 5.19. The highest and lowest variation among predicted and original consumed electricity corresponds to the 29th and number of readings.
Figure 5:20 Decomposed (HH) data converted into stationary
Figure 5.20 shows the decomposed HH data converted into stationary form. The consumption of energy and the total number of entries are represented along the vertical and horizontal axes, respectively. The graph shows black, green, and red colored waves that represent the consumed data, the rendered value, and the predicted electricity consumption, respectively.
Figure 5:21 Generated hypothesis of HH decomposed datasets
Figure 5.21 shows the generated hypothesis of the HH decomposed dataset. The hypothesis is produced from the sample autocorrelation values, plotted along the Y-axis against the lag along the X-axis. As shown in the figure, the horizontal black line depicts the originally consumed electricity. The red vertical lines, together with the dots overlapping the black horizontal line, represent the value of the autocorrelation for the corresponding lag. Here, almost all the autocorrelation values overlap the originally consumed electricity; therefore, the generated hypothesis is almost equal to the original electricity value. It is evident from Figure 5.21 that the accuracy of this sub-band is higher than that of the other three sub-bands, LL, LH, and HL.
The original as well as the predicted values corresponding to the actual electricity values are provided in Figure 5.22. As depicted in the figure, both the actual consumed and the predicted electricity range from (-6 to +6) × 10^-7 along the Y-axis.
Figure 5:22 Original consumed and predicted electricity using HH ARIMA
Figure 5.22 represents the original consumed and predicted electricity using HH ARIMA. Two colored waves are shown: the red wave represents the consumed electricity, and the black wave depicts the predicted value of electricity. The number of readings plotted along the X-axis varies from 0 to 30.
Figure 5:23 Inverse-DWT
After the individual ARIMA models have been applied to the decomposed sub-bands, i.e., LL, LH, HL, and HH, the separate ARIMA models have to be combined using the Inverse-DWT, as shown in Figure 5.23. The figure depicts the Inverse-DWT combining the ARIMA forecasting models into the next predicted electricity value. The forecast obtained through the combination of the four decomposed sub-bands is plotted in a graph in which the predicted electricity value is represented along the Y-axis in the range of (0 to 16) × 10^9 and the number of entries along the X-axis varies from 0 to 60. The highest and the lowest obtained values of electricity correspond to the 29th and 31st entries, respectively.
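The recombination step can be illustrated with a round trip through the forward and inverse Haar transforms. The Python sketch below is not the MATLAB code of this work: it assumes orthonormal Haar filters and a two-stage split, the helper names are hypothetical, and the sub-bands passed to the inverse are the unmodified coefficients of the first eight months of 2013. In the hybrid model the sub-band inputs would instead be the ARIMA forecasts.

```python
import math

# Illustrative sketch only: rebuilding a series from four Haar sub-bands.
SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """Forward step: returns (low-pass, high-pass) halves of the signal."""
    low = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal) - 1, 2)]
    high = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal) - 1, 2)]
    return low, high

def inverse_haar_step(low, high):
    """Inverse step: interleaves the two samples recovered from each pair."""
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def reconstruct(ll, lh, hl, hh):
    """Undo the second stage on each branch, then undo the first stage."""
    low = inverse_haar_step(ll, lh)
    high = inverse_haar_step(hl, hh)
    return inverse_haar_step(low, high)

# First eight months of 2013 (KWh), copied from Table 5.1.
original = [
    2402929222, 2302632737, 2302632737, 2315442012,
    2763422169, 3409077307, 4283260815, 4203919034,
]

low, high = haar_step(original)
ll, lh = haar_step(low)
hl, hh = haar_step(high)
rebuilt = reconstruct(ll, lh, hl, hh)
print([round(v) for v in rebuilt])
```

Because the inverse transform is linear, any error in a sub-band forecast carries through to the combined forecast, so the accuracy of each sub-band model matters.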
Figure 5:24 Data Panel for Daubechies Wavelet
The Daubechies wavelet approach has also been used to decompose the data. It is a type of orthogonal wavelet characterized by a maximal number of vanishing moments for a given support, and it is defined through a scaling function, also known as the father wavelet. The main advantage of segmentation is that it reduces the error rate of the predictions.
Figure 5:25 Segmentation done using Daubechies Wavelet Transform
The data segmented using Daubechies as the wavelet transform is shown in Figure 5.25. Using this approach, the time series data is divided into a number of small time frames, each comprising identical features. The segmentation has been performed linearly, which reduces the loss of data by avoiding the overlapping of small data by the large data.
Figure 5:26 Differencing Applied on non-stationary data using Daubechies for LL
Differencing is applied to non-stationary time series data to convert it into stationary time series data. Among the four decomposed sub-parts of the Daubechies wavelet, the differenced value of LL is shown in Figure 5.26. The differenced value of LL is depicted along the y-axis against the total number of entries on the x-axis. There is a certain amount of distinction between the consumed, rendered, and adjacent rendered values of consumed electricity.
Figure 5:27 Differencing Applied on non-stationary data using Daubechies for LH
The case in which differencing is applied to the LH part of the non-stationary data is depicted in Figure 5.27. The rendered data is the stationary data obtained after differencing is applied to the 4th segment of the original data.
Figure 5:28 Differencing Applied on non-stationary data using Daubechies for HL
The result of applying the differencing approach to the HL part of the non-stationary data is shown in Figure 5.28. Here, the time series is converted into a stationary one by differencing, which is a way to turn a non-stationary time series into a stationary one. This is a significant step, which must be carried out before the data is supplied to the ARIMA model.
Figure 5:29 Differencing Applied on non-stationary data using Daubechies for HH
Consumption data is the data that is evaluated after the application of DWT. HH represents a higher minimum and higher maximum value. Figure 5.29 shows the differenced consumed electricity obtained using Daubechies for HH applied on non-stationary data. The curves of consumed, rendered, and adjacent rendered electricity show that there is no significant difference among them. The rendered plot is obtained after the removal of the linear trend from HH. The same goes for LH, LL, and HL.
Figure 5:30 LL-Autocorrelation plot for Daubechies
An autocorrelation plot for the LL band of the Daubechies approach is shown in Figure 5.30. The graph represents the time series relationship of the uploaded data. In the graph, a vertical line is drawn for each lag, and the height of each line represents the value of the autocorrelation obtained for that lag. The autocorrelation at lag zero is always equal to 1 because it is the correlation of each term with itself.
Figure 5:31 LH-Autocorrelation plot for Daubechies
The autocorrelation graphs plotted for LL, LH, HL, and HH for different lags are shown in Figure 5.30, Figure 5.31, Figure 5.32, and Figure 5.33, respectively. Values close to 0 indicate weak correlation, whereas values higher than 0.5 indicate strong correlation. Here, a boundary line has been set, and the values that fall inside it are considered for the training process. Using this approach, the training process becomes more accurate, which increases the prediction capability of the designed model.
Figure 5:32 HL-Autocorrelation plot for Daubechies
The sample autocorrelation determined for the HL Daubechies approach lies between +0.7 and -0.5, as shown in Figure 5.32.
Figure 5:33 HH-Autocorrelation plot for Daubechies
In Figure 5.33, the HH autocorrelation is plotted for Daubechies. The obtained sample autocorrelation ranges between +0.5 and -0.5 across the lags.
Figure 5:34 ARIMA applied on LL segment of Daubechies for segment-based predictions
The prediction graphs obtained for the designed ARIMA model using the Daubechies wavelet for the LL, LH, HL, and HH segments are shown in Figure 5.34, Figure 5.35, Figure 5.36, and Figure 5.37, respectively.
Figure 5:35 ARIMA applied on LH segment of Daubechies for segment-based predictions
Figure 5.35 shows the ARIMA model applied to the LH segment of Daubechies for segment-based predictions. The predictions are plotted vertically against the number of readings.
Figure 5:36 ARIMA applied on HL segment of Daubechies for segment-based predictions
Figure 5.36 depicts the segment-based predictions obtained by applying Daubechies on the HL segment. In this graph, the predictions and the number of readings are plotted vertically and horizontally, respectively.
Figure 5:37 ARIMA applied on HH segment of Daubechies for segment-based predictions
Figure 5.37 shows the segment-based predictions obtained by applying the ARIMA model on the HH segment. The black and red curves depict the ARIMA prediction and the original consumption of electricity, respectively.
Figure 5:38 Segmentation of Time Series Data using HAAR decomposition fed to cuckoo search and further to NN
From the previous experimental results, it was observed that the Haar wavelet transform performed best. So, for accurate forecasting results, the cuckoo search (CS) optimization algorithm is applied to optimize the Haar wavelet transform. Figure 5.38 represents the segmentation of the time series data using the decomposition technique, after which it is fed to CS optimization and then further to a neural network (NN). The four decomposed segments are represented through different colors: red (LL), blue (LH), black (HL), and green (HH).
ANN is an efficient and successful alternative to ARIMA models for predicting time series with distinctive features.
Algorithm: Artificial Neural Network (ANN)
Start
Initialize ANN and define the basic features as input/training data (T-Data), Target (TR), and Neurons (N)
Set Model-Net = Newff(T-Data, TR, N)
Model-Net.TrainParam.Epoch = 1000
Model-Net.Ratio.Training = 70%
Model-Net.Ratio.Testing = 15%
Model-Net.Ratio.Validation = 15%
Model-Net = Train(Model-Net, T-Data, TR)
Current Data = Features of real-time data
Prediction = Simulate(Model-Net, Current Data)
If Prediction = True
    Results = Show predicted data
End
Return: Results in terms of prediction
End
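The 70/15/15 division in the algorithm above can be illustrated with a short Python sketch. It covers only the random division of the 60 monthly samples and is not the MATLAB network itself; the helper name split_indices and the fixed seed are assumptions made so that the example is reproducible.

```python
import random

# Illustrative sketch only: random 70/15/15 division of sample indices.
# `split_indices` and the seed value are assumptions for this example.

def split_indices(n_samples, train=0.70, val=0.15, seed=0):
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train)
    n_val = round(n_samples * val)
    # Whatever remains after training and validation is used for testing.
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

# 60 monthly samples, January 2013 to December 2017.
train_idx, val_idx, test_idx = split_indices(60)
print(len(train_idx), len(val_idx), len(test_idx))
```

A random division ignores the time order of the samples; a chronological split is a common alternative for time series, but the algorithm above specifies only the ratios.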
There are some important steps that are followed to train the network and obtain the result, as defined below one after another:
i. Design of network structure: The structural design of the network includes the input layer, hidden layers, and output layer.
ii. Number of hidden layers: The hidden layer is utilized to analyze the problem and provide the best solution for that particular problem. Theoretically, a hidden layer with a defined total number of neurons is utilized for a particular task. In most cases, more than one hidden layer is utilized to get an efficient result.
iii. Number of output nodes: The number of output nodes usually depends on the number of input nodes.
iv. Evaluation criteria: The most commonly used error function, which can be computed easily, is the sum of squared errors. There are also a few error functions produced by other methods, such as the least absolute deviation.
v. Training of the Neural Network (NN): To train a neural network, it is necessary for it to learn various patterns in the data. The training data must include accurate solutions in order to produce an effective learning process. The network is trained to reach the global optimum by adjusting the large set of weights among neurons.
Figure 5:39 Training Structure of ANN
The training process for load forecasting using the time series data (TSD) taken from PSPCL is provided in Figure 5.39. The figure is composed of four panels, from top to bottom in the neural network (NN) training tool: (a) Neural Network, (b) Algorithms, (c) Progress, and (d) Plots. All these panels are explained below in detail. In the training structure of the ANN, the neural network is composed of three units: the input, followed by the hidden layer, with the output layer as the final unit. According to the figure, only one neuron is passed to the input layer; 19 neurons are added in the hidden layer, producing 20 neurons. The obtained 20 neurons are transferred to the output layer, where they are reduced through the weight value (w) and bias value (b). The total number of neurons obtained at the output layer is then 1.
The algorithms used by the ANN for training are listed in the Algorithms section. The division of data is done randomly, and the algorithm used for training is Levenberg-Marquardt. The performance of this training algorithm is measured in terms of Mean Square Error (MSE), and the calculations are done in the form of MEX. The training progress is shown under the Progress panel and is measured in terms of the following parameters: epoch (3 iterations), time, performance (depicted next), gradient, mu (the Levenberg-Marquardt damping parameter), and validation checks. If the stopping condition for any one of these parameters is met, the training of the system is complete.
Figure 5:40 Performance
The performance of the training algorithm, measured in terms of Mean Square Error (MSE), is given in Figure 5.40. The training process on the load forecasting data is completed for the 1st iteration. In this figure, the epochs vary from 0 to 3 along the x-axis, and the MSE varies from 10^-2 to 10^5 along the y-axis. In the plotted graph, there are four different lines in distinct styles, namely blue, green, red, and dotted, which denote the training, validation, test, and best values of the ANN, respectively. The best validation performance, 1816.7979, is obtained at epoch 1.
Figure 5:41 Training State
The training state of the ANN classifier is represented in graphical form in Figure 5.41. The waveforms are obtained after the completion of the training process and show the gradient, mu, and validation checks, whose obtained values are 5.035e-08, 1e-06, and 2, respectively, for a maximum of 3 epochs.
Figure 5:42 Regression
The regression parameter is used to validate the performance of the proposed work, i.e., forecasting the future electricity consumption of Punjab state. The graph in Figure 5.42 represents the network outputs against the targets for the training, validation, and test data. The regression value obtained for training is 0.22608. For validation and testing, the obtained regression values are 0.006307 and 0, respectively. The overall regression value is 0.1426, from which it is concluded that the presented solution provides a better architecture of the training set and is best suited for the classification.
Figure 5:43 Differencing Applied on non-stationary LL data segment of Cuckoo-NN
The segmented data obtained using the Haar transform is optimized using the CS approach and then classified using the NN approach. After this, differencing is performed on the data to obtain stationary information. From the graph, more stability is seen compared with the previous results obtained without the application of the CS and ANN approach. The data obtained after differencing is applied to the LL, LH, HL, and HH data segments of Cuckoo-NN are depicted in Figure 5.43, Figure 5.44, Figure 5.45, and Figure 5.46, respectively.
Figure 5:44 Differencing Applied on non-stationary LH data segment of Cuckoo-NN
Figure 5.44 shows the differenced value of consumed electricity on the non-stationary LH segment of the data after the application of cuckoo search combined with the neural network. The three differently colored curves, black, red, and green, show the original consumed, rendered, and adjacent rendered electricity, respectively. It is clear from the graph that the rendered value of electricity does not differ much from the original consumed value.
Figure 5:45 Differencing Applied on non-stationary HL data segment of Cuckoo-NN
Figure 5.45 represents the differenced value of electricity for the HL segment after applying cuckoo search with NN on non-stationary data. According to the figure, the black curve is far from both the red and the green curves. This means that the originally consumed electricity differs by a large amount from the rendered as well as the adjacent rendered electricity values.
Figure 5:46 Differencing Applied on non-stationary HH data segment of Cuckoo-NN
Figure 5.46 shows the differenced value of the HH data segment obtained by applying cuckoo search with a neural network. As per the figure, the difference between the electricity values increases as the number of entries increases.
Figure 5:47 ARIMA applied on Cuckoo-NN optimized LL segment for segment based predictions
The prediction results using CS with ANN in addition to the ARIMA model for the four decomposed bands of the test data (LL, LH, HL, and HH) are shown in Figure 5.47, Figure 5.48, Figure 5.49, and Figure 5.50, respectively. From all the prediction results, it has been concluded that the best prediction is obtained using the HL band, followed by the HH band.
Figure 5:48 ARIMA applied on Cuckoo-NN optimized LH segment for segment based predictions
Figure 5.48 represents the segment-based predictions for the LH data segment after applying the ARIMA model with cuckoo search and NN. The obtained predictions and the number of readings are plotted on the vertical and horizontal axes, respectively.
Figure 5:49 ARIMA applied on Cuckoo-NN optimized HL segment for segment-based predictions
Figure 5.49 represents the segment-based predictions for the optimized HL segment. These are obtained by applying the ARIMA model to the output of the combined CS and NN technique.
Figure 5:50 ARIMA applied on Cuckoo-NN optimized HH segment for segment based predictions
Figure 5.50 shows the segment-based predictions for the HH segment after applying the ARIMA model to the hybrid model obtained by combining cuckoo search and NN. The red and black curves represent the original consumption and the ARIMA prediction, respectively.
In this research work, the results have been computed in the MATLAB simulator with the real-time data obtained from PSPCL. The actual dataset of electricity consumption covers the period from January 2013 to December 2017. The year-wise comparison of actual versus predicted electricity consumption for consecutive years through the different techniques, namely ARIMA, ARIMA with DWT (Haar and Daubechies), ARIMA with ABC and ANN, and ARIMA with DWT, CSA, and ANN, is provided below in tabular as well as graphical form.
5.2.1 Prediction using ARIMA Model
In this section, the actual as well as predicted consumption of electricity by using
only the ARIMA model is given below.
Table 5.2 Original and predicted electricity consumption using ARIMA for year 2013
Base year 2013   Actual Electricity Consumption (KWh)   Predicted Electricity Consumption, ARIMA (KWh)
Jan-13 2402929222 2605929243
Feb-13 2302632737 2512932937
Mar-13 2302632737 2512872739
Apr-13 2315442012 2517472037
May-13 2763422169 2953422175
Jun-13 3409077307 3618069309
Jul-13 4283260815 4458260827
Aug-13 4203919034 4404119058
Sep-13 4320843017 4541942132
Oct-13 3284669791 3480639189
Nov-13 2660633086 2819643092
Dec-13 2510323489 2723536491
Table 5.2 provides the actual as well as the predicted consumption of electricity for the year 2013. The average actual energy consumption is 3063315451 KWh, and the average predicted energy consumption is 3262403269 KWh.
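The averages quoted for 2013 can be reproduced directly from the tabulated values. The Python sketch below does this and, as an addition for illustration only, also computes the mean absolute percentage error (MAPE) and root mean square error (RMSE) between the two columns; these two measures and the helper names are assumptions of this example rather than figures reported in the table.

```python
import math

# Illustrative sketch only: summary statistics for the 2013 ARIMA results.
# Values are copied from the table of actual and predicted consumption (KWh).
actual = [
    2402929222, 2302632737, 2302632737, 2315442012,
    2763422169, 3409077307, 4283260815, 4203919034,
    4320843017, 3284669791, 2660633086, 2510323489,
]
predicted = [
    2605929243, 2512932937, 2512872739, 2517472037,
    2953422175, 3618069309, 4458260827, 4404119058,
    4541942132, 3480639189, 2819643092, 2723536491,
]

def mean(values):
    return sum(values) / len(values)

def mape(a, p):
    """Mean absolute percentage error, in percent."""
    return 100.0 * mean([abs(x - y) / x for x, y in zip(a, p)])

def rmse(a, p):
    """Root mean square error, in the unit of the data (KWh)."""
    return math.sqrt(mean([(x - y) ** 2 for x, y in zip(a, p)]))

print("mean actual   :", round(mean(actual)))
print("mean predicted:", round(mean(predicted)))
print("MAPE (%)      :", round(mape(actual, predicted), 2))
print("RMSE (KWh)    :", round(rmse(actual, predicted)))
```

The same calculation can be repeated for the other years by substituting the corresponding columns.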
Figure 5:51 Electricity consumption Actual and Predicted using ARIMA of the Year 2013
Figure 5.51 shows the original and predicted values of electricity consumption using ARIMA. In this figure, the X-axis depicts the months of the year 2013, and the Y-axis depicts the energy consumption (KWh). The blue line shows the actual electricity consumption, and the red line shows the electricity consumption predicted using ARIMA.
Table 5.3 Actual and predicted electricity consumption using ARIMA for year 2014
Base year 2014   Actual Electricity Consumption (KWh)   Predicted Electricity Consumption, ARIMA (KWh)
Jan-14 2406670480 2618670409
Feb-14 2543659240 2737659298
Mar-14 2262755935 2471756951
Apr-14 2229496325 2479576319
May-14 2663094360 2813094134
Jun-14 4135881561 4315821563
Jul-14 5048078068 5248078065
Aug-14 5150618207 5350618211
Sep-14 4184285964 4384285966
Oct-14 3525215760 3725215762
Nov-14 2914127320 3114127323
Dec-14 2691580750 2891580753
As per Table 5.3, the average actual and predicted energy consumption are 3312955331 and 3515040238, respectively.
Figure 5:52 Electricity consumption original and Predicted using ARIMA of the Year 2014
Figure 5.52 shows the original and predicted values of electricity consumption using ARIMA. In this figure, the X-axis depicts the months of the year 2014, and the Y-axis depicts the energy consumption (KWh). The blue line shows the actual electricity consumption, and the red line shows the electricity consumption predicted using ARIMA.
Table 5.3 Original and predicted electricity consumption using ARIMA of the year 2015
Base year 2015  Actual Electricity Consumption  Predicted Electricity Consumption (ARIMA)
Jan-15          2425671683                      2625671685
Feb-15          2705001417                      2905001419
Mar-15          2387522949                      2587522951
Apr-15          2253062900                      2553062903
May-15          2884345926                      3184345929
Jun-15          3461754270                      3661754272
Jul-15          4895975328                      5095975330
Aug-15          5076894389                      5276894391
Sep-15          4867004448                      5067004450
Oct-15          3787014752                      3987014754
Nov-15          2804034438                      3004034440
Dec-15          2719630627                      2919630629
As per Table 5.3, the average values of actual and predicted energy
consumption are 3355659427 and 3572326096 respectively.
Figure 5.53 Actual and predicted electricity consumption using ARIMA for the year 2015
Figure 5.53 shows the actual and predicted values of electricity consumption using
ARIMA. The X-axis depicts the months of the year 2015, and the Y-axis depicts the
energy consumption in kWh. The blue line shows the actual electricity consumption, and
the red line shows the predicted electricity consumption.
Table 5.4 Original and predicted electricity consumption using ARIMA for year 2016
Base year 2016  Actual Electricity Consumption  Predicted Electricity Consumption (ARIMA)
Jan-16          2567913778                      2767913779
Feb-16          2740311321                      2940311325
Mar-16          2942162298                      3142162300
Apr-16          2522244918                      2722244920
May-16          3253239971                      3453239973
Jun-16          4500446696                      4700446698
Jul-16          5018252044                      5318252044
Aug-16          4999539492                      5199539494
Sep-16          5404195697                      5604195699
Oct-16          3958725442                      4158725444
Nov-16          3130248433                      3330248436
Dec-16          2921627225                      3121627227
Figure 5.54 Actual and predicted electricity consumption using ARIMA for the year 2016
Figure 5.54 shows the actual and predicted electricity consumption using ARIMA.
In this figure, the X-axis depicts the months of the year 2016, and the Y-axis depicts
energy consumption (kWh). The blue line shows the actual electricity consumption, and
the red line shows the electricity consumption predicted using ARIMA. The average
actual energy consumption is 3663242276, and the average predicted energy
consumption is 3871575612.
Table 5.5 Actual and predicted electricity consumption using ARIMA of the year 2017
Base year 2017  Actual Electricity Consumption  Predicted Electricity Consumption (ARIMA)
Jan-17          2902848056                      3102848058
Feb-17          2958390111                      3158390114
Mar-17          3209855845                      3409855847
Apr-17          2782373785                      2982373787
May-17          3705371880                      3905371882
Jun-17          4450434098                      4650434100
Jul-17          5810138389                      6010138391
Aug-17          5645879830                      5845879832
Sep-17          5273395962                      5473395964
Oct-17          4196118613                      4396118616
Nov-17          3191950404                      3391950406
Dec-17          3030487191                      3230487193
Table 5.5 shows the actual and predicted electricity consumption using ARIMA.
Figure 5.55 Actual and predicted electricity consumption using ARIMA for the year 2017
In Figure 5.55, the X-axis depicts the months of the year 2017, and the Y-axis depicts
energy consumption (kWh). The blue line shows the actual electricity consumption, and
the red line shows the electricity consumption predicted using ARIMA. The average
actual electricity consumption is 3929770347, and the average predicted energy
consumption is 4129770349.
5.2.2 Prediction using ARIMA with DWT
This section provides the actual and predicted electricity consumption values obtained
using ARIMA with DWT. The corresponding tables and graphs are presented below,
year by year.
Table 5.6 Original and predicted electricity consumption using ARIMA with DWT for year 2013
Base year 2013  Actual Electricity Consumption  Predicted Electricity Consumption
Jan-13          2402929222                      2512927412
Feb-13          2302632737                      2412832137
Mar-13          2302632737                      2512432431
Apr-13          2315442012                      2503142081
May-13          2763422169                      2613424165
Jun-13          3409077307                      3318076315
Jul-13          4283260815                      4363260314
Aug-13          4203919034                      4318619035
Sep-13          4320843017                      4419843015
Oct-13          3284669791                      3324569175
Nov-13          2660633086                      2725243173
Dec-13          2510323489                      2631673474
As per Table 5.6, the average actual energy consumption is 3063315451, and the
average value predicted by ARIMA with DWT is 3138003561. The two averages differ
by about 2.4%.
Figure 5.56 Actual and predicted electricity consumption using ARIMA with DWT for the year 2013
Figure 5.56 shows the actual and predicted electricity consumption using ARIMA
with DWT. In this figure, the X-axis depicts the months of the year 2013, and the
Y-axis depicts energy consumption (kWh). The blue line shows the actual electricity
consumption, and the red line shows the consumption predicted by ARIMA with DWT.
Table 5.7 Actual and predicted electricity consumption using ARIMA and DWT for year 2014
Base year 2014  Actual Electricity Consumption  Predicted Electricity Consumption
Jan-14          2406670480                      2575279317
Feb-14          2543659240                      2637653215
Mar-14          2262755935                      2358743187
Apr-14          2229496325                      2379496431
May-14          2663094360                      2713719384
Jun-14          4135881561                      4215633691
Jul-14          5048078068                      5186154157
Aug-14          5150618207                      5274619108
Sep-14          4184285964                      4251293903
Oct-14          3525215760                      3405615215
Figure 5.57 Actual and predicted electricity consumption using ARIMA with DWT for the year 2014
Figure 5.57 shows the actual and predicted electricity consumption values obtained
using ARIMA with DWT. The X-axis depicts the months of the year 2014, and the Y-axis
depicts energy consumption in kWh. The blue line shows the actual electricity
consumption, and the red line shows the consumption predicted by ARIMA with DWT.
The average actual energy consumption is 3312955331, and the average value predicted
using ARIMA with DWT is 3401279918. The two averages differ by about 2.7%.
Table 5.8 Actual and predicted electricity consumption using ARIMA with DWT of the year 2015
Base year 2015  Actual Electricity Consumption  Predicted Electricity Consumption
Jan-15          2425671683                      2619535715
Feb-15          2705001417                      2815041757
Mar-15          2387522949                      2413729279
Apr-15          2253062900                      2315762148
May-15          2884345926                      2915435928
Jun-15          3461754270                      3557148373
Jul-15          4895975328                      4935971351
Aug-15          5076894389                      5176894383
Sep-15          4867004448                      4961374389
Oct-15          3787014752                      3817214768
Nov-15          2804034438                      2915036481
Dec-15          2719630627                      2808931643
Figure 5.58 Actual and predicted electricity consumption using ARIMA with DWT for the year 2015
As shown in Figure 5.58, for the year 2015 the average actual energy consumption
is 3355659427, and the average value predicted using ARIMA with DWT is
3437673018. The two averages differ by about 2.4%.
Table 5.9 Actual and predicted electricity consumption using ARIMA with DWT for year 2016
Base year 2016  Actual Electricity Consumption  Predicted Electricity Consumption
Jan-16          2567913778                      2618313239
Feb-16          2740311321                      2841312743
Mar-16          2942162298                      3012963215
Apr-16          2522244918                      2624247980
May-16          3253239971                      3354231954
Jun-16          4500446696                      4710647696
Jul-16          5018252044                      5177523519
Aug-16          4999539492                      5039731427
Sep-16          5404195697                      5514396712
Oct-16          3958725442                      4091795405
Nov-16          3130248433                      3291248174
Dec-16          2921627225                      3021523257
Table 5.9 shows the actual and predicted electricity consumption values obtained
using ARIMA with DWT.
Figure 5.59 Actual and predicted electricity consumption using ARIMA with DWT for the year 2016
Figure 5.59 shows the actual and predicted electricity consumption values using
ARIMA with DWT in graph form. In this figure, the X-axis depicts the months of the
year 2016, and the Y-axis depicts energy consumption (kWh). The blue line shows the
actual electricity consumption, and the red line shows the consumption predicted
using ARIMA with DWT. The average actual energy consumption is 3663242276, and
the average value predicted by ARIMA with DWT is 3774827943. The two averages
differ by about 3.0%.
Table 5.10 Actual and Predicted electricity consumption using ARIMA with DWT for the year 2017
Base year 2017  Actual Electricity Consumption  Predicted Electricity Consumption
Jan-17          2902848056                      3091431548
Feb-17          2958390111                      3028196475
Mar-17          3209855845                      3318257943
Apr-17          2782373785                      2812353193
May-17          3705371880                      3819365814
Jun-17          4450434098                      4518914219
Jul-17          5810138389                      5943118371
Aug-17          5645879830                      5748170817
Sep-17          5273395962                      5384295867
Oct-17          4196118613                      4385918643
Nov-17          3191950404                      3201805432
Dec-17          3030487191                      3158647865
Figure 5.60 Actual and predicted electricity consumption using ARIMA with DWT for the year 2017
Figure 5.60 shows the actual and predicted electricity consumption using ARIMA
with DWT. In this figure, the X-axis depicts the months of the year 2017, and the
Y-axis depicts energy consumption (kWh). The blue line shows the actual electricity
consumption, and the red line shows the consumption predicted by ARIMA with DWT.
As shown in Table 5.10, the average actual energy consumption is 3929770347, and
the average value predicted by ARIMA with DWT is 4034206349. The two averages
differ by about 2.7%.
5.2.3 Prediction using the Proposed Hybrid Model
The previous sections presented forecasts from two techniques: ARIMA alone and
ARIMA with DWT. The values predicted using ARIMA alone vary considerably from the
actual electricity consumption. When DWT is used for decomposition before applying
ARIMA, the predicted consumption is closer to the actual consumption than with the
ARIMA model alone. This sub-section gives a year-wise account of the actual and
predicted electricity consumption from January 2013 to December 2017. Here, the
focus is on forecasting using an integrated mechanism of CSA and ANN along with the
HAAR wavelet decomposition technique followed by the ARIMA model; the results are
given below in tabular and graphical form.
Table 5.11 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2013
Base year 2013  Actual Electricity Consumption  Predicted Electricity Consumption
Jan-13          2402929222                      2403030111
Feb-13          2302632737                      2302834848
Mar-13          2302632737                      2302432649
Apr-13          2315442012                      2317842088
May-13          2763422169                      2778422786
Jun-13          3409077307                      3409098654
Jul-13          4283260815                      4283234576
Aug-13          4203919034                      4203998778
Sep-13          4320843017                      4320897867
Oct-13          3284669791                      3284635543
Nov-13          2660633086                      2660667643
Dec-13          2510323489                      2510368943
Table 5.11 provides the actual electricity consumption values for the year 2013
together with the corresponding predicted values. The predicted values and the actual
values are close to each other: the average actual consumption is 3063315451, while
the average consumption predicted by the proposed hybrid model is 3064788707. The
original and predicted values fall into the same range with a small difference.
Figure 5.61 Actual and predicted values of electricity consumption using Proposed Hybrid Model for the year 2013
The graphical representation of the actual and predicted electricity consumption for
the year 2013 is given in Figure 5.61. The blue and red curves in the graph represent
the actual consumption and the consumption predicted by the proposed hybrid model
respectively. There is very little difference between the actual and predicted values,
so the actual curve overlaps the predicted curve. The vertical axis represents the
consumed electricity in kWh, and the horizontal axis represents the months of the year.
Table 5.12 Original and predicted electricity consumption using Proposed Hybrid Model for the year 2014
Base year 2014  Actual Electricity Consumption  Predicted Electricity Consumption
Jan-14          2406670480                      2406632260
Feb-14          2543659240                      2543624804
Mar-14          2262755935                      2236279532
Apr-14          2229496325                      2272945793
May-14          2663094360                      2663776436
Jun-14          4135881561                      4135897042
Jul-14          5048078068                      5048008640
Aug-14          5150618207                      5150087643
Sep-14          4184285964                      4184295425
Oct-14          3525215760                      3525243579
Nov-14          2914127320                      2914325679
Dec-14          2691580750                      2691535689
Table 5.12 depicts the actual and predicted electricity consumption for the year
2014. The average actual consumption is 3312955331, and the average predicted
consumption is 3314387710. There is insignificant variation between the actual and
predicted values.
Figure 5.62 Actual and predicted values of electricity consumption using Proposed Hybrid Model for the year 2014
Figure 5.62 represents the actual and predicted electricity consumption. The Y-axis
represents the electricity consumption, and the X-axis represents the months of the
year 2014 from January to December. The blue and red curves in the graph show the
actual and predicted values respectively.
Table 5.13 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2015
Base year 2015  Actual Electricity Consumption  Predicted Electricity Consumption
Jan-15          2425671683                      2425635790
Feb-15          2705001417                      2700245689
Mar-15          2387522949                      2308754670
Apr-15          2253062900                      2253035678
May-15          2884345926                      2884309875
Jun-15          3461754270                      3461733256
Jul-15          4895975328                      4859345678
Aug-15          5076894389                      5007646789
Sep-15          4867004448                      4867009865
Oct-15          3787014752                      3770567899
Nov-15          2804034438                      2804056789
Dec-15          2719630627                      2719634567
Table 5.13 shows the actual and predicted electricity consumption for the year
2015. The average actual electricity consumption is 3355659427, and the average
predicted electricity consumption is 3338498054. Hence there is no significant
difference between the actual and predicted values of electricity consumption.
Figure 5.63 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2015
The actual and predicted electricity consumption for the year 2015 is shown in
Figure 5.63. The blue curve denotes the actual consumption, and the red curve
denotes the predicted consumption. In the graph, the actual curve overlaps the
predicted curve because there is very little difference between the values.
Table 5.14 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2016
Base year 2016  Actual Electricity Consumption  Predicted Electricity Consumption
Jan-16          2567913778                      2567945678
Feb-16          2740311321                      2740310963
Mar-16          2942162298                      2904213480
Apr-16          2522244918                      2522123456
May-16          3253239971                      3253234568
Jun-16          4500446696                      4500412997
Jul-16          5018252044                      5020932568
Aug-16          4999539492                      4999500532
Sep-16          5404195697                      5400412340
Oct-16          3958725442                      3905872349
Nov-16          3130248433                      3103020975
Dec-16          2921627225                      2921606532
Table 5.14 presents the actual and predicted electricity consumption for the
year 2016. The average actual electricity consumption is 3663242276, and the
average predicted electricity consumption is 3672383639.
Figure 5.64 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2016
The graphical representation for the year 2016 is given in Figure 5.64. The electricity
consumption (kWh) is plotted along the Y-axis against the months of the year 2016
along the X-axis. The blue curve represents the actual consumption, and the red
curve represents the predicted consumption.
Table 5.15 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2017
Base year 2017  Actual Electricity Consumption  Predicted Electricity Consumption
Jan-17          2902848056                      2902832345
Feb-17          2958390111                      2905898637
Mar-17          3209855845                      3218576445
Apr-17          2782373785                      2723987754
May-17          3705371880                      3705344790
Jun-17          4450434098                      4404334578
Jul-17          5810138389                      5801010875
Aug-17          5645879830                      5645898752
Sep-17          5273395962                      5273897987
Oct-17          4196118613                      4196889768
Nov-17          3191950404                      3109112390
Dec-17          3030487191                      3004009043
Table 5.15 presents the actual and predicted electricity consumption for the
year 2017.
Figure 5.65 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2017
The average actual and predicted electricity consumption are 3929770347 and
3907684235 respectively. The actual and predicted consumption values are shown
graphically in Figure 5.65. The blue curve depicts the actual consumption, and the
red curve represents the predicted consumption.
Figure 5.66 Overall comparison of electricity consumption prediction of ARIMA + DWT and the proposed hybrid model with the original dataset
Figure 5.66 shows that the actual and predicted electricity consumption follow the
same trend, with little difference between the original and predicted values.
Prediction using ARIMA with DWT provides better results than ARIMA alone. The
average computed values for the original dataset, ARIMA with DWT, and the proposed
hybrid model are 3464988567, 3557198158, and 3455724556 respectively. Thus, the
average difference between the original data and the ARIMA with DWT model is 3%,
and that of the proposed model is 0.39%. The proposed hybrid model using ARIMA,
CSA, DWT and ANN therefore provides better results than both ARIMA alone and the
ARIMA with DWT approach.
Finally, to examine the performance of this research work, Mean Absolute
Percentage Error (MAPE), Mean Average Precision (MAP), and Accuracy (%) are
used as evaluation parameters. Each of them is explained below:
Mean Absolute Percentage Error (MAPE)
MAPE measures how much a dependent series varies from its model-predicted level.
This parameter is independent of units, so it can be used to compare series with
distinct units. MAPE expresses the error as a percentage, and it can be expressed
mathematically as:
$$\mathrm{MAPE} = \frac{100}{P} \sum_{t=1}^{P} \left| \frac{A_t - F_t}{A_t} \right|$$
where $A_t$ denotes the actual sequence, $F_t$ the forecasted electricity values, and
$P$ represents the number of samples.
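The MAPE computation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `mape` and the toy data are hypothetical, and the sketch assumes that no actual value is zero. It is not the evaluation code used in this work.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Averages |actual - forecast| / |actual| over all samples and
    multiplies by 100. Assumes no actual value is zero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical toy data (not taken from the thesis): each forecast is off by 10%.
toy_actual = [100.0, 200.0, 400.0]
toy_forecast = [110.0, 180.0, 440.0]
toy_mape = mape(toy_actual, toy_forecast)
print(round(toy_mape, 6))  # 10.0
```

Because each error is divided by the corresponding actual value, series measured in different units can be compared on the same percentage scale.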
Mean Average Precision (MAP)
Precision describes how close two or more measurements are to each other. The
precision value differs because of the prediction error. Higher precision indicates
that the measured result is consistent, and low precision indicates that the
measurement varies. However, higher precision does not always produce a better
result. The mathematical expression to compute precision is provided below:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
where TP = True Positive and FP = False Positive.
Mean average precision for any collection of electricity consumption datasets is
defined as the mean of the average precision scores obtained for each dataset by
applying the different techniques. The mathematical expression of MAP using
average precision is provided below; its terms denote the number of desired samples,
the number of retrieved samples, and the average precision at a given level.
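As a rough illustration of the two definitions above, the following Python sketch computes precision from TP and FP counts and takes MAP as the plain mean of a list of average precision scores. The function names and the numbers are hypothetical, and how each average precision score is obtained is left open here; this is an assumption-laden sketch, not the procedure used in this work.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("precision is undefined when TP + FP is zero")
    return tp / (tp + fp)


def mean_average_precision(ap_scores):
    """MAP as the plain mean of per-dataset average precision scores."""
    if not ap_scores:
        raise ValueError("at least one average precision score is required")
    return sum(ap_scores) / len(ap_scores)


# Hypothetical counts and scores, for illustration only.
p = precision(tp=9, fp=1)
m = mean_average_precision([0.5, 1.0, 0.75])
print(p, m)
```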
Accuracy (%)
Accuracy is the ability of the system to measure the correct value; that is, it
describes the closeness of the measured value to the true value. Accuracy can be
computed using small readings, through which the error is reduced. The accuracy can
be defined mathematically as provided below;
5.2.4 Computed parameters
To analyze the prediction performance of the various models, including the proposed
hybrid model, and to demonstrate the effectiveness of the proposed model, the results
for predicted electricity consumption are presented in Tables 5.16 to 5.18. The
parameters MAP (Mean Average Precision), MAPE (Mean Absolute Percentage Error) and
accuracy are used to analyze the performance.
Table 5.16 Computed MAP for ARIMA, ARIMA with DWT and Proposed Hybrid Model
MODEL                   MAP
ARIMA                   0.4623940
ARIMA with DWT          0.894494
Proposed Hybrid Model   0.94521
Table 5.16 lists the techniques used in the proposed work along with the computed
MAP values. The same values are shown graphically in Figure 5.67. The MAP of the
proposed hybrid model is 5.67% higher, in relative terms, than that of the ARIMA with
DWT model.
Figure 5.67 Computed MAP for ARIMA, ARIMA with DWT and Proposed Hybrid Model
From the graph, it is clearly seen that the maximum MAP of 0.94521 is attained
using the hybrid approach.
Table 5.17 Computed MAPE for ARIMA, ARIMA with DWT and Proposed Hybrid Model
MODEL                   MAPE
ARIMA                   44.239359
ARIMA with DWT          26.438503
Proposed Hybrid Model   8.944945
Table 5.17 lists the MAPE corresponding to the techniques used in the proposed
work. The Proposed Hybrid Model shows the lowest Mean Absolute Percentage Error
when compared with the other techniques.
Figure 5.68 Computed MAPE for ARIMA, ARIMA with DWT and Proposed Hybrid Model
The graphical representation of the MAPE is shown in Figure 5.68. The Y-axis
depicts the MAPE obtained for ARIMA, ARIMA with DWT, and the Proposed Hybrid
Model. A reduction in MAPE indicates that the performance of the system has
improved, and the MAPE of the proposed hybrid model is the lowest. Applying ARIMA
with the DWT technique decreases the MAPE by 17.800856 compared to using only
ARIMA, and applying the Proposed Hybrid Model decreases it by a further 17.493558
compared to ARIMA with DWT.
Table 5.18 Computed Accuracy for Different Combinations
Used techniques                Accuracy (%)
ARIMA                          83.53
ARIMA with Daubechies Wavelet  92.67
ARIMA with HAAR Wavelet        93.76
Proposed Hybrid Model          98.86
Table 5.18 lists the techniques used in this work. The ARIMA technique shows an
accuracy of 83.53%, ARIMA with the Daubechies wavelet an accuracy of 92.67%, ARIMA
with the HAAR wavelet an accuracy of 93.76%, and the Proposed Hybrid Model an
accuracy of 98.86%.
Figure 5.69 Computed Accuracy (%) for ARIMA, ARIMA with DWT, ARIMA with HAAR and Proposed Hybrid Model
Moving from ARIMA to ARIMA with the DWT technique gives an increase of 4.6% in
accuracy, as shown in Figure 5.69, whereas applying the Proposed Hybrid Model gives
an increment in accuracy of 1.15% compared to ARIMA with DWT.
Thus, in relative terms, the accuracy of the proposed work is higher by 18.35%,
6.68%, and 5.44% than that of the ARIMA, ARIMA with Daubechies, and ARIMA with
Haar techniques respectively.
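The figures of 18.35%, 6.68%, and 5.44% appear to be relative increases over each baseline's accuracy. A small Python sketch that recomputes them from the values in Table 5.18 is given below; the helper name `relative_increase` is illustrative and is an assumption, not part of this work.

```python
# Accuracy values (%) as listed in Table 5.18.
accuracy = {
    "ARIMA": 83.53,
    "ARIMA with Daubechies Wavelet": 92.67,
    "ARIMA with HAAR Wavelet": 93.76,
    "Proposed Hybrid Model": 98.86,
}


def relative_increase(baseline, improved):
    """Percentage increase of `improved` over `baseline`."""
    return 100.0 * (improved - baseline) / baseline


hybrid = accuracy["Proposed Hybrid Model"]
gains = {
    name: round(relative_increase(value, hybrid), 2)
    for name, value in accuracy.items()
    if name != "Proposed Hybrid Model"
}
print(gains)
```

Under this reading, 18.35% is the gain over ARIMA, 6.68% over ARIMA with the Daubechies wavelet, and 5.44% over ARIMA with the HAAR wavelet.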
CHAPTER 6: CONCLUSION AND FUTURE SCOPE
6.1 CONCLUSION
This research focuses on the design and development of a novel intelligent technique
that can be used to study the future behaviour of electricity consumption on the
basis of time series data. Analysing future behaviour in the presence of very
sudden changes in the time series of consumed electricity is very complex and a
major challenge for electricity providers and investors alike. However, the
benefits associated with accurate forecasting have prompted researchers to develop
new and advanced models.
Predicting the next values of a time series has been a major research problem that
attracts researchers from numerous fields. In this research, short-term forecasting is
studied using one-step-ahead and multi-step-ahead modeling, and three forecasting
models are compared, namely ARIMA, ARIMA with DWT and the proposed hybrid
model. Time series from different applications generally consist of both linear and
nonlinear variations. The linear ARIMA and ARIMA with DWT models cannot
accurately model this data separately. A hybrid model is therefore proposed, which
integrates the individual models ARIMA, DWT, CSA and ANN. By taking advantage of
all these techniques, a novel model with high prediction accuracy is designed.
The first, basic model is ARIMA with DWT, which was proposed in this research
using the statistical features of the ARIMA model. The MA filter was used to
decompose the available electricity time series into two data sets, consisting of the
lower and upper levels of the data, which were later used for forecasting with the
hybrid model. This hybrid model is able to predict electricity consumption at the
earliest. The designed model was applied to simulate time series data and forecast
electricity consumption in Punjab State.
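The idea of using an MA filter to split a series into two parts can be illustrated with a short Python sketch. The trailing window of three months, the function name, and the interpretation of the smooth part and the residual as the two levels of data are assumptions made only for this illustration; they are not taken from the thesis implementation.

```python
def moving_average_split(series, window=3):
    """Split a series into a smooth part and a residual part.

    The smooth part is a trailing moving average (shorter windows are used
    for the first few points); the residual is whatever the average leaves out.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smooth = []
    for i in range(len(series)):
        start = max(0, i - window + 1)
        chunk = series[start:i + 1]
        smooth.append(sum(chunk) / len(chunk))
    residual = [x - s for x, s in zip(series, smooth)]
    return smooth, residual


# First six monthly values of 2013 from Table 5.1.
load = [2402929222, 2302632737, 2302632737, 2315442012, 2763422169, 3409077307]
smooth, residual = moving_average_split(load, window=3)

# The two parts add back up to the original series.
rebuilt = [s + r for s, r in zip(smooth, residual)]
```

Each part can then be forecast separately and the forecasts summed, which is the general pattern that decomposition-based hybrid models follow.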
The final model is the proposed hybrid model using the Discrete Wavelet Transform
(DWT), the Cuckoo Search (CS) algorithm and an Artificial Neural Network (ANN),
which is used to estimate and forecast electricity demand/consumption using a
stochastic process. In the proposed electricity forecasting model, the wavelet
transform technique is applied to reduce the white noise present in the original
dataset, taken from PSPCL for the years 2013 to 2017, and thereby obtain a more
stable dataset than the original. DWT is applied as the wavelet transformation; it
decomposes a signal into a set of orthogonal basis functions of different
frequencies. The main feature of DWT is that it is a totally lossless transformation:
the original signal can be recovered by applying the inverse DWT. The electricity
data is non-stationary, as its consumption varies continuously over time. Therefore,
DWT is well suited to expressing this type of data. This makes it possible to
forecast highly accurate values using a simple technique, namely the ARIMA model.
The work has been performed using two wavelets, Daubechies and HAAR. After
analysis, it has been observed that prediction using the HAAR wavelet provides
better results than the Daubechies approach. Therefore, HAAR has been selected as
the wavelet decomposition approach, and then ARIMA and CS with ANN have been
applied to enhance the prediction performance of the proposed work.
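The lossless property of the DWT can be checked with a one-level Haar transform written in plain Python. This is a minimal sketch, assuming an even-length signal and the orthonormal Haar filters; the function names are illustrative, and the thesis implementation may use a different library, decomposition depth, or boundary handling.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / s)
        signal.append((a - d) / s)
    return signal


# Monthly actual consumption for 2013 (Table 5.1), used as the test signal.
load_2013 = [
    2402929222, 2302632737, 2302632737, 2315442012,
    2763422169, 3409077307, 4283260815, 4203919034,
    4320843017, 3284669791, 2660633086, 2510323489,
]
approx, detail = haar_dwt(load_2013)
restored = haar_idwt(approx, detail)

# Reconstruction error is limited to floating-point rounding.
max_error = max(abs(x - y) for x, y in zip(load_2013, restored))
```

The approximation coefficients carry the smooth, low-frequency part of the load and the detail coefficients carry the month-to-month fluctuations, so each can be modelled separately before the inverse transform recombines them.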
The predicted energy consumption values, as well as the MAP, MAPE, and Accuracy (%)
of the proposed model, have been compared with the traditional approaches. The
experimental values show that the values predicted using the proposed model are
highly correlated with the original dataset, which indicates that the designed model
is efficient and highly accurate in predicting electricity consumption. The accuracy
of the proposed model is higher by 15.33 percentage points than ARIMA, by 6.19
percentage points than ARIMA with Daubechies, and by 5.1 percentage points than
ARIMA with HAAR.
6.2 FUTURE SCOPE
In future, this work can be extended using other classifiers such as Support Vector
Machine (SVM), Fuzzy Logic, Convolutional Neural Network and other techniques.
Also, for data optimization, other techniques such as the Genetic Algorithm (GA),
Particle Swarm Optimization (PSO), and the Firefly algorithm can be used.
Here, the MA filter is used as a pre-processing scheme with the ARIMA model. Other
existing pre-processing methods presented in the literature can also be
experimentally validated for better prediction accuracy. An appropriate
pre-processing method can also be designed to suit the forecasting model and
thereby increase forecast accuracy.
The model can be applied to other statistical analyses to predict various time
series data, e.g. livestock products, agricultural yield, health expenditure,
currency exchange rates and many more.
RESEARCH PUBLICATIONS
1. Kaur, H. and Ahuja, S. (2017). Time series analysis and prediction of electricity consumption of health care institution using ARIMA model. Advances in Intelligent Systems and Computing, vol. 547. Springer, Singapore.
2. Kaur, H. and Ahuja, S. (2019). A Hybrid Arima and Discrete Wavelet Transform Model for Predicting the Electricity Consumption of Punjab. International Journal of Innovative Technology and Exploring Engineering, 8(11), pp. 1915-1919.
3. Kaur, H. and Ahuja, S. (2019). SARIMA Modelling for Forecasting the Electricity Consumption of a Health Care Building. International Journal of Innovative Technology and Exploring Engineering, 8(12), pp. 2795-2799.
142
REFERENCES
[1] Agnetis, A., De Pascale, G., Detti, P., &Vicino, A. (2013). Load scheduling for household energy consumption optimization. IEEE Transactions on Smart Grid, 4(4), 2364-2373.
[2] Pérez-Lombard, L., Ortiz, J., & Pout, C. (2008). A review on buildings energy consumption information. Energy and buildings, 40(3), 394-398.
[3] Taylor, J. W., McSharry, P. E., &Buizza, R. (2009). Wind power density forecasting using ensemble predictions and time series models. IEEE Transactions on Energy Conversion, 24(3), 775-782.
[4] Fu, T. C. (2011). A review on Time Series Data mining. Engineering Applications of Artificial Intelligence, 24(1), 164-181.
[5] Butcher, J. B., Verstraeten, D., Schrauwen, B., Day, C. R., & Haycock, P. W. (2013). Reservoir computing and extreme learning machines for non-linear time-series data analysis. Neural networks, 38, 76-89.
[6] Turchin, P. (1993). Chaos and stability in rodent population dynamics: evidence from non-linear time-series analysis. Oikos, 167-172.
[7] Brahim-Belhouari, S., &Bermak, A. (2004). Gaussian process for nonstationary time series prediction. Computational Statistics & Data Analysis, 47(4), 705-712.
[8] Fan, S., & Hyndman, R. J. (2010, December). Forecast short-term electricity demand using semi-parametric additive model. In 2010 20th Australasian Universities Power Engineering Conference (pp. 1-6).IEEE.
[9] Willis, H., &Aanstoos, J. (1979).Some unique signal processing applications in power system planning. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27(6), 685-697
[10] Reisen, V. A., & Lopes, S. (1999). Some simulations and applications of forecasting long-memory time-series models. Journal of Statistical Planning and Inference, 80(1-2), 269-287.
[11] Landassuri-Moreno, V. M., Bustillo-Hernández, C. L., Carbajal-Hernández, J. J., & Fernández, L. P. S. (2013, November). Single-step-ahead and multi-step-ahead prediction with evolutionary artificial neural networks. In Iberoamerican Congress on Pattern Recognition (pp. 65-72). Springer, Berlin, Heidelberg.
[12] Hansen, J. V., & Nelson, R. D. (2003). Forecasting and recombining time-series components by using neural networks. Journal of the Operational Research Society, 54(3), 307-317.
[13] Ho, S. L., Xie, M., & Goh, T. N. (2002). A comparative study of neural network and Box-Jenkins ARIMA modeling in time series prediction. Computers & Industrial Engineering, 42(2-4), 371-375.
[14] Fister, I., Yang, X. S., &Fister, D. (2014). Cuckoo search: a brief literature review. In Cuckoo search and firefly algorithm (pp. 49-62). Springer, Cham.
[15] Roy, S., & Chaudhuri, S. S. (2013). Cuckoo search algorithm using Lévy flight: a review. international journal of Modern Education and Computer Science, 5(12), 10.
[16] Mareli, M., & Twala, B. (2018). An adaptive Cuckoo search algorithm for optimisation. Applied computing and informatics, 14(2), 107-115.
143
[17] Gandomi, A. H., Yang, X. S., &Alavi, A. H. (2013). Cuckoo search algorithm: a metaheuristic approach to solve structural optimization problems. Engineering with computers, 29(1), 17-35.
[18] Jiang, P., Liu, F., Wang, J., & Song, Y. (2016). Cuckoo search-designated fractal interpolation functions with winner combination for estimating missing values in time series. Applied Mathematical Modelling, 40(23-24), 9692-9718.
[19] Kim, M. K. (2015). Short-term price forecasting of Nordic power market by combination Levenberg–Marquardt and Cuckoo search algorithms. IET Generation, Transmission & Distribution, 9(13), 1553-1563.
[20] Ong, P., & Zainuddin, Z. (2019). Optimizing wavelet neural networks using modified cuckoo search for multi-step ahead chaotic time series prediction. Applied Soft Computing, 80, 374-386.
[21] Awan, S. M., Aslam, M., Khan, Z. A., & Saeed, H. (2014). An efficient model based on artificial bee colony optimization algorithm with Neural Networks for electric load forecasting. Neural Computing and Applications, 25(7-8), 1967-1978.
[22] Tealab, A., Hefny, H., & Badr, A. (2017). Forecasting of nonlinear time series using ANN. Future Computing and Informatics Journal, 2(1), 39-47.
[23] Yao, X. (1999). Evolving artificial neural networks. Proceedings of the IEEE, 87(9), 1423-1447.
[24] Dawson, C. W., & Wilby, R. L. (2001). Hydrological modelling using artificial neural networks. Progress in physical Geography, 25(1), 80-108.
[25] Hippert, H. S., Pedreira, C. E., & Souza, R. C. (2001). Neural networks for short-term load forecasting: A review and evaluation. IEEE Transactions on power systems, 16(1), 44-55.
[26] Paliwal, M., & Kumar, U. A. (2009). Neural networks and statistical techniques: A review of applications. Expert systems with applications, 36(1), 2-17.
[27] Zhang, G., Hu, M. Y., Patuwo, B. E., & Indro, D. C. (1999). Artificial neural networks in bankruptcy prediction: General framework and cross-validation analysis. European journal of operational research, 116(1), 16-32.
[28] Qian, Z., Pei, Y., Zareipour, H., & Chen, N. (2019). A review and discussion of decomposition-based hybrid models for wind energy forecasting applications. Applied Energy, 235, 939-953.
[29] Zarnowitz, V., & Ozyildirim, A. (2006). Time series decomposition and measurement of business cycles, trends and growth cycles. Journal of Monetary Economics, 53(7), 1717-1739.
[30] Sang, Y. F. (2013). A review on the applications of wavelet transform in hydrology time series analysis. Atmospheric research, 122, 8-15.
[31] Rhif, M., Ben Abbes, A., Farah, I. R., Martínez, B., & Sang, Y. (2019). Wavelet transform application for/in non-stationary time-series analysis: a review. Applied Sciences, 9(7), 1345.
[32] Conejo, A. J., Plazas, M. A., Espinola, R., & Molina, A. B. (2005). Day-ahead electricity price forecasting using the wavelet transform and ARIMA models. IEEE transactions on power systems, 20(2), 1035-1042.
[33] Kiplangat, D. C., Asokan, K., & Kumar, K. S. (2016). Improved week-ahead predictions of wind speed using simple linear models with wavelet decomposition. Renewable Energy, 93, 38-44.
[34] Nourani, V., Baghanam, A. H., Adamowski, J., & Kisi, O. (2014). Applications of hybrid wavelet–artificial intelligence models in hydrology: a review. Journal of Hydrology, 514, 358-377.
[35] Hou, Z., Makarov, Y. V., Samaan, N. A., & Etingov, P. V. (2013, January). Standardized Software for Wind Load Forecast Error Analyses and Predictions Based on Wavelet-ARIMA Models--Applications at Multiple Geographically Distributed Wind Farms. In 2013 46th Hawaii International Conference on System Sciences (pp. 5005-5011). IEEE.
[36] Nandanwar, L., & Mamulkar, K. (2015). Supervised, semi-supervised and unsupervised WSD approaches: An overview. International Journal of Science and Research (IJSR), 4(2), 1684-1688.
[37] Caruana, R., & Niculescu-Mizil, A. (2006, June). An empirical comparison of supervised learning algorithms. In Proceedings of the 23rd international conference on Machine learning (pp. 161-168).
[38] Ghahramani, Z. (2003, February). Unsupervised learning. In Summer School on Machine Learning (pp. 72-112). Springer, Berlin, Heidelberg.
[39] Huang, G., Song, S., Gupta, J. N., & Wu, C. (2014). Semi-supervised and unsupervised extreme learning machines. IEEE transactions on cybernetics, 44(12), 2405-2417.
[40] Chapelle, O., Scholkopf, B., & Zien, A. (2009). Semi-supervised learning (Chapelle, O. et al., eds.; 2006) [book reviews]. IEEE Transactions on Neural Networks, 20(3), 542-542.
[41] Zhu, X., & Goldberg, A. B. (2009). Introduction to semi-supervised learning. Synthesis lectures on artificial intelligence and machine learning, 3(1), 1-130.
[42] Morgan, M. G., & Talukdar, S. N. (1979). Electric power load management: Some technical, economic, regulatory and social issues. Proceedings of the IEEE, 67(2), 241-312.
[43] Mishra, S., & Singh, S. N. (2015, December). Indian electricity market: Present status and future directions. In 2015 IEEE UP Section Conference on Electrical Computer and Electronics (UPCON) (pp. 1-7). IEEE.
[44] Rallapalli, S. R., & Ghosh, S. (2012). Forecasting monthly peak demand of electricity in India—A critique. Energy policy, 45, 516-520.
[45] Bhargava, N., & Gupta, S. (2006). The Punjab state electricity board: past, present and future. Panjab University research Journal (Arts), 33(2), 93-104.
[46] Willis, H., & Aanstoos, J. (1979). Some unique signal processing applications in power system planning. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27(6), 685-697.
[47] Fan, S., & Hyndman, R. J. (2010, December). Forecast short-term electricity demand using semi-parametric additive model. In 2010 20th Australasian Universities Power Engineering Conference (pp. 1-6). IEEE.
[48] Hong, T., Hsiang, S. M., & Xu, L. (2009, July). Human-machine co-construct intelligence on horizon year load in long term spatial load forecasting. In 2009 IEEE Power & Energy Society General Meeting (pp. 1-6). IEEE.
[49] Topalli, A. K., & Erkmen, I. (2003). A hybrid learning for neural networks applied to short term load forecasting. Neurocomputing, 51, 495-500.
[50] Ghiassi, M. D. K. Z., Zimbra, D. K., & Saidane, H. (2006). Medium term system load forecasting with a dynamic artificial neural network model. Electric power systems research, 76(5), 302-316.
[51] Carpinteiro, O. A., Leme, R. C., de Souza, A. C. Z., Pinheiro, C. A., & Moreira, E. M. (2007). Long-term load forecasting via a hierarchical neural model with time integrators. Electric Power Systems Research, 77(3-4), 371-378.
[52] Amjady, N., & Keynia, F. (2008). Mid-term load forecasting of power systems by a new prediction method. Energy Conversion and Management, 49(10), 2678-2687.
[53] Soares, L. J., & Medeiros, M. C. (2008). Modeling and forecasting short-term electricity load: A comparison of methods with an application to Brazilian data. International Journal of Forecasting, 24(4), 630-644.
[54] Pedregal, D. J., & Trapero, J. R. (2010). Mid-term hourly electricity forecasting based on a multi-rate approach. Energy Conversion and Management, 51(1), 105-111.
[55] Darbellay, G. A., & Slama, M. (2000). Forecasting the short-term demand for electricity: Do neural networks stand a better chance? International Journal of Forecasting, 16(1), 71-83.
[56] El-Telbany, M., & El-Karmi, F. (2008). Short-term forecasting of Jordanian electricity demand using particle swarm optimization. Electric Power Systems Research, 78(3), 425-433.
[57] Kandil, N., Wamkeue, R., Saad, M., & Georges, S. (2006, July). An efficient approach for short term load forecasting using artificial neural networks. In 2006 IEEE International Symposium on Industrial Electronics (Vol. 3, pp. 1928-1932). IEEE.
[58] Xiao, Z., Ye, S. J., Zhong, B., & Sun, C. X. (2009). BP neural network with rough set for short term load forecasting. Expert Systems with Applications, 36(1), 273-279.
[59] Catalão, J. P. D. S., Mariano, S. J. P. S., Mendes, V. M. F., & Ferreira, L. A. F. M. (2007). Short-term electricity prices forecasting in a competitive market: A neural network approach. Electric Power Systems Research, 77(10), 1297-1304.
[60] Goude, Y., Nedellec, R., & Kong, N. (2013). Local short and middle term electricity load forecasting with semi-parametric additive models. IEEE transactions on smart grid, 5(1), 440-446.
[61] Minaye, E., & Matewose, M. (2013). Long term load forecasting of Jimma town for sustainable energy supply. International Journal of Science and Research, 5(2), 1500-1504.
[62] Willis, H. L., & Romero, J. (2007). Spatial electric load forecasting methods for electric utilities. Quanta Technology.
[63] Hong, T., & Fan, S. (2016). Probabilistic electric load forecasting: A tutorial review. International Journal of Forecasting, 32(3), 914-938.
[64] Salvó, G., & Piacquadio, M. N. (2017). Multifractal analysis of electricity demand as a tool for spatial forecasting. Energy for Sustainable Development, 38, 67-76.
[65] Temraz, H. K., Salama, M. M. A., & Chikhani, A. Y. (1997, May). Review of electric load forecasting methods. In CCECE'97. Canadian Conference on Electrical and Computer Engineering. Engineering Innovation: Voyage of Discovery. Conference Proceedings (Vol. 1, pp. 289-292). IEEE.
[66] Al-Hamadi, H. M. (2011, September). Long-term electric power load forecasting using fuzzy linear regression technique. In 2011 IEEE Power Engineering and Automation Conference (Vol. 3, pp. 96-99). IEEE.
[67] AlRashidi, M. R., & El-Naggar, K. M. (2010). Long term electric load forecasting based on particle swarm optimization. Applied Energy, 87(1), 320-326.
[68] Al-Saba, T., & El-Amin, I. (1999). Artificial neural networks as applied to long-term demand forecasting. Artificial Intelligence in Engineering, 13(2), 189-197.
[69] Chatfield, C. (2001). Prediction intervals for time-series forecasting. In Principles of forecasting (pp. 475-494). Springer, Boston, MA.
[70] Khan, M. M. H., Muhammad, N. S., & El-Shafie, A. (2020). Wavelet based hybrid ANN-ARIMA models for meteorological drought forecasting. Journal of Hydrology, 125380.
[71] Dong, B., Li, Z., Rahman, S. M., & Vega, R. (2016). A hybrid model approach for forecasting future residential electricity consumption. Energy and Buildings, 117, 341-351.
[72] David, M., Ramahatana, F., Trombe, P. J., & Lauret, P. (2016). Probabilistic forecasting of the solar irradiance with recursive ARMA and GARCH models. Solar Energy, 133, 55-72.
[73] Alsharif, M. H., Younes, M. K., & Kim, J. (2019). Time series ARIMA model for prediction of daily and monthly average global solar radiation: The case study of Seoul, South Korea. Symmetry, 11(2), 240.
[74] Al-Musaylh, M. S., Deo, R. C., Adamowski, J. F., & Li, Y. (2018). Short-term electricity demand forecasting with MARS, SVR and ARIMA models using aggregated demand data in Queensland, Australia. Advanced Engineering Informatics, 35, 1-16.
[75] Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159-175.
[76] Bedi, J., & Toshniwal, D. (2019). Deep learning framework to forecast electricity demand. Applied energy, 238, 1312-1326.
[77] Khashei, M., & Bijari, M. (2010). An artificial neural network (p, d, q) model for timeseries forecasting. Expert Systems with applications, 37(1), 479-489.
[78] Babu, C. N., & Reddy, B. E. (2014). A moving-average filter based hybrid ARIMA–ANN model for forecasting Time Series Data. Applied Soft Computing, 23, 27-38.
[79] Khandelwal, I., Adhikari, R., & Verma, G. (2015). Time series forecasting using hybrid ARIMA and ANN models based on DWT decomposition. Procedia Computer Science, 48, 173-179.
[80] Lee, W. J., & Hong, J. (2015). A hybrid dynamic and fuzzy time series model for mid-term power load forecasting. International Journal of Electrical Power & Energy Systems, 64, 1057-1062.
[81] Rana, M., & Koprinska, I. (2016). Forecasting electricity load with advanced wavelet neural networks. Neurocomputing, 182, 118-132.
[82] Dudek, G. (2016). Pattern-based local linear regression models for short-term load forecasting. Electric Power Systems Research, 130, 139-147.
[83] Barak, S., & Sadegh, S. S. (2016). Forecasting energy consumption using ensemble ARIMA–ANFIS hybrid algorithm. International Journal of Electrical Power & Energy Systems, 82, 92-104.
[84] Zhou, H. C., Peng, Y., & Liang, G. H. (2008). The research of monthly discharge predictor-corrector model based on wavelet decomposition. Water resources management, 22(2), 217-227.
[85] Sun, T., Zhang, T., Teng, Y., Chen, Z., & Fang, J. (2019). Monthly Electricity Consumption Forecasting Method Based on X12 and STL Decomposition Model in an Integrated Energy System. Mathematical Problems in Engineering, 2019.
[86] Nury, A. H., Hasan, K., & Alam, M. J. B. (2017). Comparative study of wavelet-ARIMA and wavelet-ANN models for temperature Time Series Data in north eastern Bangladesh. Journal of King Saud University-Science, 29(1), 47-61.
[87] Pannakkong, W., & Huynh, V. N. (2017, October). A Hybrid Model of ARIMA and ANN with Discrete Wavelet Transform for Time Series Forecasting. In International Conference on Modeling Decisions for Artificial Intelligence (pp. 159-169). Springer, Cham.
[88] Vasilakis, G. A., Theofilatos, K. A., Georgopoulos, E. F., Karathanasopoulos, A., & Likothanassis, S. D. (2013). A genetic programming approach for EUR/USD exchange rate forecasting and trading. Computational economics, 42(4), 415-431.
[89] Hajirahimi, Z., & Khashei, M. (2019). Hybrid structures in time series modeling and forecasting: A review. Engineering Applications of Artificial Intelligence, 86, 83-106.
[90] Saab, S., Badr, E., & Nasr, G. (2001). Univariate modeling and forecasting of energy consumption: the case of electricity in Lebanon. Energy, 26(1), 1-14.
[91] de Oliveira, E.M. and Oliveira, F.L.C., 2018. Forecasting mid-long term electric energy consumption through bagging ARIMA and exponential smoothing methods. Energy, 144, pp.776-788.
[92] Nakanishi, I., Nishiguchi, N., Itoh, Y., & Fukui, Y. (2005). On‐line signature verification based on subband decomposition by DWT and adaptive signal processing. Electronics and Communications in Japan (Part III: Fundamental Electronic Science), 88(6), 1-11.
[93] van der Meer, D.W., Shepero, M., Svensson, A., Widén, J. and Munkhammar, J., 2018. Probabilistic forecasting of electricity consumption, photovoltaic power generation and net demand of an individual building using Gaussian Processes. Applied energy, 213, pp.195-207.
[94] Xiao, L., Shao, W., Yu, M., Ma, J. and Jin, C., 2017. Research and application of a hybrid wavelet neural network model with the improved cuckoo search algorithm for electrical power system forecasting. Applied Energy, 198, pp.203-222.
[95] Tealab, A., Hefny, H. and Badr, A., 2017. Forecasting of nonlinear time series using ANN. Future Computing and Informatics Journal, 2(1), pp.39-47.
[96] Suganthi, L. and Samuel, A.A., 2012. Energy models for demand forecasting—A review. Renewable and sustainable energy reviews, 16(2), pp.1223-1240.
[97] Feinberg, E.A. and Genethliou, D., 2005. Load forecasting. In Applied mathematics for restructured electric power systems (pp. 269-285). Springer, Boston, MA.
[98] Zhang, H., Zhang, S., Wang, P., Qin, Y. and Wang, H., 2017. Forecasting of particulate matter time series using wavelet analysis and wavelet-ARMA/ARIMA model in Taiyuan, China. Journal of the Air & Waste Management Association, 67(7), pp.776-788.
[99] Kriechbaumer, T., Angus, A., Parsons, D. and Casado, M.R., 2014. An improved wavelet–ARIMA approach for forecasting metal prices. Resources Policy, 39, pp.32-41.
[100] Li, D., 2018. Transforming time series for efficient and accurate classification (Doctoral dissertation, University of Luxembourg, Luxembourg, Luxembourg).
[101] Rafiei, M., Niknam, T., Aghaei, J., Shafie-Khah, M., & Catalão, J. P. (2018). Probabilistic load forecasting using an improved wavelet neural network trained by generalized extreme learning machine. IEEE Transactions on Smart Grid, 9(6), 6961-6971.
[102] Khandelwal, I., Satija, U. and Adhikari, R., 2015, July. Efficient financial time series forecasting model using DWT decomposition. In 2015 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT) (pp. 1-5). IEEE.
[103] Shumway, R.H. and Stoffer, D.S., 2017. Time series analysis and its applications: with R examples. Springer.
[104] Taieb SB, Huser R, Hyndman RJ, Genton MG. Forecasting uncertainty in electricity smart meter data by boosting additive quantile regression. IEEE Trans Smart Grid 2016;7(5):2448–55.
[105] Weron, R., 2007. Modeling and forecasting electricity loads and prices: A statistical approach (Vol. 403). John Wiley & Sons.
[106] Jin, J. and Kim, J., 2015. Forecasting natural gas prices using wavelets, time series, and artificial neural networks. PloS one, 10(11), p.e0142064.
[107] Yousefi, S., Weinreich, I. and Reinarz, D., 2005. Wavelet-based prediction of oil prices. Chaos, Solitons & Fractals, 25(2), pp.265-275.
[108] Tan, Z., Zhang, J., Wang, J. and Xu, J., 2010. Day-ahead electricity price forecasting using wavelet transform combined with ARIMA and GARCH models. Applied energy, 87(11), pp.3606-3610.
[109] Moazzami, M., Khodabakhshian, A. and Hooshmand, R., 2013. A new hybrid day-ahead peak load forecasting method for Iran’s National Grid. Applied Energy, 101, pp.489-501.
[110] Mellit, A., Benghanem, M. and Kalogirou, S.A., 2006. An adaptive wavelet-network model for forecasting daily total solar-radiation. Applied Energy, 83(7), pp.705-722.
[111] Liu, H., Tian, H.Q., Pan, D.F. and Li, Y.F., 2013. Forecasting models for wind speed using wavelet, wavelet packet, time series and Artificial Neural Networks. Applied Energy, 107, pp.191-208.
[112] Soltani, S., 2002. On the use of the wavelet decomposition for time series prediction. Neurocomputing, 48(1-4), pp.267-277.
[113] Ahmad, S., Popoola, A. and Ahmad, K., 2005. Wavelet-based multiresolution forecasting. University of Surrey, Technical Report.
[114] Shafie-Khah, M., Moghaddam, M.P. and Sheikh-El-Eslami, M.K., 2011. Price forecasting of day-ahead electricity markets using a hybrid forecast method. Energy Conversion and Management, 52(5), pp.2165-2169.
[115] Pindoriya, N.M., Singh, S.N. and Singh, S.K., 2008. An adaptive wavelet neural network-based energy price forecasting in electricity markets. IEEE Transactions On power systems, 23(3), pp.1423-1432.
[116] Bianchi, F.M., De Santis, E., Rizzi, A. and Sadeghian, A., 2015. Short-term electric load forecasting using echo state networks and PCA decomposition. Ieee Access, 3, pp.1931-1943.
[117] Gholipour Khajeh, M., Maleki, A., Rosen, M.A. and Ahmadi, M.H., 2018. Electricity price forecasting using neural networks with an improved iterative training algorithm. International Journal of Ambient Energy, 39(2), pp.147-158.
[118] Li, H., Li, Y. and Dong, H., 2017. A Comprehensive Learning-Based Model for Power Load Forecasting in Smart Grid. Computing and Informatics, 36(2), pp.470-492.
[119] Hernández, L., Baladrón, C., Aguiar, J.M., Calavia, L., Carro, B., Sánchez-Esguevillas, A., Pérez, F., Fernández, Á. and Lloret, J., 2014. Artificial neural network for short-term load forecasting in distribution systems. Energies, 7(3), pp.1576-1598.
[120] Hong, T., Wilson, J. and Xie, J., 2014. Long term probabilistic load forecasting and normalization with hourly information. IEEE Transactions on Smart Grid, 5(1), pp.456-462.
[121] Khairalla, M.A., Ning, X., AL-Jallad, N.T. and El-Faroug, M.O., 2018. Short-Term Forecasting for Energy Consumption through Stacking Heterogeneous Ensemble Learning Model. Energies, 11(6), pp.1-21.
[122] Khan, G.M. and Arshad, R., 2016. Electricity Peak Load Forecasting using CGP based Neuro Evolutionary Techniques. International Journal of Computational Intelligence Systems, 9(2), pp.376-395.