NOVEL HYBRID ELECTRIC LOAD FORECASTING

MODEL USING ARIMA MODEL AND DISCRETE

WAVELET TRANSFORM

THESIS

Submitted

in fulfilment of the requirements of the degree of

DOCTOR OF PHILOSOPHY

By

Harveen Kaur

1410931001

Supervised by

Dr. Sachin Ahuja

Professor & Director | Research

Department of Computer Science and Engineering

CHITKARA UNIVERSITY

CHANDIGARH-PATIALA NATIONAL HIGHWAY

RAJPURA (PATIALA) PUNJAB-140401 (INDIA)

February 2021

DECLARATION BY STUDENT

I hereby certify that the work which is being presented in this thesis entitled

“Novel hybrid electric load forecasting model using ARIMA model and

Discrete Wavelet Transform” is for fulfilment of the requirement for the

award of Degree of Doctor of Philosophy submitted in the Department of

Computer Science and Engineering, Chitkara University, Punjab.

The work has not formed the basis for the award of any other degree or

diploma, in this or any other Institution or University. In keeping with the

ethical practice in reporting scientific information, due acknowledgements have

been made wherever the findings of others have been cited.

Harveen Kaur

CERTIFICATE BY SUPERVISOR

This is to certify that the thesis entitled “Novel hybrid electric load

forecasting model using ARIMA model and Discrete Wavelet Transform”

submitted by Harveen Kaur to the Chitkara University, Punjab in fulfilment

for the award of the degree of Doctor of Philosophy is a bona fide record of

research work carried out by her under my supervision. The contents of this

thesis, in full or in parts, have not been submitted to any other Institution or

University for the award of any degree or diploma.

Dr. Sachin Ahuja

Professor & Director | Research

Chitkara University, Punjab

ACKNOWLEDGEMENT

I wish to thank all of those who made contributions in a variety of forms over the prolonged

period of researching and writing this thesis.

It gives me immense pleasure to express my deepest gratitude to my research

supervisor, Dr. Sachin Ahuja, Professor and Director (Research), Chitkara University,

Punjab, India, for his warm encouragement and thoughtful guidance. I am thankful to my

supervisor for all his contributions of time and ideas, which made my Ph.D. experience

productive and stimulating.

I am highly indebted to Mr. Arun Kumar Gupta, E-in-Chief, Planning, PSPCL, Patiala, and

his associated staff for arranging and providing the real energy consumption data of

Punjab State and for offering valuable ideas during my research work.

My special thanks go to the internal and external examiners, who provided valuable fresh

insight into the work and helped to make the research more conclusive.

My deepest gratitude goes to my father, Mr. Surinder Pal Singh, Executive Engineer, Office

of Chief Electrical Inspector, Punjab Govt., who has constantly supported me in making my

research work more effective, and to my mother, Mrs. Palvinder Kaur, for her unflagging

love and for being a constant source of inspiration and encouragement throughout the

personal journey that is part of this long research process. I also thank my sister, Ashmeet

Kaur, whose encouragement and constant support in reaching this goal have been the

greatest gift.

Above all, I am grateful to God Almighty, who sustains this beautiful world and without

whose grace nobody can ever succeed.

HARVEEN KAUR

ABSTRACT

Energy is central to the socio-economic and political world in which we live, and electricity

is the most important of its various forms. There is a persistent gap between the supply of

and the demand for electric energy. To meet the ever-increasing demand for electricity, an

accurate prediction model is needed. In the present work, an electricity consumption

forecasting model is designed for the State of Punjab, India, in which the dynamic

relationship among the time series entities is explored. Time series data can contain both

linear and nonlinear components, and linear or nonlinear models can be applied accordingly.

In this work, the linear technique used is the Auto-Regressive Integrated Moving Average

(ARIMA) model, whereas the nonlinear techniques used are an optimization algorithm, the

Cuckoo Search Algorithm (CSA), and an Artificial Neural Network (ANN). To achieve

optimum accuracy, instead of using these techniques individually, a hybrid model has been

developed and applied to the time series data of electricity consumption in Punjab. Initially,

the data is decomposed into two levels using the Discrete Wavelet Transform (DWT)

method. DWT is used to decompose the Punjab State Power Corporation Limited (PSPCL)

data into two parts based on electricity consumption, which helps to determine the highest

and the lowest electricity consumption. ARIMA is then applied individually to each

decomposed component to obtain a forecast series. Next, the Inverse Discrete Wavelet

Transform (IDWT) is applied to combine the data, which is further optimized using the

nature-inspired CSA technique. The ANN is trained by passing the optimized data to its

input layer and is then used to predict future electricity consumption. Using ARIMA alone,

the accuracy of the forecasting model is 83.53%; ARIMA with DWT increases the accuracy

by 10.23%, whereas ARIMA with the hybrid model of DWT, CSA and ANN increases the

accuracy by 15.33%. The proposed hybrid strategy (ARIMA, DWT, CSA and ANN)

therefore performs better than both existing forecasting models, i.e., the ARIMA model and

the ARIMA with DWT model.
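
As a rough illustration of the decomposition and recombination steps described above, the

following Python sketch performs a single-level Haar DWT and its inverse on a

one-dimensional series. It is not the MATLAB implementation used in this work: the function

names haar_dwt and haar_idwt and the sample load values are hypothetical, and the per-band

ARIMA, CSA and ANN stages are omitted.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT: return (approximation, detail) coefficient lists."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Scaled pairwise sums carry the smooth (low-frequency) part of the series.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Scaled pairwise differences carry the fluctuating (high-frequency) part.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the series from its two coefficient bands."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


# Hypothetical monthly consumption values, for illustration only (not PSPCL data).
load = [410.0, 395.0, 430.0, 520.0, 610.0, 640.0, 655.0, 600.0]
approx, detail = haar_dwt(load)
# In a wavelet-ARIMA pipeline, a forecaster would be fitted to each band at this point.
reconstructed = haar_idwt(approx, detail)
```

Because the Haar transform is orthogonal, recombining the unmodified bands returns the

original series; in a hybrid model the bands would instead be replaced by their forecasts

before the inverse transform is applied.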

SUMMARY

This thesis comprises six chapters, which together explain the research work and its

implementation.

Chapter 1: The introduction chapter presents the basic terminology of the research topic,

highlighting the concepts and the methods of implementation used in the research work on a

novel hybrid electric load forecasting model using the ARIMA model and the Discrete

Wavelet Transform.

Chapter 2: The literature review chapter provides the knowledge base for the topic, surveys

the work of other authors in the same area, and helps to identify gaps in the research.

Chapter 3: The methodology chapter presents the design and development of the proposed

hybrid forecasting model.

Chapter 4: The proposed work chapter describes the algorithms used to develop the

proposed hybrid forecasting model. The specific objectives are as follows:

1. To model the time series data and forecast future values using ARIMA.

2. To determine how the discrete wavelet transform resolves the difficulties in ARIMA

modeling.

3. To develop a hybrid model of ARIMA and wavelet transform.

4. To analyze the performance and accuracy of the proposed algorithm.

Chapter 5: This chapter discusses the results and the simulation of the research work, carried

out in MATLAB, to forecast electricity consumption using different techniques.

Chapter 6: This chapter presents the conclusion and future scope. The research focuses on

the design and development of a novel intelligent technique that can be used to study the

behaviour of electricity consumption on the basis of time series data.

TABLE OF CONTENTS

DECLARATION BY STUDENT ............................................................................................ i

CERTIFICATE BY SUPERVISOR ........................................................................................ ii

ACKNOWLEDGEMENT ..................................................................................................... iii

ABSTRACT .............................................................................................................................. iv

SUMMARY ............................................................................................................................... v

TABLE OF CONTENTS ......................................................................................................... vi

LIST OF FIGURES ................................................................................................................... x

LIST OF TABLES ................................................................................................................... xv

ABBREVIATIONS .............................................................................................................. xvii

Chapter 1: INTRODUCTION ................................................................................................ 1

1.1 BACKGROUND OF THE RESEARCH ......................................................................... 1

1.2 TIME SERIES DATA ...................................................................................................... 2

1.2.1 Linear vs. Non-linear ................................................................................................ 2

1.2.2 Periodic vs. Non-periodic ......................................................................................... 4

1.2.3 Gaussian and Non-Gaussian ..................................................................................... 5

1.2.4 Low Volatile and High volatile................................................................................. 6

1.2.5 Advantages of Electricity prediction ........................................................................ 9

1.3 TIME SERIES DATA PREDICTION MODELS .......................................................... 10

1.3.1 Auto-Regressive Integrated Moving Average (ARIMA) Model ............................ 10

1.3.2 Cuckoo Search Algorithm....................................................................................... 11

1.3.3 Cuckoos behavior for egg-laying ............................................................................ 12

1.3.4 Artificial Neural Networks (ANNs)........................................................................ 16

1.4 DECOMPOSITION BASED PREDICTION MODELS................................................ 19

1.4.1 Decomposition based on Moving average (MA) .................................................... 20

1.4.2 Discrete Wavelet Transform (DWT) ...................................................................... 21

1.4.3 Trend-ARIMA model ............................................................................................. 22

1.4.4 Wavelet-ARIMA model.......................................................................................... 22

1.5 CLASSIFICATION TECHNIQUES .............................................................................. 24

1.5.1 Supervised classification approach ......................................................................... 24

1.5.2 Unsupervised learning ............................................................................................ 25

1.5.3 Semi-supervised classification approach ................................................................ 26

1.6 PROBLEM STATEMENT ............................................................................................. 26

1.7 RESEARCH OBJECTIVES ........................................................................................... 27

1.8 RESEARCH GAPS ........................................................................................................ 28

Chapter 2: PRESENT STATE OF ART ............................................................................... 29

2.1 HISTORY OF ELECTRICITY SUPPLY ...................................................................... 29

2.2 PUNJAB STATE POWER CORPORATION LIMITED (PSPCL) ............................... 30

2.3 OVERVIEW ON LOAD FORECASTING .................................................................... 30

2.3.1 Type of Load Forecasting Technique ..................................................................... 31

2.4 CLASSIFICATION OF LOAD FORECASTING ......................................................... 36

2.4.1 Trending methods ................................................................................................... 37

2.5 LOAD FORECASTING METHODS............................................................................. 37

2.5.1 Literature review of studies using ARIMA or its Hybrid models for forecasting .... 39

2.5.2 ANN and ARIMA ................................................................................................... 41

2.5.3 Wavelet Decomposition based ARIMA and ANN ................................................. 45

Chapter 3: METHODOLOGY .............................................................................................. 47

3.1 METHODOLOGY OF THE PROPOSED HYBRID FORECASTING MODEL ......... 47

3.2 PROGRAMMING LANGUAGE ................................................................................... 49

Chapter 4: PROPOSED WORK ........................................................................................... 50

4.1 ALGORITHMS USED TO DEVELOP THE PROPOSED HYBRID FORECASTING

MODEL ................................................................................................................................ 50

4.1.1 ARIMA model ........................................................................................................ 50

4.1.2 DISCRETE WAVELET TRANSFORM (DWT) ................................................... 52

4.1.3 CUCKOO SEARCH OPTIMIZATION ALGORITHM ........................................ 54

4.1.4 ARTIFICIAL NEURAL NETWORK .................................................................... 56

4.2 TO MODEL THE TIME SERIES DATA AND FORECAST FUTURE VALUES

USING ARIMA .................................................................................................................... 58

4.3 TO DETERMINE HOW DISCRETE WAVELET TRANSFORM RESOLVES THE

DIFFICULTIES IN ARIMA MODELING .......................................................................... 61

4.3.1 Difficulties that were resolved by combining ARIMA model with DWT ............. 61

4.3.2 Use of DWT in proposed hybrid model .................................................................. 63

4.3.3 Use of ARIMA model in proposed work ................................................................ 64

4.3.4 DWT and ARIMA Model ....................................................................................... 65

4.3.5 HAAR Wavelet ....................................................................................................... 67

4.3.6 The requirement of DWT in time series forecasting .............................................. 68

4.4 TO DEVELOP A HYBRID MODEL OF ARIMA AND WAVELET TRANSFORM . 71

4.5 TO ANALYZE THE PERFORMANCE AND ACCURACY OF THE PROPOSED

ALGORITHM....................................................................................................................... 72

Chapter 5: RESULTS AND ANALYSIS ............................................................................. 74

5.1 DATASET USED ........................................................................................................... 75

5.2 RESULTS AND DISCUSSION ..................................................................................... 77

5.2.1 Prediction using ARIMA Model ........................................................................... 114

5.2.2 Prediction using ARIMA with DWT .................................................................... 120

5.2.3 Prediction using the Proposed Hybrid Model ....................................................... 125

5.2.4 Computed parameters ........................................................................................... 134

Chapter 6: CONCLUSION AND FUTURE SCOPE....................................................... 138

6.1 CONCLUSION ............................................................................................................. 138

6.2 FUTURE SCOPE.......................................................................................................... 140

RESEARCH PUBLICATIONS ........................................................................................... 141

REFERENCES ....................................................................................................................... 142

LIST OF FIGURES

Figure 1:1 Child's Height vs. Age (Linear Time Series Data) [5] ............................................. 3

Figure 1:2 Sea Temperature vs. year (non-linear) ..................................................................... 3

Figure 1:3 An example of Periodic TSD [7] .............................................................................. 4

Figure 1:4 Stock Market Prediction as non-periodic Time Series Data .................................... 5

Figure 1:5 Gaussian distribution example ................................................................................. 5

Figure 1:6 Non- Gaussian Distribution example ....................................................................... 6

Figure 1:7 Low Volatile Time Series Data ................................................................................ 6

Figure 1:8 Highly volatile Time Series Data ............................................................................. 7

Figure 1:9 Cuckoo bird ............................................................................................................ 12

Figure 1:10 Representation of a nest solution in the Cuckoo search algorithm ...................... 13

Figure 1:11 Flow chart of Cuckoo search optimization algorithm [21] .................................. 14

Figure 1:12 Model of ANN [25] .............................................................................................. 17

Figure 1:13 layered architecture of Feedforward network [26] ............................................... 18

Figure 1:14 Feedback network [26] ......................................................................................... 19

Figure 1:15 Time Series decomposition using MA Filter [29] ................................................ 20

Figure 1:16 Wavelet decomposition [33] ................................................................................ 23

Figure 1:17 Supervised learning [36] ....................................................................................... 24

Figure 1:18 Supervised learning process [37] ......................................................................... 25

Figure 1:19 Unsupervised learning [38] .................................................................................. 25

Figure 2:1 Spatial electric load forecasting methods ............................................................... 36

Figure 3.3:1 Proposed Work .................................................................................................... 48

Figure 4.4:1 Horizontal Wavelet transforms [93] .................................................................... 53

Figure 4.4:2 Vertical Wavelet transforms for horizontal wavelets [93] ................................. 53

Figure 4.4:3 1, 2 and 3-level Discrete Wavelet Decompositions [93] ................................... 54

Figure 4.4:4 (a) Stationary and (b) Non-stationary series........................................................ 59

Figure 4.4:5 DWT and ARIMA Model ................................................................................... 65

Figure 4.4:6 Discrete Wavelet Transform Output ................................................................... 66

Figure 5:1 Dataset consumed electricity from January 2013 to December 2017 .................... 74

Figure 5:2 Data panel ............................................................................................................... 78

Figure 5:3 Upload Data ............................................................................................................ 79

Figure 5:4 Convert to stationary .............................................................................................. 80

Figure 5:5 Generated hypothesis ............................................................................................. 81

Figure 5:6 Original consumed and predicted electricity using ARIMA .................................. 82

Figure 5:7 Next Predicted electricity using ARIMA ............................................................... 83

Figure 5:8 Data panel of DWT - ARIMA ................................................................................ 83

Figure 5:9 Data uploading again .............................................................................................. 84

Figure 5:10 Decomposition of data using Haar wavelet .......................................................... 85

Figure 5:11 Decomposed (LL) data converted into stationary ................................................ 86

Figure 5:12 Generated hypothesis of LL decomposed datasets............................................... 87

Figure 5:13 Original consumed and predicted electricity using LL ARIMA .......................... 88

Figure 5:14 Decomposed (LH) data converted into stationary ................................................ 88

Figure 5:15 Generated hypothesis of LH decomposed datasets .............................................. 89

Figure 5:16 Original consumed and predicted electricity using LH ARIMA ......................... 90

Figure 5:17 Decomposed (HL) data converted into stationary ................................................ 90

Figure 5:18 Generated hypothesis of HL decomposed datasets .............................................. 91

Figure 5:19 Original consumed and predicted electricity using HL ARIMA ......................... 92

Figure 5:20 Decomposed (HH) data converted into stationary ............................................... 92

Figure 5:21 Generated hypothesis of HH decomposed datasets .............................................. 93

Figure 5:22 Original consumed and predicted electricity using HH ARIMA ......................... 94

Figure 5:23 Inverse-DWT ........................................................................................................ 94

Figure 5:24 Data Panel for Daubechies Wavelet ..................................................................... 95

Figure 5:25 Segmentation done using Daubechies Wavelet Transform .................................. 96

Figure 5:26 Differencing Applied on non-stationary data using Daubechies for LL .............. 96

Figure 5:27 Differencing Applied on non-stationary data using Daubechies

for LH....................................................................................................................................... 97

Figure 5:28 Differencing Applied on non-stationary data using Daubechies for

HL ............................................................................................................................................ 97

Figure 5:29 Differencing Applied on non-stationary data using Daubechies for HH ............. 98

Figure 5:30 LL-Autocorrelation plot for Daubechies .............................................................. 99

Figure 5:31 LH-Autocorrelation plot for Daubechies ........................................................... 100

Figure 5:32 HL-Autocorrelation plot for Daubechies ........................................................... 101

Figure 5:33 HH-Autocorrelation plot for Daubechies ........................................................... 101

Figure 5:34 ARIMA applied on LL segment of Daubechies for segment-based predictions 102

Figure 5:35 ARIMA applied on LH segment of Daubechies for segment-based predictions

................................................................................................................................................ 102

Figure 5:36 ARIMA applied on HL segment of Daubechies for segment-based predictions

................................................................................................................................................ 103

Figure 5:37 ARIMA applied on HH segment of Daubechies for segment-based predictions 103

Figure 5:38 Segmentation of Time Series Data using HAAR decomposition fed to cuckoo

search and further to NN ........................................................................................................ 104

Figure 5:39 Training Structure of ANN................................................................................. 106

Figure 5:40 Performance........................................................................................................ 107

Figure 5:41 Training State ...................................................................................................... 108

Figure 5:42 Regression .......................................................................................................... 109

Figure 5:43 Differencing Applied on non-stationary LL data segment of Cuckoo-NN ........ 110

Figure 5:44 Differencing Applied on non-stationary LH data segment of Cuckoo-NN ....... 110

Figure 5:45 Differencing Applied on non-stationary HL data segment of Cuckoo-NN ....... 111

Figure 5:46 Differencing Applied on non-stationary HH data segment of Cuckoo-NN ....... 112

Figure 5:47 ARIMA applied on Cuckoo-NN optimized LL segment for segment based

predictions .............................................................................................................................. 112

Figure 5:48 ARIMA applied on Cuckoo-NN optimized LH segment for segment based

predictions .............................................................................................................................. 113

Figure 5:49 ARIMA applied on Cuckoo-NN optimized HL segment for segment-based

predictions .............................................................................................................................. 113

Figure 5:50 ARIMA applied on Cuckoo-NN optimized HH segment for segment based

predictions .............................................................................................................................. 114

Figure 5:51 Electricity consumption Actual and Predicted using ARIMA of the Year 2013 115

Figure 5:52 Electricity consumption original and Predicted using ARIMA of the Year 2014

................................................................................................................................................ 116

Figure 5:53 Electricity consumption actual and predicted using ARIMA for Year 2015 .... 117

Figure 5:54 Electricity consumption Actual and Predicted using ARIMA for year 2016 ..... 118


................................................................................................................................................ 119

Figure 5:56 Electricity consumption original and Predicted using ARIMA with DWT of the

Year 2013 ............................................................................................................................... 120

Figure 5:57 Electricity consumption Actual and Predicted using ARIMA with DWT for the

Year 2014 ............................................................................................................................... 121

Figure 5:58 Electricity consumption Actual and Predicted using ARIMA with DWT for the

Year 2015 ............................................................................................................................... 123

Figure 5:59 Electricity consumption Actual and Predicted using ARIMA with DWT for the

Year 2016 ............................................................................................................................... 124

Figure 5:60 Electricity consumption original and Predicted using ARIMA with DWT for the

Year 2017 ............................................................................................................................... 125

Figure 5:61 Actual and Predicted values of electricity consumption using Proposed Hybrid

Model for the Year 2013. ....................................................................................................... 127

Figure 5:62 Actual and Predicted values of electricity consumption using Proposed Hybrid

Model for the Year 2014. ....................................................................................................... 128

Figure 5:63 Electricity consumption Actual and Predicted using Proposed Hybrid Model for

the Year 2015 ......................................................................................................................... 129

Figure 5:64 Electricity consumption Actual and Predicted using Proposed Hybrid Model for

the Year 2016 ......................................................................................................................... 130

Figure 5:65 Electricity consumption original and Predicted using Proposed Hybrid Model of

the Year 2017 ......................................................................................................................... 131

Figure 5:66 Overall comparison of electricity consumption prediction of ARIMA +DWT,

Hybrid proposed model with the original dataset .................................................................. 132

Figure 5:67 Computed MAP for ARIMA, ARIMA with DWT and Proposed Hybrid Model

................................................................................................................................................ 135

Figure 5:68 Computed MAPE for ARIMA, ARIMA with DWT and Proposed Hybrid Model

................................................................................................................................................ 135

Figure 5:69 Computed Accuracy (%) for ARIMA, ARIMA with DWT, ARIMA with HAAR

and Proposed Hybrid Model .................................................................................................. 136

LIST OF TABLES

Table 5.1 Dataset Used ............................................................................................................ 75

Table 5.2 Original and predicted electricity consumption using ARIMA ............................. 114

Table 5.3 Actual and predicted electricity consumption using ARIMA ............................... 116

Table 5.4 Original and predicted electricity consumption using ARIMA of the year 2015 .. 117

Table 5.5 Original and predicted electricity consumption using ARIMA for year 2016 ...... 118

Table 5.6 Actual and predicted electricity consumption using ARIMA of the year 2017 .... 119

Table 5.7 Original and predicted electricity consumption using ARIMA with DWT for year

2013........................................................................................................................................ 120

Table 5.8 Actual and predicted electricity consumption using ARIMA and DWT for year

2014........................................................................................................................................ 121

Table 5.9 Actual and predicted electricity consumption using ARIMA with DWT of the year

2015........................................................................................................................................ 122

Table 5.10 Actual and predicted electricity consumption using ARIMA with DWT for year

2016........................................................................................................................................ 123

Table 5.11 Actual and Predicted electricity consumption using ARIMA with DWT for the

year 2017 ................................................................................................................................ 125

Table 5.12 Actual and predicted electricity consumption using Proposed Hybrid Model for

the year 2013 .......................................................................................................................... 126

Table 5.13 Original and predicted electricity consumption using Proposed Hybrid Model for

the year 2014 .......................................................................................................................... 127


the year 2015 .......................................................................................................................... 128


the year 2016 .......................................................................................................................... 130


the year 2017 .......................................................................................................................... 131


Table 5.17 Computed MAP for ARIMA, ARIMA with DWT and Proposed Hybrid Model ..... 134

Table 5.18 Computed MAPE for ARIMA, ARIMA with DWT and Proposed Hybrid Model

................................................................................................................................................ 135

Table 5.19 Computed Accuracy for Different Combinations ................................................ 136


ABBREVIATIONS

GUI - Graphical User Interface

RAM - Random Access Memory

MATLAB - Matrix Laboratory

ARIMA - Auto Regressive Integrated Moving Average

DWT - Discrete Wavelet Transformation

CSA - Cuckoo Search Algorithm

ANN - Artificial Neural Network


CHAPTER 1: INTRODUCTION

Many organizations and research institutions rely on forecasting in their daily routine. Prediction means estimating the future behavior of a variable based on its past behavior. As the demand for electricity increases, forecasting the demand and supply of electricity takes on special significance. Forecasts can be made based on the relationship between electricity consumption and its identified variables, and various methods have been utilized for this purpose. In 2000, the World Energy Outlook (WEO) predicted that electricity-based projects would consume half of the total energy requirements in the world and that electricity demand would grow at a rate of 5.4% per year from 1997 to 2020, which is faster than the 4.9% growth rate assumed for the GDP of India. Generally, a strongly positive correlation is observed between a country's GDP growth and its consumption of electricity. In India, this correlation has declined, as shown by the elasticity of electricity consumption relative to GDP, which has fallen over the years. This is a good indication that the efficiency of electricity consumption is increasing. During the first five-year plan period, electricity consumption increased by 3.14%, while in the eighth plan period, it only increased by 0.97% of GDP [1].

1.1 BACKGROUND OF THE RESEARCH

Energy is a central part of the socio-economic and political world in which we live. In planning for the future, full attention must be paid to the energy factor, because the energy system strongly affects the economic system, as stated by Hutber in 1984. Ebohon (1996) also observed that countries with similar economic characteristics show the same energy scenarios, which indicates that energy plays a crucial role in the economic development of developing countries [2].

The Indian Government makes plans every year for the country's development. In all earlier plans, the authorities aimed to minimize the gap between electricity production and consumption. To facilitate the creation and use of such a plan, the competent authorities must provide an estimate of energy consumption for the upcoming years, but none of the plans mentions a method for estimation or forecasting. As a result, there is always a massive gap between their predictions and actual consumption. The Time Series Data approach is used to predict an output that can be either volatile or non-volatile [3]. Wind speed is an example of volatile data, and environmental temperature is an example of non-volatile data. Time Series Data and its different types are described in the following section.

1.2 TIME SERIES DATA

Time Series Data is a series of numerical values that change over time. It can be represented as a table in which each entry gives the value of the series at a particular instant. Most often it is drawn as a rectangular plot in which the X-axis represents time and the Y-axis represents the Time Series Data values. Some examples of time series data are listed below:

1. Internet traffic, measured as the number of bytes transmitted per second

2. The gold price as a function of the day, since it is different every day

3. Commodity cost as a function of the day

4. Child growth (height versus age)

5. Number of babies born every second

6. Trees planted per day

Besides the examples mentioned above, electricity consumption, phone calls, temperature and stock market data are also examples of time series data. None of these time series has the same origin, and a time series varies as the underlying conditions change [4]. Some distinctive taxonomies of Time Series Data based on the nature of the values are explained below, each with an example.

1.2.1 Linear vs. Non-linear

According to the feature values of the observed data, Time Series Data can be classified as linear or nonlinear.

i. Linear

If the value of the variable changes linearly, i.e. the Y-parameter changes in direct proportion to the change in the X-parameter, the series is known as linear Time Series Data. An example of linear Time Series Data is shown in Figure 1.1.

Figure 1:1 Child's Height vs. Age (Linear Time Series Data) [5]

Figure 1.1 shows the height of a child plotted against age (1 to 10 years). The estimated and the observed height values are displayed by the dotted line and the dark black line, respectively.

ii. Nonlinear

Not all time series are linear. Over a short duration many time series can be roughly approximated as linear, but the approximation introduces significant errors. An example of non-linear Time Series Data is shown in Figure 1.2.

Figure 1:2 Sea Temperature vs. year (non-linear)

Figure source: https://www.epa.gov/climate-indicators/climate-change-indicators-sea-surface-temperature

Figure 1.2 shows an example of non-linear Time Series Data: sea temperature plotted against the year. The dotted lines represent the average value, whereas the predicted values are shown by the solid red line [6].

1.2.2 Periodic vs. Non-periodic

The values of a time series change continuously, and they may or may not recur at regular intervals of time. On this basis, time series data can be categorized as periodic or non-periodic.

i. Periodic

An example of periodic Time Series Data is shown in Figure 1.3. The graph plots sunspot numbers, which recur at regular intervals of time, and shows the seasonal dependencies in the Time Series Data. Other examples of periodic data are road traffic at a specific time of day and climate changes that follow the seasons [7].

Figure 1:3 An example of Periodic TSD [7]

ii. Non-periodic

Data that does not repeat at regular intervals of time is known as non-periodic Time Series Data. The series plotted in Figures 1.1 and 1.2 are examples of non-periodic Time Series Data. Financial and stock market data are also non-periodic Time Series Data [8].

Figure 1:4 Stock Market Prediction as non-periodic Time Series Data

Figure source: https://www.datacamp.com/community/tutorials/lstm-python-stock-market

The stock market price changes along with the date from the year 1970 to 2017 are

shown in Figure 1.4.

1.2.3 Gaussian and Non-Gaussian

Time series data whose values follow the normal (Gaussian) distribution curve are known as Gaussian Time Series Data; otherwise they are called Non-Gaussian Time Series Data [9]. Examples of Gaussian and Non-Gaussian distributions are shown in Figure 1.5 and Figure 1.6, respectively.

Figure 1:5 Gaussian distribution example


Figure 1:6 Non- Gaussian Distribution example

Figure source: https://www.quora.com/What-is-an-example-of-a-dataset-with-a-non-Gaussian-distribtion

1.2.4 Low Volatile and High volatile

If the values change gradually with respect to time, the series is known as “Low Volatile” Time Series Data; otherwise, it is known as “High Volatile” Time Series Data. Examples of low and high volatile data are shown in Figure 1.7 and Figure 1.8, respectively.

Figure 1:7 Low Volatile Time Series Data

Figure source: https://www.equitymaster.com/indian-share-markets/11/10/2017/Sensex-Finishes-Marginally-Higher-SBI-Surges-62

The crude oil price varies for the months of the years 2016 and 2017 as shown in

Figure 1.7.


Figure 1:8 Highly volatile Time Series Data

Figure source: https://economictimes.indiatimes.com/wealth/invest/7-stocks-with-high-1-year-upside-potential-valued-on-basis-of-peg-ratio/articleshow/71551414.cms

Research on the prediction of Time Series Data has become essential because reliable forecasts are needed in an enormous range of applications. A few of them are discussed below:

i. Forecasting helps internet service providers manage the available bandwidth and provide more effective solutions to users.

ii. In the agriculture sector, predicting variations in climate is helpful; for example, a rainfall forecast helps farmers check whether the weather is suitable for farming.

iii. Based on upcoming market trends, marketers can decide the right time and amount to invest for a specific purpose.

iv. When natural disasters such as tsunamis, earthquakes, cyclones and floods are predicted, early preparation can be made to limit the destruction [10].

Predictions can be grouped into the following classes according to how they are used:

1. 1-step ahead prediction: In this type of prediction, only the next single value is predicted from the presently available data, and the process is repeated until one cycle, or horizon, of prediction is complete. A typical example is a weekly prediction horizon. Taking the Time Series Data values from the 1st day to the 14th day, we can predict the electricity consumed on the 15th day. Then, using the values from the 2nd day to the 15th day, we can predict the load consumption of the 16th day. Repeating this one-step prediction over a 7-day forecasting horizon gives the predicted electricity consumption for the 7 days from day 15 to day 21.

2. N-step ahead forecast: In this type of prediction, N>1, which means that multiple values are predicted from the currently accessible data, unlike the 1-step ahead prediction discussed above. This prediction is also termed a “multi-step ahead” forecast; it can be repeated after the completion of a prediction horizon, and it is further categorized into:

(a) Direct forecast: Suppose N=2 and the prediction horizon is 5. Using the Time Series Data values of day 1 to day 5, the values for day 6 and day 7 are predicted directly. Similarly, using the Time Series Data values of the 3rd day up to the 7th day, the values for the 8th and 9th day are predicted. This is also known as “direct 2-step ahead” forecasting.

(b) Iterative forecasting: Suppose again that N=2 and the prediction horizon is 5. The Time Series Data values of day 1 to day 5 are first used to obtain the value for day 6. After that, the values from the 2nd day to the 5th day, together with the value predicted for day 6, are used to predict the value of day 7. In direct forecasting, by contrast, the predicted value of day 6 is not required to obtain the value of day 7. This process continues throughout the given prediction horizon until the forecasted value for the 10th day is generated. It is also called iterative 2-step ahead forecasting [11].
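The three prediction classes differ mainly in which values enter the input window. The following is a minimal sketch of that difference, not the forecasting model of this thesis: `mean_predictor` and `last_value_predictor` are hypothetical stand-in models, and all function names are illustrative assumptions.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[Sequence[float]], float]


def mean_predictor(window: Sequence[float]) -> float:
    """Hypothetical stand-in model: forecast the next value as the window mean."""
    return sum(window) / len(window)


def last_value_predictor(window: Sequence[float]) -> float:
    """Hypothetical stand-in model: repeat the most recent value."""
    return window[-1]


def one_step_forecasts(series: Sequence[float], window: int, horizon: int,
                       predict: Predictor = mean_predictor) -> List[float]:
    """1-step ahead: slide a window of actual values forward one day at a time.

    Window k covers days k+1 .. k+window and predicts day k+window+1, so the
    series needs at least window + horizon - 1 observed values.
    """
    return [predict(series[k:k + window]) for k in range(horizon)]


def direct_forecast(history: Sequence[float], window: int,
                    step_models: Sequence[Predictor]) -> List[float]:
    """Direct N-step: one model per step, all fed the same actual values."""
    recent = history[-window:]
    return [model(recent) for model in step_models]


def iterative_forecast(history: Sequence[float], window: int, n_steps: int,
                       predict: Predictor = mean_predictor) -> List[float]:
    """Iterative N-step: each prediction is fed back into the input window."""
    buffer = list(history[-window:])
    forecasts = []
    for _ in range(n_steps):
        nxt = predict(buffer)
        forecasts.append(nxt)
        buffer = buffer[1:] + [nxt]  # drop the oldest value, append the prediction
    return forecasts
```

With a 14-day window and a 7-day horizon, `one_step_forecasts` returns one value for each of days 15 to 21. In `iterative_forecast` the second prediction depends on the first, whereas `direct_forecast` never uses a predicted value as input.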

Components of Time Series

Time series analysis provides a set of concepts that make a dataset easier to understand. A time series can be decomposed into the following four parts:

Level: The baseline value the series would have if it were a straight line.

Trend: The increasing or decreasing behavior of the series over time. This component is optional.

Seasonality: A pattern of behavior that repeats over time. This component is optional.

Noise: The variability in the observations that cannot be explained by the model. This component is optional.

Of these components, every time series necessarily has a level, while trend, seasonality and noise are optional [12].
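As an illustration of how the four components combine, the sketch below builds a synthetic series under the assumption of an additive model (value = level + trend + seasonality + noise). The additive form, the sinusoidal seasonal pattern and all parameter names and values are assumptions made for this example only.

```python
import math
import random
from typing import Dict, List


def synth_series(n: int, level: float = 100.0, slope: float = 0.5,
                 season_amp: float = 10.0, period: int = 12,
                 noise_sd: float = 1.0, seed: int = 0) -> List[Dict[str, float]]:
    """Return n observations, each with its four additive components."""
    rng = random.Random(seed)
    rows = []
    for t in range(n):
        trend = slope * t                                            # steady increase
        seasonality = season_amp * math.sin(2 * math.pi * t / period)  # repeating pattern
        noise = rng.gauss(0.0, noise_sd)                             # unexplained variation
        rows.append({
            "level": level,
            "trend": trend,
            "seasonality": seasonality,
            "noise": noise,
            "value": level + trend + seasonality + noise,
        })
    return rows
```

Setting `slope`, `season_amp` or `noise_sd` to zero switches off the corresponding optional component, while the level is always present.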

1.2.5 Advantages of Electricity prediction

The present research work focuses on predicting the expected future consumption of electricity. Such prediction has various benefits, and some significant advantages are given below:

i. Power utility companies gain a better understanding of future load demand or electricity consumption, which enables them to plan better for the future.

ii. The risk faced by energy companies is reduced. Knowing the likely long-term load helps these businesses plan and make economically feasible decisions about future demand as well as transmission in the region.

iii. It helps determine what resources are required, such as the fuels needed to run the generating plants and the other resources needed to ensure reliable and economical power generation and distribution to customers. This is critical for short-, medium- and long-term planning.

iv. Load forecasting also helps in planning the scale, location and design of future generation projects. When utilities take into account the regions with high or low demand, they can produce electricity close to the load. This minimizes the transmission and distribution infrastructure and the resulting losses.

v. It helps in decision making and in planning power system maintenance. By knowing the demand, the company knows when to carry out maintenance so that it has the least impact on customers. For example, it may choose to do maintenance in residential areas during the day, when most people are at work and the demand is low.

1.3 TIME SERIES DATA PREDICTION MODELS

Prediction models are required to perform prediction or forecasting for any given Time Series Data. Among the various Time Series Data prediction models, those used in the present work are ARIMA and ANN.

1.3.1 Auto-Regressive Integrated Moving Average (ARIMA) Model

Time series forecasting is a scientific tool used to solve prediction problems. The ARIMA model is easy and flexible to implement, as it needs only past observations of the required variable. It is a linear modeling scheme that combines three components, AR, I and MA, which are briefly described below:

Auto Regression (AR): a model that uses the dependent relationship between an observation and a number of lagged observations.

Integrated (I): the differencing of raw observations, for example subtracting the previous observation from the current observation, in order to make the time series stationary.

Moving Average (MA): a model that uses the dependency between an observation and the residual errors of a moving average model applied to lagged observations.

The ARIMA model was first introduced by Box and Jenkins in 1976. Using MATLAB R2012a, the model proved useful for the preparation of data and the computation of the autocorrelation function (ACF) and the partial autocorrelation function (PACF). The three main components of the ARIMA model are each specified by a parameter of the model. The classical notation is ARIMA (p, d, q), in which the parameters take integer values, as described below:

p: the number of lagged observations included in the prediction model.

d: the number of times the raw observations are differenced, also known as the “degree of differencing”.

q: the size of the moving average window, also known as the “degree of moving average”.

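The ARIMA model is an extension of the ARMA model, which can be used only for stationary Time Series Data; a non-stationary series is made stationary by differencing the data points a finite number of times. Using lag polynomials, the model is written as

$$\varphi(L)\,(1-L)^{d}\,y_t = \theta(L)\,\varepsilon_t \qquad (1.1)$$

or,

$$\left(1-\sum_{i=1}^{p}\varphi_i L^{i}\right)(1-L)^{d}\,y_t = \left(1+\sum_{j=1}^{q}\theta_j L^{j}\right)\varepsilon_t \qquad (1.2)$$

Here $L$ is the lag operator ($L\,y_t = y_{t-1}$), $\varphi_i$ and $\theta_j$ are the autoregressive and moving average coefficients, and $\varepsilon_t$ is the random error at time $t$. The integers $p$, $d$ and $q$ are described below:

The parameter d controls the level of differencing; usually $d = 1$ is sufficient.

If $d = 0$, the model reduces to ARMA (p, q). ARIMA (p,0,0) is nothing but the AR(p) model, and ARIMA (0,0,q) is the MA(q) model.

ARIMA (0,1,0) means $y_t = y_{t-1} + \varepsilon_t$; this model is known as a Random walk and is useful in the case of non-stationary data [13].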
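To make the roles of d and p concrete, the sketch below differences a series d times and produces a one-step forecast from an ARIMA(1,1,0)-style model, i.e. an AR(1) fitted to the first differences. It is a simplified illustration under stated assumptions (least-squares fit, no intercept, no MA term), not the estimation procedure used in this work, and the function names are hypothetical.

```python
from typing import List, Sequence


def difference(series: Sequence[float], d: int = 1) -> List[float]:
    """Apply first-order differencing d times (the 'I' part of ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def fit_ar1(series: Sequence[float]) -> float:
    """Least-squares estimate of phi in z_t = phi * z_{t-1} + e_t (no intercept).

    Raises ZeroDivisionError if all values except the last are zero.
    """
    num = sum(a * b for a, b in zip(series, series[1:]))
    den = sum(a * a for a in series[:-1])
    return num / den


def forecast_arima_110(series: Sequence[float]) -> float:
    """One-step forecast: AR(1) on the differences, then undo the differencing."""
    diffs = difference(series, 1)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]
```

If the fitted coefficient is zero, the forecast equals the last observed value, which is the random walk case ARIMA (0,1,0).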

1.3.2 Cuckoo Search Algorithm

Cuckoo Search (CS) is a nature-inspired optimization algorithm used to find the minimum or the maximum of a problem through a suitably chosen function known as the objective function. In this research, CS is used to minimize the irregularities in the data obtained after DWT. In other words, CS further refines the data so that the accuracy of the electricity consumption forecasting method can be improved. The algorithm is based on the breeding strategy of cuckoo birds [14].

Cuckoo Search was originally proposed by Yang and Deb in 2009. Its concepts are based on the cuckoo (figure 1.9). Cuckoos are fascinating birds because of their beautiful sound and their aggressive reproduction strategy, in which adult cuckoos lay their eggs in the nests of other species, the host birds.

Figure 1:9 Cuckoo bird

Image source: https://www.slideshare.net/AnujaJoshi6/cuckoo-o-ptimization-ppt

In the algorithm, cuckoos exist mainly in two forms, i.e. adult cuckoos and eggs. After the fertilization of eggs, a group is formed. Environmental characteristics, together with the migration of groups or cuckoo families, help the cuckoos converge on the best environment for reproduction and survival.

CS is based on three idealized rules:

1. Each cuckoo lays only one egg at a time.

2. The nests that contain better quality eggs are carried over to produce the next generation.

3. The number of host nests is fixed, and an egg laid by a cuckoo is discovered by the host bird with probability Pa ∈ [0,1].

Based on these three rules, a discovered egg is either thrown out of the nest, or the nest is abandoned and a new nest is built [15].

1.3.3 Cuckoos behavior for egg-laying

Each cuckoo starts by laying its eggs randomly in the nests of other host birds within its Egg Laying Radius (ELR). After all the eggs have been laid in the host birds' nests, the eggs that look least like the host bird's own eggs are detected by the host birds and thrown out of the nest. In this way, about 10% of the eggs are destroyed after each egg-laying cycle and have no chance to grow [16]. When the remaining eggs hatch, the cuckoo chicks eat the major portion of the food the host bird brings to the nest. After a while the host bird's own chicks starve to death, leaving only the cuckoo chicks in the nest.

Figure 1:10 Representation of a nest solution in the Cuckoo search algorithm

Image source: https://www.scielo.org.za/scielo.php?script=sci_arttext&pid=S2224-78902013000300017

Cuckoo Search has attracted a lot of attention due to its simplicity and its ability to solve a wide range of optimization problems across numerous applications. The algorithm is mainly inspired by the breeding behavior of cuckoos, i.e. laying eggs whose colors and patterns mimic those of the host bird. Every egg in a nest represents a solution, and a cuckoo egg represents a new solution. The main emphasis of the algorithm is on using new and better solutions (i.e. cuckoos) to replace the worst solutions in the nests [17].

Based on the three rules of Cuckoo Search described above, the host bird either throws the discovered eggs out of the nest or simply abandons the nest and builds a completely new one. An important issue in the CS algorithm is the use of Levy flights to create new solutions,

$$x_{t+1} = x_t + s\,E_t \qquad (1.3)$$

Here $E_t$ is drawn either from the standard normal distribution, with zero mean and unit standard deviation, for random walks, or from the Levy distribution for Levy flights, and $s$ is the step size.
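The update of a solution by a random step can be sketched as follows. The Levy-distributed step is generated here with Mantegna's method, which is a common choice but an assumption of this example (the text does not state how the steps are drawn); the exponent `beta = 1.5` and the function names are likewise illustrative.

```python
import math
import random
from typing import List, Sequence


def levy_step(rng: random.Random, beta: float = 1.5) -> float:
    """Draw one heavy-tailed step using Mantegna's method (assumed here)."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def new_solution(x: Sequence[float], step_size: float, rng: random.Random,
                 use_levy: bool = True) -> List[float]:
    """Move every coordinate of x by step_size times a random draw.

    use_levy=True gives a Levy flight; use_levy=False gives a plain random
    walk with standard normal steps.
    """
    draw = (lambda: levy_step(rng)) if use_levy else (lambda: rng.gauss(0.0, 1.0))
    return [xi + step_size * draw() for xi in x]
```

Most Levy steps are small, but occasional very large steps occur, which is what lets the search leave a local region.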

[Figure 1.10 labels: cuckoo egg; host bird's egg; host bird's nest; nest entries X1, X2, ..., Xn-1, Xn; next solution]

Figure 1:11 Flow chart of Cuckoo search optimization algorithm [21]

Afterwards, the random generation of new solutions can also be associated with the similarity between the eggs of the cuckoo and the eggs of the host. The step size ‘s’ then determines the distance a random walker can cover in a given number of iterations [18]. The steps of the flow chart in figure 1.11 are:

1. Generate N host nests and assign a position to each nest.

2. Evaluate the fitness function for each nest.

3. Carry out a Levy flight to get a new nest position and evaluate its fitness.

4. Compare the fitness of the new and old nests, and keep the new nest if it is better.

5. Replace a fraction of the worst nests with new, randomly generated nests.

6. Compare the newly found nests with the worst nests discovered and save the best nest.

7. If the maximum number of iterations has been reached, take the best nest as the outcome; otherwise repeat the search.

The CS algorithm consists of two components, i.e. local search and global search. The former is designed to improve the best solution through a guided random walk, while the latter is intended to maintain population diversity through Levy flights. The switching probability ‘Pa’ regulates the balance between the two stochastic search components.

Pseudocode of Cuckoo Search Algorithm:

Begin
    Initialize a population of N nests.
    Evaluate the nests.
    While (the termination criterion is not achieved)
        Generate a new solution i randomly from among the best nests
        Select a nest j from the population
        If the quality of i is superior to that of j
            Replace j with solution i
        Replace the abandoned nests with randomly generated nests
        Evaluate the nests
    End While
End

The CS algorithm begins by initializing a random population, i.e. a set of randomly generated solutions. Each solution or nest is evaluated with the fitness function in order to find the best current solution or nest [19].

Since the process is iterative, the termination criterion is typically a predefined number of generations, a maximum allowed time, or a lack of further improvement. In every iteration a new solution is generated by a random walk or Levy flight around the best solution. If the new solution is superior to a solution chosen at random from the population, it replaces that solution. The worst solutions or nests are discarded and replaced with randomly generated solutions in proportion to the given probability ‘Pa’. To find the best solution, the entire population is then re-evaluated, and the procedure continues until the termination criterion is met. CS is applied in the electricity consumption prediction model designed for Punjab because of its high convergence speed in reaching the optimal solution [20].
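The procedure described above can be sketched in a few lines. This is a minimal illustration under several assumptions, not the implementation used in this thesis: the objective is a simple test function (`sphere`), Levy steps are drawn with Mantegna's method, a fixed iteration count is the only termination criterion, the new solution starts from a randomly chosen nest, and all parameter names and default values are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def levy_step(rng: random.Random, beta: float = 1.5) -> float:
    """Heavy-tailed random step (Mantegna's method, assumed here)."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.gauss(0.0, sigma_u) / abs(rng.gauss(0.0, 1.0)) ** (1 / beta)


def cuckoo_search(objective: Callable[[Sequence[float]], float], dim: int,
                  bounds: Tuple[float, float], n_nests: int = 15, pa: float = 0.25,
                  step_size: float = 0.01, max_iter: int = 200,
                  seed: int = 0) -> Tuple[List[float], float]:
    """Minimize objective over [low, high]^dim; returns (best nest, best fitness)."""
    rng = random.Random(seed)
    low, high = bounds

    def random_nest() -> List[float]:
        return [rng.uniform(low, high) for _ in range(dim)]

    def clip(v: float) -> float:
        return min(high, max(low, v))

    # Initialize the population of nests and evaluate them.
    nests = [random_nest() for _ in range(n_nests)]
    fitness = [objective(n) for n in nests]
    n_abandon = int(pa * n_nests)  # number of worst nests replaced per iteration

    for _ in range(max_iter):
        # Generate a new solution by a Levy flight from a randomly chosen nest.
        i = rng.randrange(n_nests)
        candidate = [clip(x + step_size * (high - low) * levy_step(rng)) for x in nests[i]]
        f_candidate = objective(candidate)
        # Compare with a randomly selected nest j and replace it if the new one is better.
        j = rng.randrange(n_nests)
        if f_candidate < fitness[j]:
            nests[j], fitness[j] = candidate, f_candidate
        # Abandon the worst fraction pa of nests and build new random ones.
        ranked = sorted(range(n_nests), key=lambda k: fitness[k])
        for k in ranked[n_nests - n_abandon:]:
            nests[k] = random_nest()
            fitness[k] = objective(nests[k])

    best = min(range(n_nests), key=lambda k: fitness[k])
    return nests[best], fitness[best]


def sphere(x: Sequence[float]) -> float:
    """Test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

A call such as `cuckoo_search(sphere, dim=2, bounds=(-5.0, 5.0))` returns the best nest found and its fitness. Because only the worst nests are abandoned, the best fitness in the population never gets worse from one iteration to the next.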


1.3.4 Artificial Neural Networks (ANNs)

Artificial neural networks (ANNs) are computationally flexible and are among the most widely applied predictors; they have been used in a variety of time series forecasting problems with improved performance. ANNs are used for forecasting in application areas related to engineering, social sciences, foreign exchange, stock problems, economics, and so on. Some characteristics of ANNs that make them attractive and valuable for forecasting tasks are as follows:

(a) the model is not based on a traditional, pre-specified modeling method

(b) the network generalizes well, which means that an ANN can accurately infer the unseen part of a population even if the data contain noisy information

(c) the forecasting model can approximate any continuous function to a suitable accuracy, and ANNs are non-linear, unlike the ARIMA model [21].

1.3.4.1 Time Series Data modeling using ANN

The ANN model is strongly influenced by the features of the data. The form of ANN most widely used for time series modeling and forecasting is the single hidden layer feed-forward network. The model is characterized by a network of three layers that are linked by acyclic connections [22]. The mathematical relationship between the inputs, i.e. $y_{t-1}, \ldots, y_{t-p}$, and the output ($y_t$) of this network is given by

$$y_t = w_0 + \sum_{j=1}^{q} w_j\, g\!\left(w_{0,j} + \sum_{i=1}^{p} w_{i,j}\, y_{t-i}\right) + \varepsilon_t, \qquad (1.7)$$

Here, $w_{i,j}$ (in which i varies from 0 to p and j varies from 1 to q) and $w_j$ (in which j varies from 0 to q) are the parameters of the model, also known as connection weights, and $\varepsilon_t$ is the error term. The terms p and q denote the number of input and hidden nodes, and $g$ is the activation function, which can take different forms. The type of activation function is determined by the position of the neuron within the network. The input layer usually has no activation function, since its job is only to pass the information to the hidden layer. The most suitable activation function in the output layer is the linear function, since a non-linear activation induces distortion in the forecast data [23]. The activation functions used in the hidden layer are the logistic and hyperbolic tangent functions, which serve as transfer functions as shown in the equations below;

$$g(x) = \frac{1}{1 + e^{-x}} \qquad (1.8)$$

$$g(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (1.9)$$

So the ANN performs a non-linear functional mapping from the previous data to the future value (output),

$$y_t = f\left(y_{t-1}, \ldots, y_{t-p}, w\right) + \varepsilon_t \qquad (1.10)$$

where $w$ is the vector of all connection weights and $f$ is the function determined by the network structure.
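The mapping from lagged inputs to the forecast in a single hidden layer feed-forward network can be sketched as below, assuming a logistic (or hyperbolic tangent) activation in the hidden layer and a linear output node as described in the text. The argument names and the layout of the weight lists are assumptions made for this example.

```python
import math
from typing import Callable, Sequence


def logistic(x: float) -> float:
    """Logistic activation, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def ann_forecast(lags: Sequence[float],
                 hidden_weights: Sequence[Sequence[float]],
                 hidden_biases: Sequence[float],
                 output_weights: Sequence[float],
                 output_bias: float,
                 g: Callable[[float], float] = logistic) -> float:
    """Forward pass of a p-input, q-hidden-node, one-output network.

    lags              : the p inputs [y_{t-1}, ..., y_{t-p}]
    hidden_weights[j] : the p weights feeding hidden node j
    hidden_biases[j]  : bias of hidden node j
    output_weights[j] : weight from hidden node j to the output
    The output node is linear, so no activation is applied to the final sum.
    """
    hidden = [g(b + sum(w * y for w, y in zip(ws, lags)))
              for ws, b in zip(hidden_weights, hidden_biases)]
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))
```

Passing `g=math.tanh` switches the hidden layer to the hyperbolic tangent activation without changing anything else.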

An ANN is built from hundreds of units, called artificial neurons or processing elements, which are connected by weights; together they form the neural structure and are arranged in layers [24]. The mathematical model of the ANN is shown below in figure 1.12.

Figure 1:12 Model of ANN [25]

The processing unit processes the input data according to a weighted function. It forms a mathematical equation that relates the input data to the output data. As shown in figure 1.12, each input signal is multiplied by its weight value, and the products are added to give the output of that specific neuron. The sigmoid function is the most commonly used activation function and is applied to the weighted sum of the neuron's inputs [25]. The result is passed through the transfer function to the next layer; the layered structure of the ANN is shown in figure 1.13.

[Figure 1.12 labels: inputs (X1, X2, ...); weights (W1, ...); output (Y)]

Figure 1:13 layered architecture of Feedforward network [26]

i. Input layer: The input data, such as the optimized Time Series Data, is passed to this layer. The nodes of this layer are not involved in modifying the signal; they only forward the data to the next layer, i.e. the hidden layer.

ii. Hidden layer: The hidden layer contains multiple neurons. The nodes of this layer modify the signal, so they are called active nodes, and every active node takes an active part in the modeling.

iii. Output layer: The signals modified by the neurons arrive at this layer, which represents the achieved output.

The way neurons are linked with each other affects the operation of the ANN. A neuron, whether real or artificial, receives either excitatory or inhibitory input. Excitatory inputs contribute to the addition operation, and inhibitory inputs to the subtraction operation [26]. In a feedback network, there is in addition a connection path from the output layer going back to the input layer; the feedback architecture is shown in figure 1.14.

[Figure 1.13 labels: inputs X1, X2, X3, X4; outputs Y1, Y2; input layer, hidden layer, output layer]

Figure 1:14 Feedback network [26]

1.3.4.2 Estimation of the model parameter

The parameters of an ANN enter the model through non-linear equations, so they cannot be optimized and computed individually. Therefore, the parameters are estimated by training algorithms that take a sequence of data, the training data, as input to the ANN. The ANN is trained on these data until the generated error is minimum, after which it can predict the future output. The error is minimized with respect to the Mean Squared Error (MSE). The weights and bias values continue to change during the training period until the fit to the Time Series Data values no longer improves.

1.3.4.3 Model validation and prediction

The ANN training is performed iteratively, which means that as the number of iterations increases, the MSE is reduced. Once the validation error has been calculated and the validation of the model is complete, the model is used for Time Series Data prediction [27].
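The iterative reduction of the MSE during training can be illustrated with the simplest possible case: a single linear neuron trained by gradient descent. This is a deliberately reduced sketch, not the network or training algorithm used in this work; the learning rate, the number of epochs and the function names are assumptions.

```python
from typing import List, Sequence, Tuple


def mse(w: float, b: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Mean squared error of the predictions w * x + b against the targets."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_linear_neuron(xs: Sequence[float], ys: Sequence[float],
                        lr: float = 0.05,
                        epochs: int = 2000) -> Tuple[float, float, List[float]]:
    """Adjust the weight and bias by gradient descent on the training MSE.

    Returns the final weight, the final bias and the MSE recorded at the
    start of every epoch.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    history = []
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        history.append(sum(e * e for e in errors) / n)
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w  # step against the gradient of the MSE
        b -= lr * grad_b
    return w, b, history
```

After training, `mse` can be evaluated on held-out pairs that were not used for training; this validation error is what decides whether the model is accepted for prediction.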

1.4 DECOMPOSITION BASED PREDICTION MODELS

In Time Series Data prediction, pre-processing helps to improve the efficiency of forecasting. Several kinds of pre-processing methods are available; among them, decomposition has been found most suitable and is applied here. Decomposition means splitting the original Time Series Data into multiple components. The splitting is based on some behavior of each component, and filtering is the most commonly used approach [28]. In this research work, MA filter based decomposition and wavelet-based decomposition have been applied, as described below.

1.4.1 Decomposition based on Moving average (MA)

When a non-seasonal time series is decomposed using a Moving Average (MA), it is split into an averaged or smoothed component, known as the trend, and a noise or residual component, known as the detrended component.

[Figure 1.15 flow chart labels: Time Series Data; fix MA filter of length m; decomposition using MA filter; subtractor; trend component; residual component; decision "Decomposition criterion met?" (yes/no)]

Figure 1:15 Time Series decomposition using MA Filter [29]

This technique is represented in figure 1.15. The trend component $s_t$ and the noise component $r_t$ are expressed in equations (1.11) and (1.12), respectively. The length $m$ of the MA filter can be selected to satisfy a certain decomposition criterion for a specific work [31].

$$s_t = \frac{1}{m}\sum_{i=t-m+1}^{t} y_i \qquad (1.11)$$

$$r_t = y_t - s_t \qquad (1.12)$$
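A short sketch of the MA filter based decomposition is given below. It assumes a trailing window, i.e. the trend at time t is the mean of the m most recent observations up to and including t; a centered window would be an equally valid choice, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def ma_decompose(series: Sequence[float], m: int) -> Tuple[List[float], List[float]]:
    """Split a series into a trend (moving average) and a residual.

    Both outputs start at index m - 1 of the input, because the first
    m - 1 points do not have a full window behind them.
    """
    trend, residual = [], []
    for t in range(m - 1, len(series)):
        s_t = sum(series[t - m + 1:t + 1]) / m  # mean of the last m values
        trend.append(s_t)
        residual.append(series[t] - s_t)        # what the trend does not explain
    return trend, residual
```

Adding the two outputs element by element gives back the original observations from index m - 1 onwards.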

1.4.2 Discrete Wavelet Transform (DWT)

DWT is one of the most commonly used decomposition tools; it helps researchers obtain meaningful information about the underlying signal in terms of both time and frequency. This approach resolves the localization problem that exists in the Fourier transform. It is a mathematical tool that transforms the input into a different domain for signal processing and analysis. Data represented in the time and frequency domains are suitable for predicting processes that are non-stationary or have a time-varying autocorrelation function. Weather and financial data are both non-stationary, as their values change over time [30]. DWT is an appropriate tool for predicting such data, and the wavelet function is written in equation 1.13.

$$\psi_{c,d}(t) = \frac{1}{\sqrt{|c|}}\,\psi\!\left(\frac{t-d}{c}\right) \qquad (1.13)$$

In Equation 1.13, ‘c’ denotes the scaling factor, which determines how much the wavelet is compressed or stretched. ‘d’ is known as the translation parameter and determines the location of the wavelet along the time series. When the condition | c | < 1 is satisfied, the wavelet is a compressed version of $\psi(t)$ and corresponds to higher frequencies (more cycles per unit time). On the other hand, | c | > 1 means that the width of the generated wavelet $\psi_{c,d}(t)$ is larger than that of $\psi(t)$, which corresponds to low frequencies. DWT is therefore an essential tool for evaluating a time series using wavelets at discrete scales and positions. It relies mainly on sub-band coding and computes the transform in less time, which makes the DWT technique simple to use. The scaling parameter ‘c’ is expressed as $2^{-p}$ and the translation parameter ‘d’ as $l\,2^{-p}$, where $l$ is an integer, for the original dataset (Ap) [31]. The DWT function can be signified by equation (1.14).

$$\psi_{p,l}(t) = 2^{p/2}\,\psi\!\left(2^{p}t - l\right) \qquad (1.14)$$

$$W(p,l) = 2^{p/2}\sum_{t} y_t\,\psi\!\left(2^{p}t - l\right) \qquad (1.14)$$
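To make the idea of sub-band coding concrete, the following sketch computes one level of a DWT with the Haar wavelet, the simplest of the wavelet filters. It is only an illustration under stated assumptions: the function names `haar_dwt` and `haar_idwt` are hypothetical, the series length is assumed to be even, and the sample values are made up.

```python
import math


def haar_dwt(y):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(y) % 2:
        raise ValueError("series length must be even for this sketch")
    s = math.sqrt(2.0)
    # Low-pass branch: scaled pairwise sums (low-frequency content).
    approx = [(y[i] + y[i + 1]) / s for i in range(0, len(y), 2)]
    # High-pass branch: scaled pairwise differences (high-frequency content).
    detail = [(y[i] - y[i + 1]) / s for i in range(0, len(y), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar level, recovering the original samples."""
    s = math.sqrt(2.0)
    y = []
    for a, d in zip(approx, detail):
        y.extend([(a + d) / s, (a - d) / s])
    return y


# Illustrative values, not data from this study.
series = [4.0, 6.0, 10.0, 12.0]
approx, detail = haar_dwt(series)
restored = haar_idwt(approx, detail)
```

Each level halves the number of coefficients per branch, and the inverse transform recovers the input, so no information is lost by the decomposition.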

1.4.3 Trend-ARIMA model

A composite method that uses the MA filter based decomposition is applied to pre-process the data and remove artifacts, so that the ARIMA model can fit the data more easily. The main steps of this model are given below:

Decomposition: The decomposition technique based on MA filtering is applied to the dataset. The trend component (st) and the noise component (rt) are obtained, and yt = st + rt is the original Time Series Data.

ARIMA modeling: After the trend and noise components are obtained, the ARIMA model is applied to each decomposed component. The model need not have the same parameter values in both cases, i.e. for the trend component and the noise component.

Predictions: The ARIMA model fitted to the trend component gives the prediction st,pre, and the model fitted to the noise component gives the prediction rt,pre. The final prediction is obtained by adding the trend and noise predictions. This prediction achieves higher accuracy than the basic ARIMA model [32].

$$y_{t,\mathrm{pre}} = s_{t,\mathrm{pre}} + r_{t,\mathrm{pre}} \qquad (1.15)$$
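The three steps above can be sketched end to end as follows. This is a hedged illustration only: to keep the example self-contained, a least-squares AR(1) fit stands in for a full ARIMA model, and all function names (`ma_decompose`, `fit_ar1`, `forecast_ar1`, `trend_model_forecast`) and the sample values are assumptions, not part of this work.

```python
def ma_decompose(y, m):
    """Trailing moving-average trend and residual (hypothetical helper)."""
    trend = []
    for t in range(len(y)):
        window = y[max(0, t - m + 1):t + 1]
        trend.append(sum(window) / len(window))
    residual = [y_t - s_t for y_t, s_t in zip(y, trend)]
    return trend, residual


def fit_ar1(x):
    """Least-squares fit of x[t] = a + b * x[t-1]; a stand-in for ARIMA."""
    if len(x) < 2:
        raise ValueError("need at least two observations")
    prev, curr = x[:-1], x[1:]
    mean_prev = sum(prev) / len(prev)
    mean_curr = sum(curr) / len(curr)
    var = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    b = 0.0 if var == 0 else cov / var
    a = mean_curr - b * mean_prev
    return a, b


def forecast_ar1(x, steps):
    """Iterate the fitted AR(1) recursion forward from the last observation."""
    a, b = fit_ar1(x)
    out, last = [], x[-1]
    for _ in range(steps):
        last = a + b * last
        out.append(last)
    return out


def trend_model_forecast(y, m, steps):
    """Decompose, forecast each component separately, then add (eq. 1.15)."""
    trend, residual = ma_decompose(y, m)
    s_pre = forecast_ar1(trend, steps)
    r_pre = forecast_ar1(residual, steps)
    return [s + r for s, r in zip(s_pre, r_pre)]


# Illustrative load values, not data from this study.
load = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 17.0]
prediction = trend_model_forecast(load, m=3, steps=3)
```

The component models are fitted independently, which reflects the point made above that the trend and noise components need not share the same parameter values.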

1.4.4 Wavelet-ARIMA model

This model is also a composite prediction model, in which the decomposition technique is based on wavelets. The decomposition serves as pre-processing, so that the components fit the ARIMA model better and give more accurate forecasting. In place of the moving average (MA) filter, the available wavelet filters are HAAR, db1, db2, db3, db4 and db5. The data is decomposed into approximation and detail components. More than one level of decomposition is available, as shown in figure 1.16; at the initial level, the Time Series Data is filtered by applying any one of the discussed wavelet filters [33].

Figure 1:16 Wavelet decomposition [33]

As shown in Figure 1.16, the low and high frequency components are represented by ya1 and yd1 respectively. The low frequency component ya1 is again split into low and high frequency components, represented by ya2 and yd2 respectively. After these two levels of decomposition, the final parts are ya2, yd1 and yd2, where ya2 is the final approximation component and yd1 and yd2 are the detail components. The series could be decomposed to further levels according to the type of Time Series Data [34]. The essential points of the Wavelet based ARIMA model are discussed below:

Decomposition: Using wavelet-based decomposition, the dataset is decomposed into the required number of levels, depending on the input Time Series Data.

ARIMA modeling: ARIMA modeling is applied to every decomposed output, i.e. ya2, yd1 and yd2, so that the best fit can be obtained for each.

Prediction: The prediction of the original data is calculated by adding the predictions of all the decomposed components, as given in the following equation, and the performance is improved relative to the basic ARIMA model.

$$y_{t,\mathrm{pre}} = y_{a2,\mathrm{pre}} + y_{d1,\mathrm{pre}} + y_{d2,\mathrm{pre}} \qquad (1.16)$$

This form of decomposition is an extended version of the trend based ARIMA model. Here the approximation obtained during the decomposition of the dataset represents the low-frequency portion of the data, which corresponds to the trend, while the noise corresponds to the high-frequency components. The trend and residual components are close to the first level of wavelet decomposition, and further levels of decomposition may be applied to improve the accuracy of the method [35].
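The two-level decomposition described above can be illustrated with the Haar filter. In this sketch each sub-band is reconstructed back to the time domain on its own, so that the three components add up to the original series and can each be forecast separately before summing. The function names, the choice of the Haar filter, the requirement that the length be a multiple of four, and the sample values are assumptions made for the example.

```python
import math

S = math.sqrt(2.0)


def analyze(y):
    """One Haar analysis step: (approximation, detail) coefficients."""
    a = [(y[i] + y[i + 1]) / S for i in range(0, len(y), 2)]
    d = [(y[i] - y[i + 1]) / S for i in range(0, len(y), 2)]
    return a, d


def synthesize(a, d):
    """One Haar synthesis step, the inverse of analyze."""
    y = []
    for a_i, d_i in zip(a, d):
        y.extend([(a_i + d_i) / S, (a_i - d_i) / S])
    return y


def two_level_components(y):
    """Time-domain components (ya2, yd2, yd1) with y = ya2 + yd2 + yd1."""
    if len(y) % 4:
        raise ValueError("length must be a multiple of 4 for two Haar levels")
    a1, d1 = analyze(y)    # level 1: ya1 and yd1
    a2, d2 = analyze(a1)   # level 2: ya1 split into ya2 and yd2
    zeros1 = [0.0] * len(d1)
    zeros2 = [0.0] * len(d2)
    # Reconstruct each sub-band alone by zeroing the other coefficients.
    ya2 = synthesize(synthesize(a2, zeros2), zeros1)
    yd2 = synthesize(synthesize(zeros2, d2), zeros1)
    yd1 = synthesize([0.0] * len(a1), d1)
    return ya2, yd2, yd1


# Illustrative values, not data from this study.
series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
ya2, yd2, yd1 = two_level_components(series)
total = [a + b + c for a, b, c in zip(ya2, yd2, yd1)]
```

With the Haar filter, ya2 is simply the average over each block of four samples, which shows why the approximation carries the slowly varying part of the series.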

1.5 CLASSIFICATION TECHNIQUES

Vast amounts of data are being collected and stored in databases across the globe. Classification is the technique used for finding the classes of unknown data. Classification algorithms follow three different learning approaches, namely the supervised, unsupervised and semi-supervised classification approaches.

1.5.1 Supervised classification approach

Supervised learning is the machine learning task of inferring a function from labeled training data, which consist of a set of training examples. In this method of learning, every example is a pair of an input object and the value of the required outcome. The algorithm analyzes the training data and generates an inferred function that can be used to map new examples. The main aim of supervised machine learning is to construct a model that makes predictions based on evidence in the presence of uncertainty. Computers learn from observations by identifying patterns in the data through adaptive algorithms [36]. As more observations are presented, the predictive performance of the computer improves.

[Figure: (a) Known data, Known responses, Model; (b) Model, New data, Predicted Responses]

Figure 1:17 Supervised learning [36]

In this type of learning mechanism, the algorithm works with examples whose class labels are already known. The desired response of the system is available at the moment the data is applied. Each time the system predicts the function, the outcome is checked against the known case. The system compares its predictions with the known outcomes and learns from its errors, and the weights are adjusted in such a manner that the gap between the desired output and the actual output is reduced. For regression tasks, the values used can be categorical or numerical.
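The error-driven weight adjustment described above can be sketched with a single linear unit trained by the delta rule. This is a toy illustration under stated assumptions: the function name `train_linear`, the learning rate, the number of epochs and the labeled pairs are all chosen for the example and are not taken from this work.

```python
def train_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by repeatedly correcting the prediction error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            error = y - (w * x + b)  # desired output minus actual output
            w += lr * error * x      # move the weight to shrink the error
            b += lr * error
    return w, b


# Labeled examples (input, known response) generated from y = 2x + 1.
inputs = [0.0, 1.0, 2.0, 3.0]
labels = [1.0, 3.0, 5.0, 7.0]
w, b = train_linear(inputs, labels)
prediction_for_new_input = w * 4.0 + b
```

After training, the model maps an unseen input to a predicted response, which corresponds to panel (b) of the supervised learning figure.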

[Figure: Input, Training data, Learning system, Actual Output, Desired Output, Σ (+, -), Error signal]

Figure 1:18 Supervised learning process [37]

In this learning strategy, a mapping is learned from input to output, and the correct output values, i.e. the known labels, are provided by a supervisor [36-37].

1.5.2 Unsupervised learning

In this process the desired output is unknown; therefore, unlike in the supervised learning mechanism, an error signal cannot be used to improve the behavior of the network. The model learns from observations of the actual inputs alone, as there is no knowledge of the correctness or incorrectness of the response.

[Figure: Neural Network, X (Input), Y (Actual output)]

Figure 1:19 Unsupervised learning [38]

As shown in Figure 1.19, no feedback is used in this learning process. The neurons themselves extract the patterns and features from the input data and find the relationships among the data points. Learning is carried out on unlabeled instances of data: no supervisor is provided and only input data is given. The main aim of this learning scheme is to find the regularities among the inputs and determine the organization of the data. One example of unsupervised learning is density estimation. A primary method of density estimation is clustering, which is used to compute the clusters or groupings of inputs [38].
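Clustering of unlabeled inputs can be illustrated with a small k-means sketch on one-dimensional data. The function name `kmeans_1d`, the initial centers, the fixed iteration count and the sample values are assumptions chosen for the example; no labels are used at any point.

```python
def kmeans_1d(values, centers, iterations=20):
    """Group values around the nearest center, then re-average the centers."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Assign each input to the closest current center.
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Unlabeled illustrative readings forming two obvious groups.
readings = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, groups = kmeans_1d(readings, centers=[0.0, 6.0])
```

The algorithm discovers the two groups from the regularities in the inputs alone, without any desired output or error signal.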

1.5.3 Semi-supervised classification approach

The semi-supervised learning approach uses labeled (annotated) as well as unlabeled data, unlike supervised learning (in which all data is labeled) and unsupervised learning (in which all data is unlabeled). The algorithm is given a small amount of labeled data and a considerable amount of unlabeled data. In supervised machine learning, obtaining labeled data is a challenging task because only labeled data (feature and label pairs) can be used. Obtaining labeled data is time-consuming as well as expensive, since it requires experienced human annotators. Unlabeled data, in contrast, is easy to obtain, but there are only a few ways to use it. The semi-supervised learning mechanism is used to solve this problem: combining a considerable amount of unlabeled data with labeled data improves learning accuracy and yields better classifiers. These learning mechanisms are of great use, as they offer high accuracy along with reduced human labor, and they are available in two forms, inductive and transductive. In transductive learning, the algorithm works only on the labeled and unlabeled training data and cannot handle unseen data [39-41].

1.6 PROBLEM STATEMENT

Punjab has been a state of capitalization and growth over the last decade. Considerable technical growth, capitalization and industrialization have been observed. Due to this increasing technical and commercial growth, electricity consumption has increased dramatically. The Punjab government has established a unit, termed “Punjab State Power Corporation Limited” (PSPCL), to look after power demand and to monitor consumption in Punjab. PSPCL has many cities, followed by tehsils, to monitor. Some cities consume more electricity, which makes it difficult to manage the supply and to provide electricity to villages at the same time. Nowadays, the proper utilization and operation of electricity supply has become a major challenge in research. The economics of electricity production have changed because of the new electricity market strategy. The planning, operational strategy and management of interconnected load systems pose a number of challenging issues. The electric power industry is changing rapidly, and computing an operating strategy that satisfies electricity consumption is the most essential concern. Satisfying the demand of consumers is a great challenge. A lot of research work has been done previously to forecast the future consumption of electricity. In this research work, the ARIMA model is utilized to predict the future consumption of electricity. The decomposition mechanism, i.e. DWT, is applied to determine the minimum and maximum utilization of electricity from the previous year’s electricity data. Cuckoo Search (CS) optimization along with a Neural Network is used to obtain optimized data and classify them. To analyze the performance of the proposed work, the MAP, MAPE and accuracy parameters have been computed, and the existing algorithms, i.e. ARIMA and ARIMA with DWT, are compared with the proposed hybrid technique using ARIMA, DWT, CS and ANN.

1.7 RESEARCH OBJECTIVES

The broad objective of this research is to develop a novel hybrid forecasting model or algorithm to forecast the electricity consumption of Punjab, India with high accuracy. There exist more than 50 algorithms in the Swarm Intelligence category. It is very complicated to develop a new algorithm under this architecture, and hence in most standard papers a new behavior is developed or evaluated within the given category. In the present study, a new fitness function is designed and developed for the Cuckoo Search algorithm, and the Feed Forward Back Propagation algorithm is combined with DWT and Cuckoo Search, a combination that has not been observed before. In addition to this, the way in which the proposed architecture is applied is quite different from what has been seen in previous models. The specific objectives of this research are as follows:


2. To determine how the discrete wavelet transform resolves the difficulties in ARIMA modeling.



1.8 RESEARCH GAPS

1. The accuracy of an electric load forecasting model can be improved by hybridizing a neural network with the wavelet transform [116].

2. The accuracy of an electric load forecasting model can be improved by adding more than one factor, such as temperature and humidity [117].

3. For forecasting electricity consumption, parallel random forests can be used instead of simple random forests [118].

4. Weekly clustering can be performed to improve the forecasting accuracy of the model [119].

5. The performance of Multi-Linear Regression (MLR) for forecasting electricity consumption can be improved by incorporating high spatial resolution [120].

6. The accuracy of an electric load forecasting model can be increased by developing homogeneous ensemble models based on support vector regression and evaluating them with different [121].

7. To improve the accuracy of the ARIMA model, it can be combined with the wavelet transform [122].


CHAPTER 2: PRESENT STATE OF ART

Electric power plays a critical role in economic and social development and in the improvement of the community, and thus helps to raise living standards. Energy consumption is rising in all areas of the world. Today, urban areas account for 67% of the world's overall energy consumption. Given urban complexity and future demand, accurate information based on the existing patterns of power consumption is needed. This chapter deals with the relevant works on electricity consumption and load prediction strategies [42].

2.1 HISTORY OF ELECTRICITY SUPPLY

The first power station was built by Thomas Edison in New York City and began operating in 1882. In India, the first hydroelectric installation was set up near a tea estate at Sidrapong in Darjeeling Municipality in 1897. Nowadays, the electrical utility company Power Grid Corporation of India Limited (PGCIL) is responsible for transmitting almost half of the total energy produced over its transmission network at different voltage levels. For operating and planning purposes, the Indian power system is divided into five regional power grids, i.e., south (S), north (N), east (E), west (W), and northeast (NE) [43].

In 2006, the regional power grids, apart from the southern power grid, were interconnected. At the end of 2013, the southern region was connected to the central power grid in synchronous mode, which realized the concept of one country, one power grid, one frequency. The Central Electricity Authority (CEA) of India is a national statutory body that advises the government of India on issues relating to the national electricity policy and establishes short-term and long-term plans, at times including renewable energy, for the development of the power system [44].


2.2 PUNJAB STATE POWER CORPORATION LIMITED

Punjab State Power Corporation Limited (PSPCL) is the electricity generating and distributing company of the Government of Punjab state in India. PSPCL was incorporated as a company in 2010 and was given the responsibility of operating and maintaining the State's own generating projects and distribution system. The power generation business of the erstwhile PSEB was transferred to PSPCL. PSPCL has been developing models for electricity prediction since its inception, and various factors impact the consumption of electricity [45]. Numerous studies highlight electricity consumption models and the needs of electricity consumption. A few of these studies are reviewed below:

2.3 OVERVIEW ON LOAD FORECASTING

Numerous researchers have been working to develop accurate prediction models for consumption. Some of them are discussed below:

Willis, H., & Aanstoos, J. (1979) introduced a variety of forecasting methods that have since been continuously developed and improved, and whose prediction accuracy has improved accordingly. Spatial load forecasting received attention during the 1970s and many methods were suggested, but the value of land use types for increasing predictability was not considered. The method has since been improved with elements of fuzzy logic; various strategies are available to forecast the consumption of electricity, such as fuzzy multi-objective decision making, cloud theory, and, at present, mathematical approaches [46].

Fan, S., & Hyndman, R. J. (2010) presented a comprehensive review of the state of the art in the prediction of electricity consumption and identified distinct perspectives. Statistical analysis is concerned with the estimation of system demand and peak demand on an hourly, daily, weekly, and annual basis. Electricity production is subject to several uncertainties, including weather conditions (temperature, rainfall, humidity, and so on), population growth, technology, economic conditions, and irregularities in specific usage [47].

A variety of transmission demand modeling approaches are available, from regression load forecasting to land-use modeling load forecasting, incorporating good practices such as INSITE (a long-term spatial load forecasting tool) and the advanced distribution load forecasting scheme Load SEER (Load Spatial Electric Expansion and Risk), to prepare and document consumption forecasts and to support managerial, public, and regulatory analysis [48].

2.3.1 Type of Load Forecasting Technique

Topalli, A. K., & Erkmen, I. (2003) proposed hybrid learning neural networks for short term load forecasting (STLF), whose horizon varies from an hour to a week. The work predicts the total consumption of electricity for the next day in Turkey. A network initialized with random weights produced unexpected predictions that were not acceptable for real-time operation. Therefore, the model is first trained with the available previous data, and the real load data collected by the Turkish Electricity Authority are then used online, so that the model performs online forecasting.

Additionally, a method for clustering the input data based on hourly electrical consumption has been presented. Several alternative models were created to understand the model's performance. After clustering in the proposed model, the average error was reduced to 9.5%, and in the hybrid case the error was 2.4%. Without clustering and with off-line learning on the same datasets, the average error was 10.6%, so the proposed online forecasting gives better outcomes than the previous off-line approach without clustering [49].

Ghiassi et al. (2006) presented the development of a dynamic artificial neural network model (DAN2) for medium-term electrical load forecasting (MTLF). Accurate MTLF gives utilities the information needed to plan power generation expansion (or purchase), schedule maintenance operations, carry out system improvements, negotiate future contracts, and build cost-effective fuel procurement strategies. They introduced an annual method that predicts future electrical requirements using previous monthly system loads. The authors also showed that including weather data enhances the accuracy of load forecasting. Nevertheless, such models need precise weather forecasts, which are often difficult to acquire. The models were tested on the Taiwan Power Company's actual system load data. All annual and seasonal models yield mean absolute percent error (MAPE) values below 1%, demonstrating the effectiveness of DAN2 in medium-term load forecasting. To assess the outcomes, the results were compared with multiple linear regression (MLR), ARIMA and a conventional neural network model [50].

Carpinteiro et al. (2007) addressed long-term load forecasting. Load forecasting is defined as the prediction of load behavior in the future. According to the time horizon, it may be categorized into short, medium, and long-term. Short-term predictions typically range from an hour to a week, medium-term predictions normally range from a week to a year, and long-term forecasts exceed a year. Short-term as well as long-term electricity demand forecasts play a significant role in planning future electricity generation. Long-term predictions can be used for system planning, scheduling the construction of new generation capacity, and the purchase of generating units [51].

Amjady, N., & Keynia, F. (2008) proposed a Mid-Term Load Forecasting (MTLF) model. Among the several kinds of MTLF, this work focused on the prediction of the daily peak electricity load for a month ahead. This kind of load forecasting has several uses, such as operational scheduling, medium-term hydrothermal coordination, adequacy assessment, management of limited energy units, negotiation of forward contracts, and development of efficient fuel procurement strategies. The daily peak load is a nonlinear, volatile, and non-stationary signal. Additionally, the problem is usually further complicated by the lack of sufficient data. To overcome this issue, a new scheme was presented that is composed of a data analysis structure, a prediction mechanism based on an ANN, and an evolutionary optimization approach. The scheme was compared with other MTLF methods, showing its ability to address the load forecasting problem [52].

Soares, L. J., & Medeiros, M. C. (2008) deployed an improved version of the Seasonal Integrated ARIMA (SARIMA) model for short term (hourly) load prediction for the region in the southeast of Brazil served by the electric utility. A separate model was built for every hour of the day, based on the decomposition of the daily series of each hour into two components. The first component is purely deterministic and relates to trends, seasonality, and the effect of special days. The second component is stochastic and follows a linear autoregressive (AR) model; as a next step, non-linear alternatives were also taken into account. The multi-step forecasting performance of the proposed approach was compared with existing systems, and the outcomes show that the approach is useful for forecasting electricity load in tropical conditions [53].

Pedregal, D. J., & Trapero, J. R. (2010) presented a general multi-rate approach to forecast, in an optimal way, load sampled at an hourly rate at a mid-term horizon. The scheme extends a previously defined short-term scheme, used to forecast load and prices, that is based on unobserved components. The approach involves estimating different models at different rates, mainly monthly and hourly, for the same data sampled at different intervals. Each model incorporates the data characteristics suited to its corresponding sampling interval, and both types of predictions are combined into a single forecast through efficient time aggregation strategies that are implemented naturally in a state space framework [54].

Darbellay, G. A., & Slama, M. (2000) compared different models applicable to short term forecasting, a problem faced by electricity suppliers, and examined whether the problem is non-linear. They first introduced a non-linear measure of statistical dependence. Next, the authors examined the linear and non-linear autocorrelation functions of Czech electric consumption. They then compared the forecasting accuracy of a non-linear model, namely the Artificial Neural Network (ANN), with that of a linear model, i.e., ARMA. The comparison showed that short-term forecasting of the Czech electric load can be considered primarily a linear problem [55].

El-Telbany, M., & El-Karmi, F. (2008) presented forecasting results for a three-layer feed-forward neural network that predicts the daily consumption of electricity from several factors. These factors are past production data, time effects, and temperature data. The neural network (NN) was trained using particle swarm optimization (PSO) and backpropagation (BP). PSO is a novel optimization strategy based on a social-psychological model. The outcome of the PSO-trained network was compared with the backpropagation neural network (BPNN) and the autoregressive moving average (ARMA) model. On comparable test data, the network trained by the PSO algorithm performed better than the network trained by the BP algorithm. As opposed to BP, particle swarm optimization is a gradient-free algorithm that only involves iterative evaluations of the objective function to reach an optimal outcome. Because of its effectiveness in searching large spaces and its ability to carry out a global search for the best forecasting model, it is a successful method [56].

Kandil et al. (2006) proposed an approach for short-term load forecasting based on Artificial Neural Networks (ANNs). They examined the ability of the model to predict electricity consumption without using load history as an input variable. Of the various weather variables used in earlier work, only temperature was considered here, and it was found that omitting the other variables, such as sky condition and wind velocity, had no negative impact. The input variables most commonly used are hour and day indicators, weather-related inputs, and previous consumption. For training and testing, weekly data were taken for one month. The network has to be trained on the data before it can generalize. Here the generalized delta rule (GDR), also known as the error backpropagation algorithm, was used to train the layered ANN. Enhanced outcomes can be obtained by considering points such as the use of advanced kinds of ANN, a better ANN architecture, better selection of input variables, and the selection of the training set [57].

Xiao et al. (2009) introduced a rough set backpropagation (RSBP) neural network (NN) for Short-term load forecasting (STLF) with many non-linear factors, in order to improve predictive accuracy. STLF plays an important role in managing the electricity consumption of any state in which the supply of electricity is insufficient for the increased requirement. The effect of noisy and weakly interdependent data on BP is avoided by attribute reduction based on rough set theory, thereby reducing the time required for learning. They analyzed RSBP's efficiency by comparing its forecasts with those of the BP network on load time series from a practical power system [58].

Catalão et al. (2007) presented a neural network approach to predict short-term electricity prices. In the past, energy supply was considered a public utility, and any price prediction tended to be long-term, concerning possible fuel prices and technological upgrades. These days, due to the increasing consumption of electricity, short-term forecasting has become increasingly important. In the current environment, producers and consumers are expected to derive their bidding strategies for the electricity market from short-term forecasts. Producers need accurate forecasting tools to increase their income, hedge against possible losses from future volatility, and optimize their services. A three-layer feed-forward neural network trained with the Levenberg-Marquardt algorithm was used to predict next-week electricity prices. The authors examined the forecasting performance of the proposed neural network (NN) method [59].

Goude et al. (2013) presented a semi-parametric approach based on generalized additive models to model electricity consumption at more than 2200 substations of the French distribution system, in both the short and medium term. These additive models estimate the relationship between the load and predictor variables such as temperature, calendar variables, and so on. The technique was applied to the French grid with improved results. The authors illustrated that the estimated functions describing the relationships between demand and its drivers are interpretable and that temperature forecasting is essential. The strength of this scheme is that it handles a large number of consumption series (approximately 2000) on the French grid simultaneously, without any human intervention [60].

Minaye, E., & Matewose, M. (2013) presented a practical approach that can be used as a reference for building models of the Jimma City electric power load. The work applies trend-based statistical analysis methods to study load characteristics and predictive accuracy, namely linear regression, the compound growth model, and quadratic regression. Monthly and annual demand data from the transformers of the Jimma distribution system were analyzed. Based on the optimized values of the regression coefficients and the mean absolute percentage error, the compound growth method was used to predict demand for the upcoming five years. The performance was analyzed with two parameters, the rank correlation coefficient and the Mean absolute percentage error (MAPE) [61].
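Since MAPE is used throughout the reviewed studies, a short sketch of how it is computed may help, together with a simple compound growth extrapolation of the kind mentioned above. The function names, the way the growth rate is estimated (from the first and last observations only), and the sample values are assumptions made for this illustration and do not reproduce the method of [61].

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def compound_growth_forecast(history, steps):
    """Extrapolate with the average compound growth rate of the history.

    Assumption: the rate is estimated from the first and last values only.
    """
    periods = len(history) - 1
    rate = (history[-1] / history[0]) ** (1.0 / periods) - 1.0
    return [history[-1] * (1.0 + rate) ** k for k in range(1, steps + 1)]


# Illustrative annual demand values, not data from any cited study.
history = [100.0, 110.0, 121.0]
forecast = compound_growth_forecast(history, steps=2)
error = mape([100.0, 200.0], [110.0, 180.0])
```

A lower MAPE indicates a closer fit; because the error is expressed relative to the actual value, it can be compared across series of different magnitude.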

2.4 CLASSIFICATION OF LOAD FORECASTING

Willis, H. L., & Romero, J. (2007) explained that spatial load forecasting approaches have evolved over the years, as statistical instruments and strategies from other fields of expertise were incorporated into the development efforts of the electricity delivery utilities. The methods are grouped into three categories to show the differences in the databases considered by the different methodologies. The three categories are non-analytic, trending, and simulation, as shown in figure 2.1. Non-analytic methods forecast without considering historical data.

[Figure: tree of Spatial Load Forecasting Methods. Non-Analytic: Manual, Computerised. Analytic: Trending, Multivariate. Further labels: Single Area, Multiple Area, Landuse, Extrapolation, Regression, Decomposition, Clustered, Load Transfer, Vacant Area]

Figure 2:1 Spatial electric load forecasting methods

The other two approaches use historical data to predict future outcomes. In the second approach (trending), historical data are analyzed and used to produce predictions [62].

2.4.1 Trending methods

Trending methods forecast the peak load in every sub-area by extrapolating the available data. In general, the input data for these approaches are each small area's historical demand, and different methods are used to approximate the loads in vacant areas. The database may also include the information needed to generate the electricity consumption curve. These strategies have appealed to electricity distributors because they need only limited databases to forecast feeder loads; however, because they do not capture inter-area relationships, they provide little information for utility substation expansion planning [63].
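The extrapolation idea behind trending methods can be illustrated with a minimal sketch. It assumes a straight-line trend fitted by least squares to each sub-area's historical peak load; the sub-area names, the load values, and the helper functions `fit_linear_trend` and `extrapolate_peak` are hypothetical and are not taken from [63].

```python
def fit_linear_trend(years, loads):
    """Ordinary least-squares line: load = intercept + slope * year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(loads) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, loads))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def extrapolate_peak(history, target_year):
    """history maps a sub-area name to a list of (year, peak_load_MW) pairs.

    Each sub-area is treated independently, so inter-area relationships
    are ignored, which is the limitation noted in the text.
    """
    forecast = {}
    for area, records in history.items():
        years = [year for year, _ in records]
        loads = [load for _, load in records]
        intercept, slope = fit_linear_trend(years, loads)
        forecast[area] = intercept + slope * target_year
    return forecast


# Hypothetical peak loads (MW) for two sub-areas.
history = {
    "area_A": [(2015, 10.0), (2016, 12.0), (2017, 14.0), (2018, 16.0)],
    "area_B": [(2015, 5.0), (2016, 5.5), (2017, 6.0), (2018, 6.5)],
}
forecast_2020 = extrapolate_peak(history, 2020)
```

Because every sub-area is fitted on its own, the database stays small, but a load transfer between neighbouring areas would appear only as an unexplained change in each area's trend.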

Salvó, G., & Piacquadio, M. N. (2017) presented a novel approach for estimating the spatial growth in electricity demand. The new approach adds value to, but does not replace, the traditional tools used by utilities. As a result of the approach, the consumer demand was split into two multifractals (one being surrounded by a suburban area of lower demand), with a border between them, demonstrating that the seemingly random distribution of demand has an internal structure that can only be seen through multifractal analysis. The results show properties such as stability and constant dimensionality of the frontier, which are measured naturally in terms of multifractal dimensions. The method could also be strengthened from a geographical and demographic point of view by evaluating urban models and land-use models adjusted for the area in which the utility operates [64].

2.5 LOAD FORECASTING METHODS

A variety of forecasting methods, developed to varying degrees, have been discussed in the literature for predicting power consumption, including multiple linear regression, non-linear regression, and multivariate regression models. The approaches used for medium- and long-term forecasting are (a) end-use models, (b) econometric models, and (c) statistical model-based learning. The models used for short-term forecasting are (a) the similar-day approach, (b) regression methods, and (c) time-series methods. Artificial neural network schemes such as the backpropagation neural network (BPNN), the particle swarm optimization (PSO) dynamic artificial neural network, the Elman artificial neural network, and the Jordan recurrent neural network have also been applied. Simple autoregressive (AR), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA) models have been presented earlier. The present work focuses on the methods best suited to forecasting load consumption, i.e., ARIMA and ANN. Because seasonal elements are present in the time series data, an ARIMA model has been used for prediction [65].

Al-Hamadi, H. M. (2011) presented long-term load forecasting using fuzzy logic. A fuzzy linear regression model was developed using factors affecting the load, i.e., loads from previous years, population, and annual growth factors. In this fuzzy approach, a linear optimization problem was formulated to minimize the spread of the fuzzy regression parameters. Annual growth was calculated using cubic polynomials for each of the long-term forecast factors. The results showed that the absolute error of the projected average daily load did not exceed 3.68% of the actual load over the whole year. According to the results, the proposed model and forecasting technique offered a significant advantage over existing models in reducing the average absolute error between the actual and predicted loads over a given period [66].

AlRashidi, M. R., & El-Naggar, K. M. (2010) introduced a novel approach to annual peak load prediction in electrical power systems. Particle swarm optimization (PSO) was proposed for long-term forecasting to reduce the error associated with the estimated model parameters. The work used recent recorded data from the Kuwaiti and Egyptian networks and covered (a) model parameter estimation for the Egyptian system, (b) model parameter estimation for the Kuwaiti network, and (c) maximum load demand forecasting for the Kuwaiti network. The average error produced by the model was analyzed, and the model was used to predict the load. Predictions obtained with PSO were compared with those obtained with the LES technique, and PSO predicted the data more accurately than the LES approach [67].

Al-Saba, T., & El-Amin, I. (1999) presented an Artificial Neural Network (ANN) based on a multilayer perceptron for long-term forecasting of energy requirements in an electric utility. The time series models AR, ARMA, and ARIMA were compared with the ANN. In this work, the ANN had a single neuron in the output layer in all cases, while the number of neurons in the input layer varied with the neural network (NN) model. The ANN was trained with the backpropagation (BP) algorithm, which is designed to reduce the mean squared error (MSE) between the desired output and the output produced during training [68].

Chatfield, C. (2001) noted that forecasting the events of a nonlinear system opens up the opportunity for constructive planning and for accomplishing business objectives effectively. However, modelling and predicting the future events of nonlinear processes is difficult because of noise and non-stationarity. Time series data are a collection of observations of a numerical attribute recorded at regular intervals. Even for a domain analyst, it can be infeasible to understand the historical structure well enough to make a proper decision; for example, it is difficult for a stock market expert to recognize the rise and fall of a stock price reliably and correctly. Time series or historical data are therefore used to build a model of the system, and how best to evaluate non-linear systems is still being researched [69].

2.5.1 Literature review of studies using ARIMA or its Hybrid models for

forecasting

The ARIMA model has become very popular because of its ability to deal with several kinds of data, both stationary and non-stationary. However, ARIMA presupposes a linear relationship between historical and predicted data; it works well for linear time series but does not give relevant results for non-linear data. A single model is therefore not sufficient to produce good results in time series forecasting, and for this reason a hybrid of ARIMA and ANN is applied here [70].

Dong et al. (2016) proposed a hybrid approach to forecasting future residential energy consumption in two steps. First, non-AC electricity consumption is predicted from past data; since internal heat gain is highly correlated with non-AC electricity consumption, heat convection and conduction can be calculated directly from the non-AC prediction. Second, the expected weather is fed into the thermal network differential-algebraic equations (DAEs), together with the internal heat gain forecast, to simulate the zone temperature. The simulated temperature change is then passed to an AC regression model with a set-point schedule, from which the AC cooling power consumption is predicted. Total electricity consumption is obtained by summing the AC and non-AC forecasts. The work was verified using one month of data from four residential buildings. Five other approaches (ANN, SVM, LSSVM, GPM, and GMM) were compared on the same inputs, and the hybrid approach performed well [71].

David et al. (2016) evaluated a combination of two linear models widely used in econometrics, ARMA and GARCH (Generalized Auto-Regressive Conditional Heteroskedasticity), for probabilistic solar irradiance forecasts. A recursive estimation of the model parameters was developed to provide a framework that can be implemented easily in an operational context. Like models based on machine learning techniques, the proposed model can produce reliable point predictions using only solar radiation records; in addition, the recursive ARMA-GARCH model is more convenient to build and provides information about the uncertainty of the forecasts. The ARMA-GARCH approach is an effective combination for preparing very short-term solar radiation forecasts with confidence intervals, and the recursive ARMA model provides a simple and practical method for point prediction. Using only the measured solar radiation values, the approach was reported to be superior to other statistical models [72].


Alsharif et al. (2019) proposed a seasonal ARIMA model to forecast daily and monthly solar radiation in Seoul, South Korea. The data, covering the 37 years from 1981 to 2017, were obtained from the Korean Meteorological Administration. The model was checked using the autocorrelation and partial autocorrelation functions of the residuals; the root mean square error was then computed, and the results were compared with Monte Carlo simulations. The ARIMA (1,1,2) model was used to represent daily solar radiation. For monthly solar radiation, ARIMA (4,1,1) with 12 lags for both the AR and MA parts was used, giving monthly solar radiation of 176 – 377 Wh/m² [73].

Al-Musaylh et al. (2018) focused on data-driven techniques for short-term forecasting of electricity demand data (G-data) over half-hour, one-hour, and 24-hour prediction horizons. The study was based on three models: Multivariate Adaptive Regression Splines (MARS), Support Vector Machine (SVM), and ARIMA. The work focused on Queensland (Australia), where end-user load demand is increasing. For the half-hour and one-hour horizons, the MARS model performed better than the SVM and ARIMA models, with the largest Willmott's Index (WI) and the smallest MAE. Accuracy was measured using RMSE, MAE, and relative RMSE [74].

2.5.2 ANN and ARIMA

Time series prediction is an active research area because it plays a vital role in forecasting and decision-making in various applied fields. In this work, the task is to improve forecasting efficiency. The two algorithms, ARIMA and ANN, have long been applied extensively to time series prediction [75].

Bedi, J., & Toshniwal, D. (2019) proposed a deep learning-based model for electricity demand forecasting that takes long-term historical dependencies into account. First, cluster analysis is carried out on the electricity consumption data of all months to produce season-based segments. Load pattern analysis then gives deeper insight into the data that fall into each cluster. The proposed approach was compared with a Recurrent Neural Network, SVM, and ANN regression models, and it was concluded that the proposed approach outperforms them and can predict electricity demand effectively. The model is scalable and supports incremental learning: the moving-window-based MIMO approach integrates statistical data with new real-time observations to predict electricity consumption accurately [76].

Khashei, M., & Bijari, M. (2010) presented a new hybrid of ANN and ARIMA intended to be more accurate than ANN alone. The scheme builds on Box-Jenkins linear modelling; the time series is regarded as a non-linear function of several past observations and random errors. In the first stage, ARIMA is used to generate the necessary data; in the second stage, an ANN is used to capture the underlying data-generating process and to forecast the future using the pre-processed data. The proposed model produced more accurate results than Zhang's hybrid model and than ARIMA and ANN used separately, over three horizons (1 month, 6 months, and 12 months) and for both error measures [77].

Babu, C. N., & Reddy, B. E. (2014) developed a novel hybrid ARIMA-ANN model for time series prediction. Various hybrid ARIMA-ANN models in the literature apply an ARIMA model to the time series, treat the error between the actual data and the ARIMA output as the non-linear component, and model it with an ANN in different ways. While these models give more accurate predictions than the individual models, there is room for further improvement if the nature of the given time series is taken into account before the models are applied. In this work, the nature of volatility was explored using a moving average filter, and the ARIMA and ANN models were then applied accordingly. The hybrid model was applied to a simulated dataset and to empirical datasets such as sunspot data, electricity price data, and stock market data. The results on these datasets indicated that the hybrid method has higher prediction accuracy for both one-step and multi-step forecasts [78].
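The role of a moving average filter in such a hybrid can be sketched as follows. This is a minimal illustration, assuming a centred window that is truncated at the series edges; the function name `moving_average_split`, the window length, and the sample values are hypothetical and are not taken from [78].

```python
def moving_average_split(series, half_window):
    """Split a series into a smooth part and a residual part.

    A centred window of length 2 * half_window + 1 is used; near the
    edges the window is truncated (an assumption made for this sketch).
    """
    n = len(series)
    smooth = []
    for t in range(n):
        lo = max(0, t - half_window)
        hi = min(n, t + half_window + 1)
        window = series[lo:hi]
        smooth.append(sum(window) / len(window))
    residual = [x - s for x, s in zip(series, smooth)]
    return smooth, residual


# Hypothetical observations.
series = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0]
smooth, residual = moving_average_split(series, half_window=1)
```

In a hybrid of this kind, one part would typically be passed to the linear model and the other to the non-linear model, and the two forecasts would be added back together.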

Khandelwal et al. (2015) examined the benefits of DWT for improving the accuracy of time series forecasting. They proposed a forecasting technique that uses DWT to separate a time series into linear and non-linear components. DWT is first used to decompose the in-sample training data into a linear (detailed) part and a non-linear (approximate) part. ARIMA and ANN models are then used to recognize and predict the reconstructed detailed and approximate components, respectively. In this way, the approach uses the distinct strengths of DWT, ARIMA, and ANN to improve forecasting accuracy. The approach was tested on four real-world time series, and its predictions were compared with those of ARIMA, ANN, and Zhang's hybrid model. The results indicated that the proposed approach achieved the best forecasting accuracy on each series [79].

Lee, W. J., & Hong, J. (2015) developed a hybrid time series model for mid-term load prediction and tested its quality by applying it to real load data from the metropolitan area of Seoul, South Korea, alongside a standard dynamic model (the Koyck model) and the ARIMA model. The proposed hybrid model and the Koyck model both included a quadratic dependence on air temperature. The hybrid model gave better predictions than the Koyck and ARIMA models. It significantly reduced the forecasting error and its periodic variance and may be a powerful tool for mid-term forecasting with measured air temperature data. The main benefits of the hybrid model are that (a) it removes the need to model non-weather determinants statistically and (b) it can be extended to a more complex model by incorporating quantitative analysis of other independent factors [80].

Rana, M., & Koprinska, I. (2016) presented advanced wavelet neural networks (AWNN) for very short-term load forecasting (VSTLF), which predicts electricity consumption from minutes to hours ahead. They investigated wavelet decomposition together with non-linear algorithms for feature selection and prediction in order to obtain a more accurate VSTLF model. A wavelet decomposition technique was applied to obtain a set of frequency components that represent the data, and each component was predicted separately. AWNN's accuracy was tested using two years of Australian electricity load data sampled every five minutes and two years of Spanish data sampled every 60 minutes [81].

Dudek, G (2016) implemented a univariate model for short-term load forecasting (STLF) based on linear regression and patterns of the daily load time series. The number of predictors is reduced to one through principal component regression or partial least-squares regression, which simplifies the regression function. The models estimate two quantities, the mean and the variance, by the least-squares method. This is a significant benefit compared with STLF models based on ARIMA, exponential smoothing, neural and neuro-fuzzy networks, or SVM, which involve dozens or hundreds of parameters whose estimation requires advanced optimization methods. Pattern-based STLF models had been used previously, namely MLP and N-WE. Although those models are nonlinear, the proposed linear models were reported to have better extrapolation properties [82].

Barak, S., & Sadegh, S. S. (2016) estimated annual energy consumption in Iran using three ARIMA-ANFIS patterns. In the first pattern, an ARIMA model with four input features is used along with six features extracted by the Adaptive Neuro-Fuzzy Inference System (ANFIS); the data are clustered using C-means and then used to train the system, which is in turn used for forecasting. In the second pattern, ARIMA's forecast is used as an input variable for ANFIS prediction in addition to the four input features, so that energy is predicted with six different ANFIS structures using the four inputs plus the ARIMA output. In the third pattern, the second pattern is implemented with the AdaBoost (Adaptive Boosting) data diversification model, and a novel ensemble technique is introduced to deal with data insufficiency [83].


2.5.3 Wavelet Decomposition based ARIMA and ANN

Wavelet transformation is a decomposition-based approach used to split data into low- and high-frequency components. Here, the electricity time series data are decomposed using the WT method. Wavelet transforms have improved the forecasting accuracy of both ARIMA and ANN in many areas, such as forecasting electricity prices, stock prices, and short-term load consumption [84].

Sun et al. (2019) presented a seasonal and trend decomposition (STL) approach to design a model for monthly electricity consumption prediction. The method combines the STL and ARIMA models into a single model, whose performance is compared with three models: ARIMA, SARIMA, and a model formed as the product of three components, (i) trend, (ii) seasonal, and (iii) random. STL is first used, according to the characteristics of electricity consumption, to decompose the monthly electricity consumption time series into seasonal, trend, and random components. The changes in the characteristics of the three components over time are then considered. Finally, a suitable model is selected to estimate each component, and the component forecasts are recombined into the monthly electricity consumption forecast [85].

Nury et al. (2017) presented an alternative temperature prediction approach that combines the wavelet technique with the ARIMA model and the ANN, applied to maximum and minimum monthly temperature data. Model calibration and validation performance are analyzed systematically, and the relative performance is evaluated on the predictive ability of out-of-sample forecasts. The predictive ability and reliability of the two models on in-sample and out-of-sample data were evaluated using RMSE, the percentage of bias (PBIAS), and the consensus coefficient. The ANN model was trained with the Levenberg–Marquardt (LM) algorithm in MATLAB, because this algorithm is efficient, fast, and accurate [86].

Pannakkong, W., & Huynh, V. N. (2017) developed a hybrid model of ARIMA and ANN combined with DWT. DWT is used to decompose the time series into detail and approximation components. Each is then modelled with Zhang's hybrid scheme, which uses ARIMA and ANN to extract the linear and non-linear components. The component forecasts are combined to produce the final forecast. The model was tested on three well-known time series datasets: the Canadian lynx, the British pound/US dollar exchange rate, and Wolf's sunspot data. The results indicate that the proposed model is more accurate than ANN, ARIMA, and Zhang's hybrid model. The parameters used are MSE, MAE, and MAPE [87].
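The three error measures named above have standard definitions, which can be written out as a short sketch. The function names and the sample values are hypothetical; MAPE is expressed in percent and assumes that no actual value is zero.

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and forecast loads.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
scores = {
    "MSE": mse(actual, predicted),
    "MAE": mae(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

MSE penalizes large errors more heavily than MAE, while MAPE is scale-free and therefore convenient when comparing series of different magnitudes.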


CHAPTER 3: METHODOLOGY

This chapter focuses on the methodology proposed for forecasting the electricity

consumption based on hybridization of ARIMA, DWT, Cuckoo search and ANN.

The chapter will discuss the proposed methodology in detail.

3.1 METHODOLOGY OF THE PROPOSED HYBRID FORECASTING

MODEL

In this research work, the main objective is to estimate future electricity consumption in Punjab state. The datasets required for the research were obtained from PSPCL, India. The steps followed to achieve the objectives of the proposed work are as follows:

Step 1: The present study utilizes original data from Punjab State Power

Corporation Limited (PSPCL).

Step 2: The Discrete Wavelet Transform (DWT), a discretized form of the continuous wavelet transform (CWT), is applied to decompose the time series data into an integer number of data samples. The number of coefficients obtained equals the number of samples in the original data before decomposition, which helps produce more accurate outcomes.

Step 3: A further level of DWT decomposition divides the complete range of electricity consumption into four distinct categories with different frequency ranges, i.e., (LL), (LH), (HL), (HH).

These decomposed components help in identifying the highest as well as the lowest electricity consumption in Punjab state according to the PSPCL datasets. The decomposition is done using a Low Pass Filter (LPF) and a High Pass Filter (HPF), as explained below:

Low Pass Filter (LPF): This filter extracts the lower-frequency part of the signal. It attenuates signals with a frequency above the cut-off frequency by allowing lower frequencies to pass through the filter.

High Pass Filter (HPF): This filter attenuates signals with a frequency lower than the cut-off frequency by allowing higher frequencies to pass through the filter.
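The low-pass and high-pass filtering described above can be illustrated with a one-level Haar transform, the simplest wavelet filter pair. This is a sketch in Python rather than the MATLAB used in this work; the choice of the Haar wavelet, the function name `haar_dwt`, and the sample values are assumptions made for illustration only.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT of an even-length sequence.

    The low-pass branch (scaled pairwise sums) gives the approximation
    coefficients; the high-pass branch (scaled pairwise differences)
    gives the detail coefficients. Both are downsampled by two.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


# Hypothetical consumption values.
load = [4.0, 6.0, 10.0, 12.0]
approx, detail = haar_dwt(load)
```

The two outputs together contain as many coefficients as the input has samples, which matches the remark in Step 2 that the number of values is preserved by the decomposition.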

Step 4: An ARIMA model is applied individually to each decomposed frequency band, low as well as high, to model the time series data. The bands fall into four categories, i.e., LL, LH, HL, and HH, so an ARIMA model is applied to each of the four, as depicted in figure 3.1. ARIMA1 is applied to the first category (LL) of electricity consumption, ARIMA2 to the values of the second decomposition (LH), ARIMA3 to the third decomposed category (HL), and ARIMA4 to the last decomposed component (HH).

[Figure 3.1: flowchart of the hybrid ARIMA model. Stages in order: original data from PSPCL; apply DWT; sub-bands LL, LH, HL, HH; ARIMA1, ARIMA2, ARIMA3, ARIMA4 (one per sub-band); apply inverse DWT; optimize data using CS; model forecasting using ANN.]

Figure 3.1 Proposed Work


Step 5: After the ARIMA models are applied individually, the results are combined; the autoregressive and moving average analysis is performed together with the integrated (I) element of ARIMA to obtain the time series data.

Step 6: The inverse DWT (IDWT) is then applied to the outputs of the four ARIMA models to combine them. This builds a well-structured record that is used for training the proposed model.

Step 7: At the classification phase, the ANN requires distinctive and accurate data for training. To obtain the best record of the previous months' electricity consumption, the CS optimization strategy with a novel fitness function is applied.

Step 8: The more distinctive the data record, the easier and faster the training of the system. The data record obtained through CS optimization is therefore passed to the classification phase as the training set for the ANN. Finally, the trained ANN is used to forecast consumption for the upcoming month on the basis of previous time series values of electricity consumption.

3.2 PROGRAMMING LANGUAGE

The choice of programming language was between two classes: first, established and widely used languages such as C++, MATLAB, and R; and second, newer Java-based tools such as WEKA and RapidMiner. Although MATLAB is not as fast as C++ and is not open source like some of the other candidates, it was judged the best choice. It is widely used and offers many capabilities and readily available code. Another advantage is its balance between language complexity and ease of use: the user can see low-level details without having to spend much effort developing frequently used code snippets and data structures. In addition, parallelization and the compactness of vectorized code help reduce the runtime [88].


CHAPTER 4: PROPOSED WORK

This chapter provides an overview of the forecasting algorithms used to develop the novel hybrid forecasting model and discusses the research activities carried out to achieve each objective. Section 4.1 discusses the forecasting algorithms used to develop the proposed hybrid model. The sections from Section 4.2 onwards discuss in detail the research activities carried out to achieve objectives 1 to 4.

4.1 ALGORITHMS USED TO DEVELOP THE PROPOSED HYBRID

FORECASTING MODEL

This section briefly discusses the algorithms used to develop the proposed hybrid forecasting model. Time series forecasting has been an active area of research for the past few years. A wide range of time series forecasting algorithms is available, and numerous researchers have tried to increase prediction accuracy by developing new and hybrid algorithms. In the present research, a novel hybrid forecasting model has been developed using well-known algorithms of different classes, i.e., Auto Regressive Integrated Moving Average (ARIMA), Artificial Neural Network (ANN), Discrete Wavelet Transform (DWT), and Cuckoo Search (CS) [89]. These algorithms are explained in detail in the following sections.

4.1.1 ARIMA model

The ARIMA model is a statistical tool used for time series forecasting problems. It is easy and flexible to implement because it needs only past observations of the variable of interest. It is a linear modelling scheme that combines three components, AR, I, and MA, which are briefly described below:

Auto Regression (AR): a model that uses the dependent relationship between an observation and a number of lagged observations.

Integrated (I): differencing of the raw observations, for example subtracting the previous observation from the current one, to make the time series stationary.

Moving Average (MA): a model that uses the dependency between an observation and the residual errors of a moving average model applied to lagged observations.

The ARIMA model was introduced by Box and Jenkins in 1976. In this work, MATLAB R2012a was used; it proved useful for the preparation of data and the computation of the autocorrelation function (ACF) and partial autocorrelation function (PACF). The three main components of the ARIMA model described above are expressed through the parameters of the model. The standard notation is ARIMA (p, d, q), in which the parameters take integer values, as follows:

p: the number of lagged observations included in the prediction model.

d: the number of times the raw observations are differenced, also known as the “degree of differencing”.

q: the size of the moving average window, also known as the “order of the moving average”.
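The role of the parameter d can be made concrete with a short sketch of differencing. The function name `difference` and the sample series are hypothetical; the series is chosen to have a quadratic trend, so that it becomes constant after two rounds of differencing.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y_t = x_t - x_(t-1)."""
    result = list(series)
    for _ in range(d):
        result = [result[t] - result[t - 1] for t in range(1, len(result))]
    return result


# Hypothetical series with a quadratic trend.
raw = [1, 4, 9, 16, 25]
once = difference(raw, d=1)
twice = difference(raw, d=2)
```

Each round of differencing shortens the series by one observation, so d is normally kept as small as is needed to remove the trend.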

The ARIMA model combines the AR and MA models, following Box and Jenkins. Mathematically, the AR model can be written as equation (4.1).

$$z_t = K + \vartheta_1 z_{t-1} + \vartheta_2 z_{t-2} + \cdots + \vartheta_m z_{t-m} + n_t \qquad (4.1)$$

where $\vartheta_1, \vartheta_2, \ldots, \vartheta_m$ are the coefficients at the first, second, and last lag, respectively, $K$ is a constant, and $n_t$ is white Gaussian noise.
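As an illustration of equation (4.1), the sketch below fits the simplest case, m = 1, by ordinary least squares and produces a one-step forecast. The restriction to a single lag, the function names, and the noise-free sample series are assumptions made for this example; estimation in practice would follow the Box-Jenkins procedure.

```python
def fit_ar1(series):
    """Least-squares estimate of K and theta1 in z_t = K + theta1 * z_(t-1) + n_t."""
    prev = series[:-1]
    curr = series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((x - mean_prev) ** 2 for x in prev)
    sxy = sum((x - mean_prev) * (y - mean_curr) for x, y in zip(prev, curr))
    theta1 = sxy / sxx
    K = mean_curr - theta1 * mean_prev
    return K, theta1


def forecast_next(series, K, theta1):
    """One-step-ahead forecast from the last observed value."""
    return K + theta1 * series[-1]


# Hypothetical noise-free series generated by z_t = 2 + 0.5 * z_(t-1).
z = [10.0, 7.0, 5.5, 4.75, 4.375]
K, theta1 = fit_ar1(z)
next_value = forecast_next(z, K, theta1)
```

With real data the noise term is not zero, so the estimates would only approximate the underlying coefficients.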

The values computed with equation (4.1) are used to predict the next value. The MA model can be written mathematically as equation (4.2).

(4.2)


The MA model is a regression on lagged error values. An MA filter also helps in decomposing the data; the MA filter of order f is given by equation (4.3).

(4.3)

Where f = 2i + 1, so the series at time t is averaged over i observations on either side to obtain the trend. The remaining values behave like a random residual. The residual component M2 is obtained as the difference between the original series and the trend. In equation (4.3), M2 represents the residual part, M1 signifies the trend part, and M represents the original series [90]. The result from the MA filter can be combined with the AR output to form the ARIMA model. Mathematically, it is represented as follows:

(4.4)

4.1.2 DISCRETE WAVELET TRANSFORM (DWT)

The second technique used in the proposed hybrid model is the Discrete Wavelet Transform (DWT). DWT is a type of wavelet transformation that decomposes a signal into a set of orthogonal basis functions of different frequencies. Its main feature is that it is a lossless transformation: the original signal can be recovered by applying the inverse DWT. The orthogonal basis functions are localized, each covering only a fraction of the total signal length. Each DWT wavelet function is a dilated, translated, and scaled version of a common function known as the mother wavelet [91].
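The lossless property can be checked numerically. The sketch below, which assumes the Haar wavelet and uses the hypothetical helper names `haar_forward` and `haar_inverse`, decomposes a short signal and then rebuilds it from its coefficients.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level Haar DWT: returns (approximation, detail) for an even-length input."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Inverse of haar_forward: interleaves the reconstructed sample pairs."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / SQRT2)
        signal.append((a - d) / SQRT2)
    return signal


# Hypothetical signal values.
original = [2.0, 9.0, 4.0, 7.0, 1.0, 3.0]
approx, detail = haar_forward(original)
rebuilt = haar_inverse(approx, detail)
max_error = max(abs(x - y) for x, y in zip(original, rebuilt))
```

The reconstruction error is limited to floating-point rounding, which is what allows forecasts made on the individual sub-bands to be recombined with the inverse transform.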

The discrete wavelet transform is used to decompose non-stationary signals such as image, audio, or video signals. The signal is represented in terms of small waves (wavelets). DWT separates the signal into two major frequency sub-bands, namely a high-frequency band and a low-frequency band. Information about edges is carried in the high-frequency band, whereas the low-frequency band is further decomposed into higher and lower frequencies. Watermarking is generally carried out in the high-frequency band [92].


To achieve decomposition at every point in two-dimensional systems, DWT is

applied in both the vertical and horizontal direction, as shown in Figure 4.1.

Figure 4.1 Horizontal wavelet transform, giving the two sub-bands L and H [93]

As shown in figure 4.1, the signal is decomposed horizontally into two sub-bands, namely L and H. Vertical decomposition is then applied to these sub-bands, as shown in figure 4.2.

Figure 4.2 Vertical wavelet transform applied to the horizontal sub-bands L and H, giving LL1, LH1, HL1, and HH1 [93]

Four sub-bands, LL, HL, LH, and HH, exist at the first level of decomposition. For each subsequent level of decomposition, the LL sub-band of the previous level is used as the input. In the second level of decomposition, the LL1 band splits into four further sub-bands named LL2, HL2, LH2, and HH2.

Figure 4.3 One-, two-, and three-level discrete wavelet decompositions of the original signal. Level 1: LL1, LH1, HL1, HH1. Level 2: LL2, LH2, HL2, HH2 together with LH1, HL1, HH1. Level 3: LL3, LH3, HL3, HH3 together with LH2, HL2, HH2 and LH1, HL1, HH1 [93]

As shown in figure 4.3, at the third level of decomposition LL2 splits further into four sub-bands: LL3, HL3, LH3, and HH3. Thus a total of ten sub-bands is obtained. LL3 is the lowest-frequency sub-band, while LH1, HL1, and HH1 are the highest-frequency sub-bands. In image processing, this multi-resolution structure is what allows the wavelet transform to be used for de-noising and compression of images. The wavelet transform is an efficient tool for representing the required data, and its multi-resolution analysis makes it possible to extract relevant information from large datasets [93].
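The lossless, multi-level behavior described above can be illustrated for a one-dimensional series (the case relevant to load data, where each level yields one approximation and one detail band rather than the four image sub-bands). The following is a minimal sketch, assuming the orthonormal Haar wavelet and a series whose length is divisible by 2 raised to the number of levels; the function names and the sample values are illustrative and do not come from the thesis.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar DWT on an even-length list."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level, interleaving the recovered sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def haar_decompose(signal, levels):
    """Return [approx_L, detail_L, ..., detail_1] for the given number of levels."""
    coeffs = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        coeffs.insert(0, detail)
    coeffs.insert(0, approx)
    return coeffs

def haar_reconstruct(coeffs):
    """Rebuild the original series from the coefficient list."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_inverse_step(approx, detail)
    return approx

# Hypothetical sample values, not taken from the PSPCL dataset.
load = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coeffs = haar_decompose(load, levels=3)
restored = haar_reconstruct(coeffs)
```

Only the approximation band is decomposed again at each level, which mirrors the way the LL band is reused as the input of the next level in the two-dimensional case.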

4.1.3 CUCKOO SEARCH OPTIMIZATION ALGORITHM

The Cuckoo Search (CS) algorithm is the third algorithm used in the proposed hybrid algorithm. It is a nature-inspired algorithm motivated by the reproduction and egg-laying behavior of cuckoo birds, in particular their habit of laying eggs that imitate the color and pattern of the host's eggs in the nests of other birds. Each egg in a nest represents a solution, whereas a cuckoo egg stands for a new solution. The primary aim of the algorithm is to use the new and potentially better solutions (i.e., cuckoos) to replace the worse solutions in the nests. The CS search process consists of two components, a local and a global random walk. CS thereby implements the two key features of modern meta-heuristic algorithms: intensification and diversification. The former improves the current best solutions through a guided random walk, while the latter maintains population diversity through Levy flights. The switching probability (Pa) regulates the balance between the two stochastic search components. The algorithm for the Cuckoo Search is as follows:

Algorithm: Cuckoo Search (CS)

Initialize the population of 'N' nests.

Evaluate nests.

While (termination criteria not met)

    Randomly generate a new solution from the best nest.

    Randomly choose a nest from the population.

    If the new solution is better than the chosen nest, replace the chosen nest with the new solution.

    Abandon worse nests and replace them with randomly generated nests.

This algorithm solves the global optimization problem by balancing the two random walks, local and global. The balance is controlled by the switching probability $P_a$. The local random walk is given by equation (4.5) and the global random walk by equation (4.6):

$x_i^{t+1} = x_i^{t} + \alpha R \times J(P_a - \epsilon) \times (x_j^{t} - x_k^{t})$ (4.5)

$x_i^{t+1} = x_i^{t} + \alpha L(R, \lambda)$ (4.6)

Table 4.1 lists the parameters of the local and global random walks [94].

Table 4.1 Parameters of local and global random walk

Parameters    Detail

$x_j^{t}$ and $x_k^{t}$    Present locations, chosen by random permutation

$\alpha$ (+ve)    Step size scaling factor

$x_i^{t+1}$    Subsequent location

R    Step size

×    Entry-wise product of two vectors

J    Heaviside function

$P_a$    Employed to switch between local and global walk

$\epsilon$    A random number from a uniform distribution

$L(R, \lambda)$    Levy distribution used to select the step size of the random walk
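The interplay of the two random walks can be sketched in a few lines of code. The following is a simplified illustration, not the implementation used in this work: it assumes Mantegna's algorithm for drawing Levy-distributed steps, a uniform random factor in the local walk, and a hypothetical sphere function as the objective. All function names, parameter defaults, and the test objective are assumptions made for the example.

```python
import math
import random

def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length with Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0) or 1e-12  # guard against division by zero
    return u / abs(v) ** (1 / beta)

def cuckoo_search(objective, dim, bounds, n_nests=15, pa=0.25,
                  alpha=0.1, iters=200, seed=0):
    """Minimize `objective`; returns (best nest, best value, best-value history)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(x):
        return min(max(x, lo), hi)

    nests = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_nests)]
    fitness = [objective(n) for n in nests]
    history = []
    for _ in range(iters):
        # Global random walk via Levy flights: a candidate replaces a
        # randomly chosen nest only if it is better.
        for i in range(n_nests):
            cand = [clip(x + alpha * levy_step(rng)) for x in nests[i]]
            j = rng.randrange(n_nests)
            f = objective(cand)
            if f < fitness[j]:
                nests[j], fitness[j] = cand, f
        # Local random walk: every nest except the current best is abandoned
        # with probability pa and rebuilt from two randomly picked nests.
        best_idx = fitness.index(min(fitness))
        for i in range(n_nests):
            if i != best_idx and rng.random() < pa:
                p, q = rng.sample(nests, 2)
                r = rng.random()
                cand = [clip(x + r * (a - b)) for x, a, b in zip(nests[i], p, q)]
                nests[i], fitness[i] = cand, objective(cand)
        history.append(min(fitness))
    best_idx = fitness.index(min(fitness))
    return nests[best_idx], fitness[best_idx], history

def sphere(x):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(v * v for v in x)

best, best_value, history = cuckoo_search(sphere, dim=2, bounds=(-5.0, 5.0))
```

Because the best nest is never abandoned and a nest is overwritten by a Levy candidate only when the candidate is better, the best value recorded in `history` cannot get worse from one iteration to the next.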

4.1.4 ARTIFICIAL NEURAL NETWORK

The last technique used in the proposed hybrid algorithm is the Artificial Neural Network (ANN). ANN is an efficient and successful alternative to ARIMA models for modeling time series relationships with distinctive features. This technique helps to increase the prediction accuracy of the designed model [95]. In this research, a single-hidden-layer ANN with a single output is used. The output of the ANN can be defined as

(4.7)

Where the weights are indexed by j = 0, 1, 2, …, n and i = 0, 1, 2, …, m (4.8), and the remaining terms are the bias value, the white noise term, and h, the activation function of the hidden layer.

The algorithm for ANN is shown below. The important steps to train the network are

defined below:

Algorithm: Artificial Neural Network (ANN)

1 Start

2 Initialize ANN and define the basic feature as input/training data (T-Data),

Target (TR) and Neurons (N)

3 Set, Model-Net = Newff (T-Data, TR, N)

4 Model -Net.TrainParam.Epoch = 1000

5 Model -Net.Ratio.Training = 70%

6 Model -Net.Ratio.Testing = 15%

7 Model -Net.Ratio.Validation = 15%

8 Model -Net = Train (Model -Net, T-Data, TR)

9 Current Data = Feature of real-time data

10 Prediction = simulate (Model -Net, Current Data)

11 If Prediction = True

12 Results = Show predicted data

13 End

14 Return: Results in terms of prediction

15 End
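The forward computation of a single-hidden-layer, single-output network can be sketched as follows. This is an illustration only: it assumes the common form in which the output is a bias plus a weighted sum of hidden-unit activations, with a logistic activation function, and it does not reproduce the MATLAB training procedure listed above. The function names and the weight values are hypothetical.

```python
import math

def logistic(x):
    """Logistic (sigmoid) activation, assumed here for the hidden layer."""
    return 1.0 / (1.0 + math.exp(-x))

def ann_output(lags, hidden_weights, hidden_bias, output_weights, output_bias):
    """Forward pass: lagged inputs -> hidden layer -> single output."""
    hidden = [
        logistic(b + sum(w * x for w, x in zip(ws, lags)))
        for ws, b in zip(hidden_weights, hidden_bias)
    ]
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))

# Hypothetical network with three lagged inputs and two hidden units.
lags = [0.2, 0.4, 0.6]
hidden_weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
hidden_bias = [0.0, 0.0]
output_weights = [1.0, 3.0]
output_bias = 0.5
prediction = ann_output(lags, hidden_weights, hidden_bias, output_weights, output_bias)
```

With all hidden weights set to zero, each hidden unit outputs 0.5, so the prediction reduces to the output bias plus half the sum of the output weights. Training adjusts the weights so that the output tracks the target series.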

Hence, the proposed selection of forecasting algorithms for the hybrid forecasting

model consists of the following:

1. Auto Regressive Integrated Moving Average (ARIMA) model

2. Discrete Wavelet Transform (DWT)

3. Cuckoo Search (CS) Optimization algorithm

4. Artificial Neural Network (ANN)


The chapter is further divided into sections, each of which discusses one objective of the present study. The specific objectives of this research are as follows:

1. To model the time series data and forecast future values using ARIMA.

2. To determine how discrete wavelet transform resolves the difficulties in ARIMA modeling.

3. To develop a hybrid model of ARIMA and wavelet transform.

4. To analyze the performance and accuracy of the proposed algorithm.

The above-mentioned objectives are discussed in detail in the following sections.

4.2 TO MODEL THE TIME SERIES DATA AND FORECAST FUTURE

VALUES USING ARIMA

To achieve the first objective, time series data of electricity consumption collected from PSPCL, Punjab, India, is passed to the ARIMA model to generate forecasts of electricity consumption five years ahead, from 2018 to 2022. The main steps involved in the prediction are discussed below:

1. Stationarity check: First, the given data is checked for stationarity. If it is stationary, the data is passed directly to the next step. If it is not stationary, a differencing operation is performed and the result is checked for stationarity again. If the data is still non-stationary, differencing is repeated until the data becomes stationary. If differencing is applied d times, the order of integration of the ARIMA model is d [96].

Figure 4.4 (a) Stationary and (b) non-stationary series

Fig source: https://www.analyticsvidhya.com/blog/2018/09/non-stationary-time-series-python/

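2. ARMA modeling: The stationary data is passed to the ARMA time series model as follows. Assume that the value of the data at any time t is $z_t$, the previous p data values are $z_{t-1}, z_{t-2}, \ldots, z_{t-p}$, and the errors in the corresponding time periods are $n_t, n_{t-1}, \ldots, n_{t-q}$. The corresponding ARMA equation is given in (4.9) below:

$z_t = K + \vartheta_1 z_{t-1} + \cdots + \vartheta_p z_{t-p} + n_t + \theta_1 n_{t-1} + \cdots + \theta_q n_{t-q}$ (4.9)

In equation (4.9), $\vartheta_1, \ldots, \vartheta_p$ and $\theta_1, \ldots, \theta_q$ denote the coefficients of the autoregressive (AR) and moving average (MA) parts, so the time series model is expressed as ARMA(p, q). The procedure of ARIMA modeling is as follows: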

(a) Identify the order of the model: The order is identified by correlation analysis, through the behavior of the autocorrelation function (ACF) and the partial autocorrelation function (PACF) as discussed in equations (4.2) and (4.3), both of which are functions of the lag (delay). Three cases are considered:

i. If the ACF shows a sinusoidal decay and the PACF drops to zero after lag p, the model is a pure AR process of order p.

ii. If the ACF drops to zero after lag q and the PACF shows a sinusoidal decay, the model is an MA process of order q.

iii. If both the ACF and the PACF show a sinusoidal decay and approach zero only after some lags, the model is an ARMA process of order (p, q), and the corresponding ARIMA model has order (p, d, q).

(b) Estimation of model coefficients: The coefficients of the model are estimated following the Box-Jenkins scheme. Gaussian maximum likelihood estimation (GMLE) is usually used to estimate the parameters of the ARIMA model. After the parameters have been estimated from the time series, the model must be validated. This diagnostic check is based on analysis of the residual (error) series. The model can be tested by evaluating the ACF of the residual series and checking whether the values lie within the 99% confidence interval. Other tests that do not use the residual ACF can also be performed to validate the model, such as the Ljung-Box test, and several criteria are available for assessing the fit of the model, including the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) [96].

3. Data forecasting: Once the adequacy of the model has been confirmed, it is used to forecast the time series. Future values are calculated from the estimated model parameters and the available time series values. The forecasts of the differenced data must then be integrated (summed) to return forecasts on the scale of the raw data. For this reason the model is called the Auto-Regressive Integrated Moving Average (ARIMA) model, and it is used for forecasting linear time series [97].
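The three steps above (differencing, fitting on the stationary series, and integrating the forecasts back) can be sketched with a deliberately small model. The example assumes first-order differencing (d = 1) and an AR(1) model fitted by ordinary least squares, i.e. an ARIMA(1, 1, 0) model; a full Box-Jenkins procedure with order selection and maximum likelihood estimation is not shown. The function names and the sample series are hypothetical.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def fit_ar1(x):
    """Least-squares estimates (c, phi) for x_t = c + phi * x_{t-1} + e_t."""
    prev, curr = x[:-1], x[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = sxy / sxx
    return mean_curr - phi * mean_prev, phi

def forecast_arima_110(series, steps):
    """Forecast `steps` values ahead with AR(1) on the first differences."""
    diffs = difference(series, 1)
    c, phi = fit_ar1(diffs)
    level, last_diff, out = series[-1], diffs[-1], []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # forecast the next difference
        level += last_diff               # integrate back to the raw scale
        out.append(level)
    return out

# Hypothetical series whose differences follow x_t = 1 + 0.5 * x_{t-1} exactly.
series = [10.0, 10.0, 11.0, 12.5, 14.25, 16.125, 18.0625]
c_hat, phi_hat = fit_ar1(difference(series))
forecasts = forecast_arima_110(series, steps=2)
```

Because the sample series is noise-free, the fit recovers the generating coefficients, which makes it easy to check each step by hand before applying the same idea to real data.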


4.3 TO DETERMINE HOW DISCRETE WAVELET TRANSFORM

RESOLVES THE DIFFICULTIES IN ARIMA MODELING

In the previous section, the working and overall architecture of the time series and ARIMA model were discussed. This section discusses how DWT resolves the difficulties of the ARIMA model in forecasting. A forecasting model is developed by combining the two techniques, ARIMA and DWT, and its accuracy is compared with that of a forecasting model developed using ARIMA alone. The difficulties of the ARIMA model in forecasting, and how DWT resolves them, are discussed in detail as follows:

4.3.1 Difficulties that were resolved by combining ARIMA model with DWT

Difficulty 1: ARIMA does not involve a decomposition technique

Explanation: The main aim of analyzing time series data is to establish a forecast model that can predict future values based on past experience. Because the particular nature of a time series is difficult to evaluate, generating adequate forecasts is often considered challenging. Various predictive models based on ARIMA and ANN have been introduced in existing works. The ARIMA forecasting model is well known for its predictive accuracy and its flexibility in representing various types of time series. However, a significant limitation is the assumed linear form of the data, which makes ARIMA models unsuitable for modeling complex nonlinear time series. To overcome this difficulty, DWT is used to divide a large time series dataset into two sub-parts: detail (linear) and approximation (non-linear) [79].

Difficulty 2: Noise from datasets cannot be removed by the ARIMA model

Explanation: Wavelet analysis can filter noisy signals, i.e., identify the trend and the fluctuations of the data in the time series. Wavelet decomposition and reconstruction reduce the non-stationarity of the time series data and thus improve the prediction accuracy. Wavelet decomposition is applied as a de-noising technique [98].

Difficulty 3: The cyclicality of the time series data cannot be reduced by ARIMA

Explanation: The cyclicality of the time series data can be reduced by decomposing the data into a high-frequency segment and a low-frequency segment [99], because frequency is the rate at which the data changes. The high-frequency segment changes rapidly over time, whereas the low-frequency segment changes slowly.

Difficulty 4: The ARIMA model produces limited accuracy on small datasets.

Explanation: ARIMA produces accurate forecasts only for a short time period. A further drawback of the ARIMA model is that it needs a minimum of 50, and preferably 100 or more, observations [97]. DWT, in contrast, does not need a large dataset; it can model individual stationary processes as well as components [100].

Difficulty 5: There is no automatic updating feature. As new data become available, the entire modeling procedure must be repeated, and a different ARIMA model must be built from scratch for each new dataset.

Explanation: Unlike ARIMA modeling, DWT does not require the forecasting model to be rebuilt from scratch for the updated dataset [101].

Difficulty 6: Estimation of the parameters p, d, and q is a time-consuming process in ARIMA (p, d, q).

Explanation: An ARIMA (p, d, q) forecasting model requires the parameters (p, d, q) to be estimated. In contrast, DWT does not require any kind of parameter estimation [102].

Difficulty 7: Extreme variations and high-frequency fluctuations degrade the ARIMA model.

Explanation: Extreme variations and high-frequency fluctuations increase the risk of error and information loss when forecasting future data. Decomposing the time series data into low- and high-frequency components using DWT makes it possible to recover the original time-domain signal without losing information [103].

Figure 4.5 Forecasting model developed by combining ARIMA and DWT (time series data → discrete wavelet transform → lower limit (LL) and upper limit (UL) → autoregressive and moving average → forecasted value)

This forecasting model combines wavelets such as HAAR and Daubechies with the ARIMA model. Initially, the wavelet transform analyzes the time series data, and the output of this phase is passed to the second phase, i.e., the ARIMA model, to convert nonlinear data into linear data, as shown in figure 4.5.

4.3.2 Use of DWT in proposed hybrid model

Time-frequency domain analysis is well suited to forecasting non-stationary processes, in which the mean and autocorrelation are not stable over time. The electricity data is non-stationary, as consumption varies continuously over time. Therefore, DWT is a suitable way to represent this type of data. The wavelet function generated from a single function is given by equation (4.10).

(4.10)

c Scaling factor used to control the compression of the wavelet

d Translation parameter used to locate the wavelet in time

One of the following two conditions holds. If |c| < 1, the wavelet is a compressed version of the mother wavelet and corresponds to higher frequencies (multiple cycles in a given time). On the other hand, if |c| > 1, the time width of the function is greater than that of the mother wavelet, which corresponds to lower frequencies.

Thus, DWT is a time series analysis tool that analyzes time series data using wavelets and generates a discrete signal. It is based on sub-band coding and provides a fast computation of the wavelet transform. The discrete wavelet function can be expressed by equation (4.11).

(4.11)
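For reference, the textbook forms of the continuous and dyadic discrete wavelet families are given below. They are stated under the assumption that equations (4.10) and (4.11) follow the usual convention, with c as the scale and d as the translation; the symbols $\psi$, m, and n are introduced here for illustration and may differ from the notation of the original equations.

```latex
% Continuous wavelet family generated from the mother wavelet \psi
\psi_{c,d}(t) = \frac{1}{\sqrt{|c|}}\,\psi\!\left(\frac{t-d}{c}\right), \qquad c \neq 0

% Dyadic discretization: c = 2^{m}, d = n\,2^{m}, with integers m and n
\psi_{m,n}(t) = 2^{-m/2}\,\psi\!\left(2^{-m}t - n\right)
```

Substituting $c = 2^{m}$ and $d = n\,2^{m}$ into the first expression gives the second, which is why the discrete transform only needs scales that are powers of two and translations that are integer multiples of the scale.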

4.3.3 Use of ARIMA model in proposed work

The ARIMA model was designed by combining the AR and MA models, following the approach of Box and Jenkins. Mathematically, the AR model can be written as equation (4.12).

$z_t = K + \vartheta_1 z_{t-1} + \vartheta_2 z_{t-2} + \cdots + \vartheta_m z_{t-m} + n_t$ (4.12)

Where $\vartheta_1, \vartheta_2, \ldots, \vartheta_m$ are the coefficients at the first, second, and last lag respectively,

K Constant

$n_t$ White Gaussian noise

The values obtained from equation (4.12) are used for the prediction of the next data point. The MA model can be presented mathematically by equation (4.13).

(4.13)

The MA model is a regression on the lagged error values. For data decomposition, an MA filter is used, and the equation for the f-th order MA filter is given by equation (4.14).

$M_1(t) = \frac{1}{f}\sum_{j=-i}^{i} M(t+j)$ (4.14)

Where f = 2i + 1. The filter averages the series over a window of i observations on either side of time t to estimate the trend component M1. What remains after removing the trend, M2 = M − M1, behaves approximately like random noise. Here M2 represents the residual part, M1 signifies the trend part, and M represents the original series [104]. The result from the MA filter was combined with the AR output to form the ARIMA model. Mathematically, it is represented as follows:

(4.15)
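The role of the MA filter in separating trend and residual can be illustrated with a centered moving average. This is a minimal sketch, assuming that the filter of order f = 2i + 1 is the plain average of the i observations on each side of the current point and that the residual is the original series minus the trend; the function name and the sample values are hypothetical.

```python
def ma_filter(series, i):
    """Centered moving average of order f = 2*i + 1 (edges are dropped)."""
    f = 2 * i + 1
    trend = []
    for t in range(i, len(series) - i):
        window = series[t - i:t + i + 1]
        trend.append(sum(window) / f)
    return trend

# Hypothetical series M, its trend part M1, and residual part M2 = M - M1.
M = [2.0, 4.0, 6.0, 5.0, 7.0, 9.0, 8.0]
M1 = ma_filter(M, i=1)
M2 = [m - m1 for m, m1 in zip(M[1:-1], M1)]
```

With i = 1 the filter order is 3, so the first and last observations have no trend estimate and the residual is computed only for the interior points.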

The performance parameters of the ARIMA model are represented as

.

4.3.4 DWT and ARIMA Model

Figure 4.6 DWT and ARIMA model (discrete wavelet transform → lower limit (LL) (a1, a2, a3, …, an) and upper limit (UL) (b1, b2, b3, …, bn) → ARIMA)

Figure 4.7 shows the wavelet analysis of the time series data, in which data from previous years obtained from the Punjab electricity board is taken as the input signal 'Si'. The low-pass filter passes the low-frequency part of the signal and blocks the high-frequency part. Autoregression is represented by the AR model and the moving average by the MA model, and 'd' signifies the differencing applied to the lagged values from previous years.

For predicting future values based on past experience, three types of processes are followed:

(a) AR process

(b) Differencing in value

(c) MA part

This ARIMA model has been used to forecast energy consumption. However, the developed model can also be used in other applications, such as forecasting weather data, climatic conditions, and so on.

Figure 4.7 Discrete wavelet transform output (input Si → low-pass filter → down-sampling of lower values → lower range (LL); input Si → high-pass filter → down-sampling of higher values → higher range (UL))

The outputs of the filters are down-sampled to reduce the data to a smaller size. One block converts the lower values to attributes a1, a2, a3, …, an. Similarly, the other block converts the higher values to attributes b1, b2, b3, …, bn. Thus, the lower and upper ranges of the time series data are obtained in compact form. The linear time series data obtained as DWT output was fed to the ARIMA model. The ARIMA model was applied in the following steps:

1. The raw time series data is applied to the DWT unit. DWT processes the input signal to set the range of the predicted values.

2. The coefficients b1, b2, b3, …, bn represent the upper limit of the time series data (TSD) and the coefficients a1, a2, a3, …, an represent the lower limit of the TSD.

3. The DWT output is fed directly to the ARIMA model.

4. The regression value is calculated from equation (4.12) and the MA value is calculated from equation (4.13). These equations give AR (m) and MA (r). In addition, the MA filter extracts the trend part and removes it from the original TSD. The filtered values are determined using equation (4.14).

5. The MA and regression values are combined using the ARIMA model.

6. Finally, the forecasted values are computed.

7. The performance parameters are computed.

4.3.5 HAAR Wavelet

The main property of the wavelet transform is that it converts non-stationary time series data into series that are more stationary than the original sequence. It is built from orthogonal basis signals. The HAAR wavelet belongs to a family of functions and is a single rectangular wave supported on A ∈ [0, 1].

Because the integral of the HAAR wavelet $\psi(A)$ is zero and its Fourier transform $\Psi(\omega)$ has only a first-order zero at ω = 0, $\psi(A)$ generates the simplest orthonormal wavelet family in the multi-resolution system: it is orthogonal not only to its dilations $\psi(2^i A)$ but also to its own integer translates. Mathematically, the HAAR wavelet function can be described by equation (4.16) below:

$\psi(A) = \begin{cases} 1, & 0 \le A < 0.5 \\ -1, & 0.5 \le A < 1 \\ 0, & \text{otherwise} \end{cases}$ (4.16)

The HAAR wavelet is equal to 1 on the interval [0, 0.5), equal to −1 on the interval [0.5, 1), and 0 elsewhere. Because the positive and negative parts of the function have the same absolute value, applying it to the sequence captures the rapid, short-period fluctuations of the original data in the detail coefficients, leaving a series without frequent shocks [105]. This is the principle of using the HAAR wavelet to reduce data in a time series.

4.3.6 The requirement of DWT in time series forecasting

Using DWT, the upper and lower bounds of the datasets can be easily determined. Wavelet decomposition is combined with time series models as a pre-processing technique: it breaks the time series down into approximation and detail components, so that a separate forecasting model can be applied to each component. In this approach, the data is decomposed by the wavelet transform into four series, each of which is predicted using ARIMA, and the inverse wavelet transform then combines the predicted values. In the literature, the DWT decomposition mechanism has proved helpful in various areas of forecasting, such as oil price prediction, stock price estimation, wind speed prediction, and load price forecasting. The wavelet decomposition extracts low- and high-frequency components from the original data and allows each component to be analyzed easily. ARIMA, GARCH, and wavelet decomposition are linear models, and ANN is a non-linear model. Combining these models has been reported to increase forecasting power [106]. For this reason, the outcomes in this research work are analyzed in three phases: first, applying only the ARIMA forecasting model; then integrating ARIMA with DWT; and finally applying the proposed hybrid model, which combines ARIMA, DWT, CS, and ANN. Applying DWT with ARIMA improves forecasting performance compared with the ARIMA model alone. The validity of this approach, in which the decomposition property of DWT improves forecasting performance, is supported by the related existing work listed in table 4.2.

Table 4.2 Existing DWT based work

References    Proposed Work

[107]

Future market prices were predicted using a wavelet decomposition approach, based on prior values of the West Texas Intermediate spot market. The test results show a close relationship between the forecasted price and the actual price.

[108]

The researchers designed a forecasting model for predicting electricity prices using wavelet decomposition together with ARIMA and GARCH. All three are linear models, and the results show a close relationship between the forecasted and actual values.

[109]

Wavelet decomposition along with the ANN approach was used for predicting solar radiation from the year 1981 to 2001. The proposed work performed better than the existing state-of-the-art approaches.

[110]

Wavelet decomposition, wavelet packet decomposition, and ANN were used to predict wind speed. The designed model was compared with the existing ARIMA, ARIMA with ANN, and Neuro-Fuzzy models, and the wavelet packet Broyden-Fletcher-Goldfarb-Shanno model provided better forecasting results.

[111]

Wavelet decomposition, wavelet packet decomposition, and a neural network were applied to predict wind speed. The outcomes were compared with existing ARIMA, ARIMA with ANN, and Neuro-Fuzzy models. The presented wavelet packet Broyden-Fletcher-Goldfarb-Shanno model produced improved forecasting outcomes.

[112]

Wavelet decomposition was integrated with a neural network to estimate the Mackey-Glass time series and sunspot data. The predictions showed improved accuracy in contrast to previous models.

[113]

Fuzzy wavelet decomposition was implemented for the prediction of IBM daily prices and daily index values. In this work, wavelet decomposition produced better results when de-noising was applied to remove the noise.

[114]

The authors used the wavelet transform along with the ARIMA forecasting model and a radial basis function neural network (RBFN) for the prediction of electricity price. The price behavior of electricity was treated as a non-linear function that needs a non-linear model to capture it. The price data was decomposed by wavelet decomposition into four sub-parts, and each decomposed series was modeled with a different ARIMA model. After that, the inverse wavelet transform was applied, and the RBFN network was used to correct the errors of the wavelet-ARIMA predictor. Compared with the latest price prediction approaches, the work provided a significant improvement.

[115]

An adaptive wavelet neural network was used for short-term (ST) price forecasting in the electricity market. The proposed model was first introduced as an alternative to a traditional FFBPNN for estimating arbitrary non-linear functions. In this work, the wavelet-based neural network improved the performance.

As shown in the table, the performance of the work is improved by combining a decomposition approach with different forecasting models. On the basis of the above discussion, it can therefore be concluded that the DWT technique reduces the error and improves the performance of the work. After the DWT approach is applied, the lower and upper bounds of the related dataset are obtained, which makes it easier for researchers to estimate future electricity consumption. DWT also helps to pre-process large datasets without additional filtering approaches.


4.4 TO DEVELOP A HYBRID MODEL OF ARIMA AND WAVELET

TRANSFORM

To estimate future electricity consumption, the present work is carried out in the following steps. The datasets used for this purpose were obtained from PSPCL, India. The methodology of the proposed work is described in the steps below:

Step 1: This research utilizes data obtained from Punjab State Power Corporation Limited (PSPCL).

Step 2: The Discrete Wavelet Transform (DWT) is applied. DWT is a discretized form of the continuous wavelet transform (CWT) that breaks the time series data down over an integer number of data samples.

Step 3: DWT decomposes the complete range of electricity consumption into four distinct categories with different ranges, i.e., Lmin to Lmax (LL), Lmax to Hmin (LH), Hmin to Hmax (HH), and Lmax to Hmax (HL).

Step 4: An ARIMA model is applied individually to each of the decomposed low- and high-frequency components to obtain the corresponding time series data.

Step 5: After the ARIMA models have been applied individually, the Moving Average and Autoregressive analyses are combined using the integrated (I) element of ARIMA on the time series data.

Step 6: The inverse DWT (IDWT) is applied to the outcomes of these four distinct ARIMA models to combine them. This builds a well-structured record that is used for training the proposed model.

Step 7: At the classification phase, the ANN classifier requires highly distinctive and accurate data for training. To obtain the best record of the previous month's electricity consumption, the CS optimization strategy with a novel fitness function is applied.

Step 8: The more distinctive the data record, the easier and faster the training of the system becomes. The data record obtained through CS optimization is therefore passed to the classification phase, i.e., the ANN, as the training set. Finally, the ANN is trained and used to forecast the electricity consumption of the upcoming month.

4.5 TO ANALYZE THE PERFORMANCE AND ACCURACY OF THE

PROPOSED ALGORITHM

To achieve the last objective, i.e., analysis of the performance of the proposed work, all experiments were conducted in MATLAB using the real data obtained from PSPCL. The datasets contain the electricity consumption of previous years. These datasets were decomposed into four frequency ranges (Lmin to Lmax, Lmax to Hmin, Hmin to Hmax, Lmax to Hmax) using DWT. To forecast future electricity consumption, the actual datasets need to be converted into time series data (TSD). For this purpose, an ARIMA model is applied individually to each of the decomposed datasets. The main aim of decomposing the datasets is to obtain information about the maximum and minimum utilization of electricity in previous years. The time series data obtained by applying the ARIMA model contains both linear and non-linear information about the samples. Previous research has proposed a variety of approaches, including artificial intelligence (AI) based methods such as SVM and ANN. The forecasting models designed using these techniques to estimate electricity demand were discussed in chapter 2. In the existing work, these techniques were used separately to predict future electricity consumption. In the present work, instead of using them separately, a hybrid model is proposed for predicting electricity consumption in the coming days. From the experiments discussed in Chapter 5, the accuracy obtained with the different approaches is as shown below.

i. Forecasting accuracy of the ARIMA model is 83.53%

ii. Forecasting accuracy of the hybrid model of ARIMA and Daubechies wavelet transform is 92.67%

iii. Forecasting accuracy of the hybrid model of ARIMA and HAAR wavelet transform is 93.76%

iv. Forecasting accuracy of the proposed hybrid model is 98.86%
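This section does not state how the accuracy percentages are computed. One common convention, assumed here purely for illustration, is to report accuracy as 100 minus the mean absolute percentage error (MAPE). The sketch below shows that calculation on hypothetical values that are not taken from the experiments.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical actual and forecasted values (not experimental results).
actual = [100.0, 200.0, 400.0]
predicted = [90.0, 210.0, 400.0]

error_percent = mape(actual, predicted)   # (10% + 5% + 0%) / 3
accuracy_percent = 100.0 - error_percent  # assumed definition of accuracy
```

If a different error measure is used in Chapter 5, the same structure applies with the corresponding formula in place of `mape`.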


CHAPTER 5: RESULTS AND ANALYSIS

This research work focuses on the prediction of electricity consumption in a specific industrial region of Punjab, India. To forecast future electricity consumption, the actual electricity consumed by consumers in previous years is used. The dataset is taken from the power system of the public utility company Punjab State Power Corporation Limited (PSPCL) and covers January 2013 to December 2017. The dataset is presented below in tabular form as well as in the form of a graph.

Figure 5.1 Electricity consumed from January 2013 to December 2017

As per table 5.1 and figure 5.1, electricity consumption increases from year to year. In table 5.1, the lowest consumption occurs in April 2014 and the highest in July 2017. In the figure, the Y-axis represents the electricity consumed by consumers and the X-axis represents the months from January 2013 to December 2017. The average monthly consumption of electricity from January 2013 to December 2017 is 3464988567 kWh.


5.1 DATASET USED

The dataset described above consists of the actual monthly electricity consumption recorded by the public utility company Punjab State Power Corporation Limited (PSPCL) for the industrial region of Punjab, India, from January 2013 to December 2017. The values are listed in table 5.1.

Table 5.1 Dataset Used

Month (January 2013 to December 2017)   Original Electricity Consumption (kWh)

Jan-13 2402929222

Feb-13 2302632737

Mar-13 2302632737

Apr-13 2315442012

May-13 2763422169

Jun-13 3409077307

Jul-13 4283260815

Aug-13 4203919034

Sep-13 4320843017

Oct-13 3284669791

Nov-13 2660633086

Dec-13 2510323489

Jan-14 2406670480

Feb-14 2543659240

Mar-14 2262755935

Apr-14 2229496325


May-14 2663094360

Jun-14 4135881561

Jul-14 5048078068

Aug-14 5150618207

Sep-14 4184285964

Oct-14 3525215760

Nov-14 2914127320

Dec-14 2691580750

Jan-15 2425671683

Feb-15 2705001417

Mar-15 2387522949

Apr-15 2253062900

May-15 2884345926

Jun-15 3461754270

Jul-15 4895975328

Aug-15 5076894389

Sep-15 4867004448

Oct-15 3787014752

Nov-15 2804034438

Dec-15 2719630627

Jan-16 2567913778

Feb-16 2740311321

Mar-16 2942162298

Apr-16 2522244918

May-16 3253239971

Jun-16 4500446696


Jul-16 5018252044

Aug-16 4999539492

Sep-16 5404195697

Oct-16 3958725442

Nov-16 3130248433

Dec-16 2921627225

Jan-17 2902848056

Feb-17 2958390111

Mar-17 3209855845

Apr-17 2782373785

May-17 3705371880

Jun-17 4450434098

Jul-17 5810138389

Aug-17 5645879830

Sep-17 5273395962

Oct-17 4196118613

Nov-17 3191950404

Dec-17 3030487191

Table 5.1 gives the actual monthly consumption of electricity from January 2013 to December 2017. The average monthly consumption over this period is 3464988567 kWh.
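The summary figures for Table 5.1 can be checked with a few lines of code. The following is a minimal Python sketch and not part of the MATLAB simulation used in this work; the names `CONSUMPTION_KWH`, `LABELS`, and `summarize` are illustrative assumptions, and the values are copied from Table 5.1.

```python
# Monthly consumption in kWh, copied from Table 5.1 (Jan-13 to Dec-17).
CONSUMPTION_KWH = [
    2402929222, 2302632737, 2302632737, 2315442012, 2763422169, 3409077307,
    4283260815, 4203919034, 4320843017, 3284669791, 2660633086, 2510323489,
    2406670480, 2543659240, 2262755935, 2229496325, 2663094360, 4135881561,
    5048078068, 5150618207, 4184285964, 3525215760, 2914127320, 2691580750,
    2425671683, 2705001417, 2387522949, 2253062900, 2884345926, 3461754270,
    4895975328, 5076894389, 4867004448, 3787014752, 2804034438, 2719630627,
    2567913778, 2740311321, 2942162298, 2522244918, 3253239971, 4500446696,
    5018252044, 4999539492, 5404195697, 3958725442, 3130248433, 2921627225,
    2902848056, 2958390111, 3209855845, 2782373785, 3705371880, 4450434098,
    5810138389, 5645879830, 5273395962, 4196118613, 3191950404, 3030487191,
]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
# Labels in the same "Mon-YY" style as the table.
LABELS = [f"{month}-{year}" for year in range(13, 18) for month in MONTHS]


def summarize(values, labels):
    """Return the mean and the labels of the lowest and highest readings."""
    lowest = min(range(len(values)), key=values.__getitem__)
    highest = max(range(len(values)), key=values.__getitem__)
    return {
        "mean": sum(values) / len(values),
        "min_month": labels[lowest],
        "max_month": labels[highest],
    }


summary = summarize(CONSUMPTION_KWH, LABELS)
print(round(summary["mean"]), summary["min_month"], summary["max_month"])
```

Rounding the mean to the nearest kWh reproduces the average quoted for the dataset.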

5.2 RESULTS AND DISCUSSION

This section discusses the simulation work carried out in MATLAB to forecast electricity consumption using different techniques. ARIMA is one of the most widely used forecasting models in previous studies. Along with ARIMA, the other methods used are presented with their simulation diagrams.

Figure 5:2 Data panel

Figure 5.2 shows the data panel of the MATLAB simulator for the single ARIMA model. The interface is divided into two main regions. The upper region of the data panel contains five buttons that provide the functionality for the simulation work. The button names are self-descriptive, and the buttons are numbered in the order in which the work is done: (a) Select the dataset, (b) Convert to stationary, (c) Generate the hypothesis, (d) Model the ARIMA, and (e) Predict Next. The working of each button is discussed below in detail.


Figure 5:3 Upload Data

Figure 5.3 shows the uploaded dataset in graphical form. The data is taken from the electricity consumption of previous years. The X-axis of the graph represents the number of days and the Y-axis represents the consumption of electricity in kWh (kilowatt-hours). The peak points of the waves in the graph represent the average electricity consumed for the corresponding number of days. The figure shows that the consumption of electricity increases over the months.

After the dataset has been uploaded into the panel, it has to be converted into stationary form for prediction purposes. A time series can be either stationary or non-stationary, and the distinction is essential in time series forecasting. Stationary data is best suited for forecasting because its statistical characteristics stay fixed over the specified time interval.

Stationary data: This type of data has no upward or downward trend and no seasonal effects. The mean and variance of the data remain constant over time.

Non-stationary data: This type of data shows trends, seasonal effects, and other time-dependent structures, so the efficiency of forecasting depends on the observation time. As time increases, the mean and variance change, and these trends are captured in the model. In this forecasting-based research work, the data therefore has to be converted into stationary form.
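Conversion to stationary form is commonly done by differencing. The sketch below shows first-order and lagged differencing in Python. It illustrates the general technique under that assumption and is not the exact routine behind the "Convert to stationary" button; the function name `difference` is hypothetical.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag.

    lag=1 removes a linear trend; a lag equal to the season length
    (for example 12 for monthly data) removes a repeating seasonal pattern.
    """
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# A series with a steadily growing trend becomes smoother after differencing.
trend = [2, 5, 9, 14]
print(difference(trend))          # first-order differences
print(difference(trend, lag=2))   # differences two steps apart
```

The differenced series is shorter than the original by `lag` values, which has to be taken into account when predictions are mapped back to the original scale.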

Figure 5:4 Convert to stationary

Figure 5.4 shows the dataset after it has been converted into stationary form. The graph contains three curves in different colors: black, green, and red. The black curve shows the actual previously consumed electricity. The red curve shows the difference between the original series and the series converted into stationary form for the same consumption dataset.

Autocorrelation measures how strongly the values of a time series are related to earlier values of the same series. Inference drawn from the autocorrelation function is typically known as analysis in the time domain. The autocorrelation plot depicts the features of the data used for time series analysis. In particular, it shows whether the elements of a time series are positively correlated, negatively correlated, or independent of each other. The autocorrelation against lag for the different algorithms, namely ARIMA, CS and ANN, and the Proposed Hybrid Model, is depicted below.

Figure 5:5 Generated hypothesis

Figure 5.5 shows the generated hypothesis of the data based on autocorrelation. The vertical axis gives the value of the autocorrelation function. It can range from -1 to 1, but in our case it varies from -0.4 to 1. The horizontal axis gives the size of the lag between elements of the time series. Here the lag varies from 0 to 20. For example, at a lag of 2 the autocorrelation is computed between each element of the time series and the element observed two time periods earlier.
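The autocorrelation values plotted against lag can be computed as follows. This is a minimal Python sketch of the standard sample autocorrelation estimator, given as an illustration and not as the code of the simulator; the function name `sample_acf` is an assumption.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag.

    Each value is the lag-k autocovariance divided by the lag-0
    autocovariance, so the result at lag 0 is always 1.
    """
    n = len(series)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series)
    if c0 == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


# An alternating series is strongly negatively correlated at lag 1
# and positively correlated at lag 2.
acf = sample_acf([1.0, -1.0] * 10, max_lag=2)
print(acf)
```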


Figure 5:6 Original consumed and predicted electricity using ARIMA

Figure 5.6 shows the value of consumed electricity predicted by ARIMA along with the original value. The number of readings is plotted along the X-axis, ranging from 0 to 140, and the consumed and predicted electricity is plotted along the Y-axis. The red curve depicts the original consumed electricity, and the black curve depicts the predicted electricity over the same range of readings, which allows the performance of the presented model to be analyzed entry by entry. The predicted electricity does not differ much from the originally consumed electricity, so the accuracy is high in this case. In the next phase, the ARIMA forecasting model is used to predict future consumption.

Figure 5.7 gives the forecast value of consumed electricity obtained using the ARIMA model only.


Figure 5:7 Next Predicted electricity using ARIMA

After the ARIMA forecasting model had been applied to forecast future electricity consumption, optimization and validation techniques were applied in the next step.

Figure 5:8 Data panel of DWT - ARIMA

The data panel for applying the DWT with the ARIMA model is shown in figure 5.8. After the data has been selected, it is decomposed by applying a HAAR wavelet transformation. The further steps, namely conversion into stationary datasets, hypothesis generation, and modelling with the ARIMA forecasting model, are discussed below one after another.

Figure 5:9 Data uploading again

Figure 5.9 shows the result after the data has been uploaded again so that the next operations, such as decomposition of the dataset and obtaining the optimized values of the data, can be carried out.


Figure 5:10 Decomposition of data using Haar wavelet

Figure 5.10 shows the result of decomposing the data with the HAAR wavelet transformation. The DWT breaks the signal down into sub-bands of lower and higher frequencies using a low-pass filter (LPF) and a high-pass filter (HPF). The low-frequency and high-frequency parts are each decomposed further into lower and higher frequencies, which gives four sub-bands: LL, HL, LH, and HH. The four decomposed sub-bands are drawn in different colors in the graph: red, blue, black, and green. They are discussed below, along with their generated hypotheses and the electricity consumption predicted using the ARIMA forecasting model.
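The decomposition into four sub-bands can be sketched as two successive Haar filtering steps. The Python code below is an illustration only: it assumes an orthonormal Haar filter pair and a series length that is a multiple of 4, and the assignment of the names LH and HL to the two middle bands is an assumption, since the convention used by the simulator is not stated. The function names are hypothetical.

```python
import math


def haar_step(signal):
    """One Haar analysis step: returns (low-pass half, high-pass half)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def haar_subbands(signal):
    """Split a series into LL, LH, HL and HH by filtering both halves again."""
    if len(signal) % 4 != 0:
        raise ValueError("signal length must be a multiple of 4")
    low, high = haar_step(signal)
    ll, lh = haar_step(low)    # low band split into low and high
    hl, hh = haar_step(high)   # high band split into low and high
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}


# A constant series has all of its energy in the LL band.
bands = haar_subbands([4.0] * 8)
print(bands)
```

Each sub-band has a quarter of the original length, so a 60-month series would give four sub-bands of 15 values under these assumptions.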


Figure 5:11 Decomposed (LL) data converted into stationary

Figure 5.11 shows the decomposed Lmin to Lmax (LL) dataset converted into stationary form. The original and predicted consumption of electricity is plotted along the Y-axis of the graph, and the number of entries, ranging from 0 to 70, is plotted along the X-axis. The graph contains three curves in different colors: black, green, and red. The black curve shows the original consumption of electricity. The green curve shows the forecast electricity consumption. The red curve shows the rendered value lying between the original and the corresponding predicted values. As figure 5.11 shows, the predicted value is far from the original consumption of electricity, so the accuracy on the LL decomposed dataset is low. The results predicted from the other decomposed datasets are therefore used for prediction.


Figure 5:12 Generated hypothesis of LL decomposed datasets

The generated hypothesis for the LL decomposed dataset, based on sample autocorrelation and lag, is shown in figure 5.12. The lag is plotted along the X-axis and the sample autocorrelation along the Y-axis. The autocorrelation represents the relation between the originally consumed electricity and the hypothetical value of electricity. The black horizontal line represents the originally consumed electricity. The upper and lower colored lines represent the upper and lower limits of the hypothetical value. Each red vertical line represents the sample autocorrelation obtained for one lag, which can be negative or positive. More than half of the obtained autocorrelation values are far from the original electricity line.

The electricity predicted using LL ARIMA is shown in figure 5.13, in which the number of readings is plotted along the X-axis and the original and predicted values of electricity are plotted vertically. The two curves in the graph, red and black, depict the original and the predicted values of consumed electricity.


Figure 5:13 Original consumed and predicted electricity using LL ARIMA

The electricity axis ranges from (-6 to 8) x 10^9 and the number of readings varies from 0 to 30. The predicted value of electricity is close to the original consumption of electricity. The highest and lowest variation between original and predicted electricity correspond to the 15th and 19th readings. The lower the variation between original and predicted electricity, the better the performance.

Figure 5:14 Decomposed (LH) data converted into stationary

Figure 5.14 shows the decomposed Lmax to Hmin (LH) dataset converted into stationary form. The original and predicted consumption of electricity is plotted vertically, ranging from (-1.5 to 1) x 10^5, against the number of entries. The original consumed, predicted, and rendered electricity are represented by three curves in black, green, and red. As the number of entries increases, the consumed, predicted, and rendered values of electricity become approximately equal.

[Y-axis label: Sample Autocorrelation]

Figure 5:15 Generated hypothesis of LH decomposed datasets

Figure 5.15 depicts the generated hypothesis of the LH decomposed dataset based on the obtained autocorrelation. The black horizontal line depicts the originally consumed electricity. The blue line represents the correlation value corresponding to each value, and a red vertical line is drawn at every lag. In the figure, some of the autocorrelation values coincide with the original consumed electricity value, while others are positive (above the original value) or negative (below it).

[Y-axis label: Predictions]

Figure 5:16 Original consumed and predicted electricity using LH ARIMA

Figure 5.16 shows the original and predicted electricity using LH ARIMA. The largest variations between the predicted and original values occur at the 16th and 14th readings.

Figure 5:17 Decomposed (HL) data converted into stationary

The decomposed HL dataset converted into stationary form is shown in figure 5.17. The consumed, predicted, and rendered data are represented by black, green, and red curves. The difference between these three values of electricity is highest at the 120th entry, where both the predicted and the rendered electricity values are negative.

Figure 5:18 Generated hypothesis of HL decomposed datasets

Figure 5.18 shows the generated hypothesis of the HL decomposed dataset based on lag and the corresponding autocorrelation value. Almost all of the obtained autocorrelation values coincide with the originally consumed electricity. Some autocorrelation values are negative and some are positive.


Figure 5:19 Original consumed and predicted electricity using HL ARIMA

The originally consumed electricity as well as the predicted value of electricity is shown in

figure 5.19. The highest and lowest variation among predicted and original

consumed electricity corresponds to the 29th and number of readings.

Figure 5:20 Decomposed (HH) data converted into stationary

Figure 5.20 shows the decomposed HH data converted into stationary form. The consumption of energy is plotted along the vertical axis and the number of entries along the horizontal axis. The graph shows black, green, and red curves that represent the consumed data, the rendered value, and the predicted electricity consumption, respectively.

Figure 5:21 Generated hypothesis of HH decomposed datasets

Figure 5.21 shows the generated hypothesis of the HH decomposed dataset. The hypothesis is based on the sample autocorrelation, plotted along the Y-axis against the lag along the X-axis. The horizontal black line depicts the originally consumed electricity. The red vertical lines, together with the dots that overlap the black horizontal line, represent the value of the autocorrelation at the corresponding lag. Almost all of the autocorrelation values overlap the originally consumed electricity, so the generated hypothesis is almost equal to the original electricity value. It is evident from figure 5.21 that the accuracy of this sub-band is higher than that of the other three sub-bands, LL, LH, and HL.

The original and predicted values of electricity are given in figure 5.22. As the figure shows, both the actual consumed and the predicted electricity range from (-6 to +6) x 10^-7 along the Y-axis.

[Plot residue: Sample Autocorrelation (HH) on the vertical axis, from -0.8 to 1, against Lag on the horizontal axis, from 0 to 20]

Figure 5:22 Original consumed and predicted electricity using HH ARIMA

Figure 5.22 presents the original consumed and predicted electricity using HH ARIMA. The red curve represents the consumed electricity and the black curve depicts the predicted value of electricity. The number of readings, plotted along the X-axis, varies from 0 to 30.

Figure 5:23 Inverse-DWT

After the individual ARIMA models have been applied to the decomposed sub-bands LL, LH, HL, and HH, the separate ARIMA models are combined using the Inverse-DWT, as shown in Figure 5.23. In figure 5.24, the applied Inverse-DWT is depicted by combining the ARIMA forecasting models to give the next predicted electricity value. The forecast value of electricity obtained by combining the four decomposed sub-bands is plotted in a graph in which the predicted electricity value lies along the Y-axis in the range (0 to 16) x 10^9 and the number of entries along the X-axis varies from 0 to 60. The highest and lowest obtained values of electricity correspond to the 29th and 31st entries, respectively.
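The recombination of the four sub-bands can be illustrated with the inverse of the Haar step. The Python sketch below assumes the same orthonormal two-step Haar decomposition as an illustration and is not the simulator's code; all function names are hypothetical. It only demonstrates that the four sub-bands reconstruct the original series. In the hybrid scheme, the per-band ARIMA forecasts would take the place of the sub-band values.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(signal):
    """Forward Haar step: (low-pass half, high-pass half)."""
    low = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return low, high


def inverse_haar_step(low, high):
    """Inverse Haar step: interleave the values recovered from each pair."""
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def reconstruct(ll, lh, hl, hh):
    """Combine four sub-bands back into one series (two inverse steps)."""
    low = inverse_haar_step(ll, lh)
    high = inverse_haar_step(hl, hh)
    return inverse_haar_step(low, high)


# Round trip on the 2013 consumption values from Table 5.1 (12 readings).
original = [2402929222, 2302632737, 2302632737, 2315442012,
            2763422169, 3409077307, 4283260815, 4203919034,
            4320843017, 3284669791, 2660633086, 2510323489]
low, high = haar_step(original)
ll, lh = haar_step(low)
hl, hh = haar_step(high)
restored = reconstruct(ll, lh, hl, hh)
```

Because the inverse transform is linear, an error in one sub-band forecast carries over to the combined forecast in proportion to its size.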

Figure 5:24 Data Panel for Daubechies Wavelet

The Daubechies wavelet approach has also been used to decompose the data. Daubechies wavelets are a family of orthogonal wavelets with the maximum number of vanishing moments for a given support, and they are defined through a scaling function, also known as the father wavelet. The main advantage of segmentation is that it reduces the error rate when performing the predictions.


Figure 5:25 Segmentation done using Daubechies Wavelet Transform

The data segmented using the Daubechies wavelet transform is shown in Figure 5.25. With this approach, the time series data is divided into a number of small time frames, each comprising similar features. The segmentation has been performed linearly, which reduces the loss of data by preventing small values from being overlapped by large values.

Figure 5:26 Differencing Applied on non-stationary data using Daubechies for LL

Differencing is applied to non-stationary time series data to convert it into stationary time series data. Among the four decomposed sub-parts of the Daubechies wavelet, the differenced value of LL is shown in figure 5.26. The differenced value of LL is plotted along the y-axis against the number of entries on the x-axis. There is a certain amount of difference between the consumed, rendered, and adjacent rendered values of consumed electricity.

Figure 5:27 Differencing Applied on non-stationary data using Daubechies for LH

The case when the differencing is applied to the LH part of the non-stationary data is

depicted in Figure 5.27. The rendered data is the stationary data obtained after

differencing is applied on the 4th segment of the original data.

Figure 5:28 Differencing Applied on non-stationary data using Daubechies for HL

The result of applying differencing to the HL part of the non-stationary data is shown in Figure 5.28. Here the time series is converted into a stationary one using the difference. Differencing is a way to turn non-stationary time series data into a stationary series. It is a significant step that must be carried out when preparing data before the ARIMA model is applied.

Figure 5:29 Differencing Applied on non-stationary data using Daubechies for HH

Consumption data is the data that is evaluated after the application of the DWT. HH represents a higher minimum and higher maximum value. Figure 5.29 shows the differenced consumed electricity obtained by applying Daubechies for HH to the non-stationary data. The curves of consumed, rendered, and adjacent rendered electricity show that there is no significant difference among them. The rendered plot is obtained after removal of the linear trend from HH. The same goes for LH, LL, and HL.


Figure 5:30 LL-Autocorrelation plot for Daubechies

An autocorrelation plot for the LL band of the Daubechies approach is shown in Figure 5.30. The graph represents the time series relationship of the uploaded data. A vertical line is drawn for each lag, and its height represents the value of the autocorrelation obtained for that lag. The autocorrelation at lag zero is always equal to 1 because it is the autocorrelation of each term with itself.


Figure 5:31 LH-Autocorrelation plot for Daubechies

The autocorrelation graphs for LL, LH, HL, and HH at different lags are shown in Figure 5.30, Figure 5.31, Figure 5.32, and Figure 5.33 respectively. Values close to 0 represent weaker correlation than values above 0.5. A boundary line has been set, and the values that fall inside it are considered for the training process. With this approach, the training process becomes more accurate, which increases the prediction capability of the designed model.


Figure 5:32 HL-Autocorrelation plot for Daubechies

The sample autocorrelation determined for the HL Daubechies approach lies between -0.5 and +0.7, as shown in Figure 5.32.

Figure 5:33 HH-Autocorrelation plot for Daubechies

In figure 5.33, the HH autocorrelation is plotted for Daubechies. The sample autocorrelation obtained for the different lags ranges between -0.5 and +0.5.


Figure 5:34 ARIMA applied on LL segment of Daubechies for segment-based predictions

The prediction graphs obtained from the ARIMA model designed with the Daubechies wavelet for the LL, LH, HL, and HH segments are shown in Figure 5.34, Figure 5.35, Figure 5.36, and Figure 5.37 respectively.

Figure 5:35 ARIMA applied on LH segment of Daubechies for segment-based predictions

Figure 5.35 shows the ARIMA model applied to the LH segment of Daubechies for segment-based predictions. The prediction is plotted vertically against the number of readings.


Figure 5:36 ARIMA applied on HL segment of Daubechies for segment-based predictions

Figure 5.36 depicts the segment-based predictions obtained by applying Daubechies to the HL segment. In this graph, the predictions are plotted vertically and the number of readings horizontally.

Figure 5:37 ARIMA applied on HH segment of Daubechies for segment-based predictions

Figure 5.37 shows the segment-based predictions obtained by applying the ARIMA model to the HH segment. The black and red curves depict the ARIMA prediction and the original consumption of electricity, respectively.


Figure 5:38 Segmentation of Time Series Data using HAAR decomposition fed to cuckoo search and further to NN

The previous experimental results show that the HAAR wavelet transform performed best. For accurate forecasting results, the cuckoo search (CS) optimization algorithm is therefore applied to optimize the HAAR wavelet transform. Figure 5.38 represents the segmentation of the time series data using the decomposition technique, after which the data is fed to CS optimization and then to a neural network (NN). The four decomposed segments are represented in different colors: red (LL), blue (LH), black (HL), and green (HH).

ANN is an efficient and successful alternative to ARIMA models for predicting time series relations with distinctive features.

h hidden layer activation function of hidden layer

Algorithm: Artificial Neural Network (ANN)

Start
Initialize ANN and define the basic features as input/training data (T-Data), Target (TR) and Neurons (N)
Set Model-Net = Newff (T-Data, TR, N)
Model-Net.TrainParam.Epoch = 1000
Model-Net.Ratio.Training = 70%
Model-Net.Ratio.Testing = 15%
Model-Net.Ratio.Validation = 15%
Model-Net = Train (Model-Net, T-Data, TR)
Current Data = Feature of real-time data
Prediction = simulate (Model-Net, Current Data)
If Prediction = True
    Results = Show predicted data
End
Return: Results in terms of prediction
End
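The 70% / 15% / 15% division of the data in the algorithm above can be sketched as a random split of sample indices. The Python code below is an illustration under the assumption that the division is a simple random partition; the function name `divide_indices` and the fixed seed are hypothetical and are not taken from the MATLAB toolbox.

```python
import random


def divide_indices(n_samples, train_pct=70, val_pct=15, seed=0):
    """Randomly partition sample indices into training, validation and test sets.

    Integer arithmetic keeps the set sizes exact; whatever remains after the
    training and validation shares goes to the test set.
    """
    if train_pct + val_pct > 100:
        raise ValueError("training and validation shares exceed 100%")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# For the 60 monthly readings of the dataset.
train, val, test = divide_indices(60)
print(len(train), len(val), len(test))
```

Note that a random split ignores the time order of the readings, which is a design choice that matters for time series data.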

Some important steps used to train the network and obtain the result are defined below one after another:

i. Design of network structure: The structural design of the network includes an input layer, hidden layers, and an output layer.

ii. Number of hidden layers: The hidden layer is used to analyze the problem and provide the best solution for that particular problem. Theoretically, a hidden layer with a given total number of neurons is used for a particular task. In most cases, more than one hidden layer is used to get an efficient result.

iii. Number of output nodes: The number of output nodes usually depends on the number of input nodes.

iv. Evaluation criteria: The most commonly used error function, which is easy to compute, is the sum of squared errors. A few other error functions can be obtained by different methods, such as the least absolute deviation.

v. Training of the Neural Network (NN): To train a neural network, it is necessary for it to learn various patterns of data. The training data must include accurate solutions to produce an effective learning process. The network is trained to reach the global optimum by adjusting the large set of weights among neurons.


Figure 5:39 Training Structure of ANN

The training process of load forecasting using the time series data (TSD) taken from PSPCL is shown in figure 5.39. The neural network (NN) training tool shown in the figure is composed of four panels from top to bottom: (a) Neural Network, (b) Algorithms, (c) Progress, and (d) Plots. All these panels are explained below in detail. In the training structure of the ANN, the neural network is composed of three units: the input layer, followed by the hidden layers, and finally the output layer. According to the figure, only one neuron is passed to the input layer; 19 neurons are added in the hidden layer, producing 20 neurons. These 20 neurons are transferred to the output layer, where they are reduced through the weight value (w) and bias value (b), so that the total number of neurons obtained at the output layer is 1.

The algorithms used by the ANN for training are listed in the Algorithms section. The data is divided randomly, and the training algorithm is Levenberg-Marquardt. The performance of this training algorithm is measured in terms of Mean Square Error (MSE), and the calculations are done in MEX form. The training progress is shown in the Progress panel and is measured by the following parameters: epoch (3 iterations), time, performance, gradient, the Levenberg-Marquardt damping parameter (mu), and validation checks. When any one of these parameters reaches its stopping condition, the training of the system ends.

Figure 5:40 Performance

The performance of the training algorithm, measured in terms of Mean Square Error (MSE), is given in figure 5.40. The training process on the load forecasting data is completed at the 1st iteration. In this figure, the x-axis shows the epochs, from 0 to 3, and the y-axis shows the MSE, which varies from 10^-2 to 10^5. The graph contains four lines in distinct styles, blue, green, red, and dotted, which denote the training, validation, test, and best values of the ANN respectively. The best validation performance, 1816.7979, is obtained at epoch 1.

Figure 5:41 Training State

The training state of the ANN classifier is shown in graphical form in Figure 5.41. The waveforms are obtained after the completion of the training process and show the gradient, mu, and validation checks, whose values are 5.035e-08, 1e-06, and 2 respectively, for a maximum of 3 epochs.


Figure 5:42 Regression

The regression parameter is used to validate the performance of the proposed work, i.e., forecasting the future electricity consumption of Punjab state. The graph in Figure 5.42 shows the network outputs against the targets for the training, validation, and test data. The regression value obtained for training is 0.22608. For validation and testing, the regression values obtained are 0.006307 and 0 respectively. The overall regression value is 0.1426, from which it is concluded that the presented solution provides a better architecture of the training set and is best suited for the classification.
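The regression value is understood here as the correlation coefficient R between the network outputs and the targets; that reading is an assumption about the training tool's plot and is not stated explicitly in this work. Under that assumption, R can be computed with the minimal Python sketch below, where the function name `regression_r` is hypothetical.

```python
import math


def regression_r(targets, outputs):
    """Pearson correlation coefficient between targets and network outputs.

    R = 1 means the outputs follow the targets perfectly along a rising line,
    R = 0 means no linear relationship, and R = -1 a perfect falling line.
    """
    if len(targets) != len(outputs) or len(targets) < 2:
        raise ValueError("need two equally long sequences of at least 2 values")
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    if var_t == 0 or var_o == 0:
        raise ValueError("R is undefined when one sequence is constant")
    return cov / math.sqrt(var_t * var_o)


# Outputs that are an exact linear function of the targets give R = 1.
r_perfect = regression_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(r_perfect)
```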


Figure 5:43 Differencing Applied on non-stationary LL data segment of Cuckoo-NN

The segmented data obtained using the Haar transform is optimized using the CS approach and then classified using the NN approach. After this, differencing is performed on the data to obtain stable information. The graph shows more stability than the previous results obtained without the CS and ANN approach. The data obtained after differencing is applied to the LL, LH, HL, and HH data segments of Cuckoo-NN is depicted in Figure 5.43, Figure 5.44, Figure 5.45, and Figure 5.46 respectively.

Figure 5:44 Differencing Applied on non-stationary LH data segment of Cuckoo-NN

Figure 5.44 shows the differenced value of consumed electricity for the non-stationary LH segment of the data after applying cuckoo search combined with the neural network. The three curves, black, red, and green, show the original consumed, rendered, and adjacent rendered electricity. The graph makes clear that the rendered value of electricity does not differ much from the original consumed value of electricity.

Figure 5:45 Differencing Applied on non-stationary HL data segment of Cuckoo-NN

Figure 5.45 represents the differenced value of electricity for the HL segment after applying cuckoo search with NN to the non-stationary data. In the figure, the black curve is far from both the red and the green curves. This means that the originally consumed electricity differs by a large amount from the rendered and the adjacent rendered electricity values.


Figure 5:46 Differencing Applied on non-stationary HH data segment of Cuckoo-NN

Figure 5.46 shows the differenced value of the HH data segment obtained by applying cuckoo search with a neural network. As the figure shows, the difference between the electricity values increases as the number of entries increases.

Figure 5:47 ARIMA applied on Cuckoo-NN optimized LL segment for segment based predictions

The prediction results using CS with ANN in addition to the ARIMA model for the four decomposed bands of the test data (LL, LH, HL, and HH) are shown in Figure 5.47, Figure 5.48, Figure 5.49, and Figure 5.50 respectively. From all the prediction results it has been concluded that the best prediction is obtained using the HL band, followed by the HH band.


Figure 5:48 ARIMA applied on Cuckoo-NN optimized LH segment for segment based predictions

Figure 5.48 represents the segment-based predictions of the LH data segment after applying the ARIMA model with cuckoo search and NN. The predictions and the number of readings are plotted on the vertical and horizontal axes respectively.

Figure 5:49 ARIMA applied on Cuckoo-NN optimized HL segment for segment-based predictions

Figure 5.49 represents the optimized HL segment for segment-based predictions. It is obtained by applying the ARIMA model to the output of the combined CS and NN technique.


Figure 5:50 ARIMA applied on Cuckoo-NN optimized HH segment for segment based predictions

Figure 5.50 shows the segment-based predictions of the HH segment after applying the ARIMA model to the hybrid model obtained by combining Cuckoo Search and NN. The red and black curves represent the original data and the ARIMA prediction respectively.

In this research work, the results have been computed in the MATLAB simulator with the real-time data obtained from PSPCL. The actual dataset of electricity consumption covers the period from January 2013 to December 2017. The year-wise comparison of actual versus predicted electricity consumption for consecutive years using the different techniques, namely ARIMA, ARIMA with DWT (Haar and Daubechies), ARIMA with ABC and ANN, and ARIMA with DWT, CSA and ANN, is provided below in tables as well as in graphs.

5.2.1 Prediction using ARIMA Model

This section gives the actual and predicted electricity consumption obtained using only the ARIMA model.

Table 5.1 Original and predicted electricity consumption using ARIMA

Base year 2013

Actual Electricity Consumption

Predicted Electricity Consumption ARIMA

Jan-13 2402929222 2605929243

Feb-13 2302632737 2512932937

Mar-13 2302632737 2512872739

Apr-13 2315442012 2517472037

May-13 2763422169 2953422175

Jun-13 3409077307 3618069309

Jul-13 4283260815 4458260827

Aug-13 4203919034 4404119058

Sep-13 4320843017 4541942132

Oct-13 3284669791 3480639189

Nov-13 2660633086 2819643092

Dec-13 2510323489 2723536491

Table 5.1 provides the actual and predicted electricity consumption for the year 2013. The average actual energy consumption is 3063315451 kWh, and the average predicted energy consumption is 3262403269 kWh.
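The yearly averages quoted for 2013 can be reproduced from the monthly values in Table 5.1. The following is a minimal Python sketch (the thesis results themselves were computed in MATLAB); the variable names and the `yearly_average` helper are illustrative assumptions, not part of the original work.

```python
# Monthly values for 2013 copied from Table 5.1 (January to December).
actual_2013 = [
    2402929222, 2302632737, 2302632737, 2315442012, 2763422169, 3409077307,
    4283260815, 4203919034, 4320843017, 3284669791, 2660633086, 2510323489,
]
predicted_arima_2013 = [
    2605929243, 2512932937, 2512872739, 2517472037, 2953422175, 3618069309,
    4458260827, 4404119058, 4541942132, 3480639189, 2819643092, 2723536491,
]


def yearly_average(monthly_values):
    """Return the arithmetic mean of the monthly values, rounded to an integer."""
    return round(sum(monthly_values) / len(monthly_values))


avg_actual = yearly_average(actual_2013)
avg_predicted = yearly_average(predicted_arima_2013)

# Relative gap between the predicted and actual averages, in percent.
gap_percent = 100 * (avg_predicted - avg_actual) / avg_actual

print(avg_actual, avg_predicted, round(gap_percent, 2))
```

The same helper can be applied to the monthly values of the other yearly tables to check their reported averages.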

Figure 5:51 Electricity consumption Actual and Predicted using ARIMA of the Year 2013

Figure 5.51 shows the actual and predicted values of electricity consumption using ARIMA. In this figure, the X-axis shows the months of 2013, and the Y-axis shows energy consumption (kWh). The blue line shows the actual electricity consumption, and the red line shows the electricity consumption predicted using ARIMA.

Table 5.2 Actual and predicted electricity consumption using ARIMA for the year 2014

Base year 2014

Actual Electricity Consumption

Predicted Electricity Consumption ARIMA

Jan-14 2406670480 2618670409

Feb-14 2543659240 2737659298

Mar-14 2262755935 2471756951

Apr-14 2229496325 2479576319

May-14 2663094360 2813094134

Jun-14 4135881561 4315821563

Jul-14 5048078068 5248078065

Aug-14 5150618207 5350618211

Sep-14 4184285964 4384285966

Oct-14 3525215760 3725215762

Nov-14 2914127320 3114127323

Dec-14 2691580750 2891580753

As per Table 5.2, the average values of actual and predicted energy consumption are 3312955331 kWh and 3515040238 kWh respectively.

Figure 5.52 shows the actual and predicted values of electricity consumption using ARIMA. In this figure, the X-axis shows the months of 2014, and the Y-axis shows energy consumption (kWh). The blue line shows the actual electricity consumption, and the red line shows the electricity consumption predicted using ARIMA.

Table 5.3 Original and predicted electricity consumption using ARIMA of the year 2015

Base year 2015



Jan-15 2425671683 2625671685

Feb-15 2705001417 2905001419

Mar-15 2387522949 2587522951

Apr-15 2253062900 2553062903

May-15 2884345926 3184345929

Jun-15 3461754270 3661754272

Jul-15 4895975328 5095975330

Aug-15 5076894389 5276894391

Sep-15 4867004448 5067004450

Oct-15 3787014752 3987014754

Nov-15 2804034438 3004034440

Dec-15 2719630627 2919630629

As per Table 5.3, the average values of actual and predicted energy consumption are 3355659427 kWh and 3572326096 kWh respectively.

Figure 5.53 shows the actual and predicted values of electricity consumption using ARIMA. The X-axis shows the months of 2015, and the Y-axis shows energy consumption in kWh. The blue line shows the actual electricity consumption, and the red line shows the predicted electricity consumption.

Figure 5:53 Electricity consumption actual and predicted using ARIMA for Year 2015

Table 5.4 Original and predicted electricity consumption using ARIMA for year 2016

Base year 2016



Jan-16 2567913778 2767913779

Feb-16 2740311321 2940311325

Mar-16 2942162298 3142162300

Apr-16 2522244918 2722244920

May-16 3253239971 3453239973

Jun-16 4500446696 4700446698

Jul-16 5018252044 5318252044

Aug-16 4999539492 5199539494

Sep-16 5404195697 5604195699

Oct-16 3958725442 4158725444

Nov-16 3130248433 3330248436

Dec-16 2921627225 3121627227

Figure 5:54 Electricity consumption Actual and Predicted using ARIMA for year 2016

Figure 5.54 shows the actual and predicted electricity consumption using ARIMA. In this figure, the X-axis shows the months of 2016, and the Y-axis shows energy consumption (kWh). The blue line shows the actual electricity consumption, and the red line shows the electricity consumption predicted using ARIMA. The average actual energy consumption is 3663242276 kWh, and the average predicted energy consumption is 3871575612 kWh.

Table 5.5 Actual and predicted electricity consumption using ARIMA of the year 2017

Base year 2017

Actual Electricity Consumption


Jan-17 2902848056 3102848058

Feb-17 2958390111 3158390114

Mar-17 3209855845 3409855847

Apr-17 2782373785 2982373787

May-17 3705371880 3905371882

Jun-17 4450434098 4650434100

Jul-17 5810138389 6010138391

Aug-17 5645879830 5845879832

Sep-17 5273395962 5473395964

Oct-17 4196118613 4396118616

Nov-17 3191950404 3391950406

Dec-17 3030487191 3230487193

Table 5.5 shows the actual and predicted electricity consumption using ARIMA. In Figure 5.55, the X-axis shows the months of 2017, and the Y-axis shows energy consumption (kWh). The blue line shows the actual electricity consumption, and the red line shows the electricity consumption predicted using ARIMA. The average actual electricity consumption is 3929770347 kWh, and the average predicted energy consumption is 4129770349 kWh.

5.2.2 Prediction using ARIMA with DWT

This section provides the actual and predicted electricity consumption obtained using ARIMA with DWT. The corresponding tables and graphs are explained below year-wise.

Table 5.6 Original and predicted electricity consumption using ARIMA with DWT for year 2013

Base year 2013



Jan-13 2402929222 2512927412

Feb-13 2302632737 2412832137

Mar-13 2302632737 2512432431

Apr-13 2315442012 2503142081

May-13 2763422169 2613424165

Jun-13 3409077307 3318076315

Jul-13 4283260815 4363260314

Aug-13 4203919034 4318619035

Sep-13 4320843017 4419843015

Oct-13 3284669791 3324569175

Nov-13 2660633086 2725243173

Dec-13 2510323489 2631673474

As per Table 5.6, the average actual energy consumption is 3063315451 kWh, and the average predicted using ARIMA with DWT is 3138003561 kWh. The predicted average is therefore within about 2.4% of the actual average.

Figure 5:56 Electricity consumption original and Predicted using ARIMA with DWT of the Year 2013

Figure 5.56 shows the actual and predicted electricity consumption using ARIMA with DWT. In this figure, the X-axis shows the months of 2013, and the Y-axis shows energy consumption (kWh). The blue line shows the actual electricity consumption, and the red line shows the consumption predicted using ARIMA with DWT.

Table 5.7 Actual and predicted electricity consumption using ARIMA and DWT for year 2014



Jan-14 2406670480 2575279317

Feb-14 2543659240 2637653215

Mar-14 2262755935 2358743187

Apr-14 2229496325 2379496431

May-14 2663094360 2713719384

Jun-14 4135881561 4215633691

Jul-14 5048078068 5186154157

Aug-14 5150618207 5274619108

Sep-14 4184285964 4251293903

Oct-14 3525215760 3405615215

Figure 5:57 Electricity consumption Actual and Predicted using ARIMA with DWT for the Year 2014

Figure 5.57 shows the actual and predicted electricity consumption using ARIMA with DWT. The X-axis shows the months of 2014, and the Y-axis shows energy consumption in kWh. The blue line shows the actual electricity consumption, and the red line shows the consumption predicted using ARIMA with DWT. The average actual energy consumption is 3312955331 kWh, and the average predicted using ARIMA with DWT is 3401279918 kWh. The predicted average is therefore within about 2.7% of the actual average.

Table 5.8 Actual and predicted electricity consumption using ARIMA with DWT of the year 2015

Base year 2015



Jan-15 2425671683 2619535715

Feb-15 2705001417 2815041757

Mar-15 2387522949 2413729279

Apr-15 2253062900 2315762148

May-15 2884345926 2915435928

Jun-15 3461754270 3557148373

Jul-15 4895975328 4935971351

Aug-15 5076894389 5176894383

Sep-15 4867004448 4961374389

Oct-15 3787014752 3817214768

Nov-15 2804034438 2915036481

Dec-15 2719630627 2808931643


As shown in Figure 5.58, for the year 2015 the average actual energy consumption is 3355659427 kWh, and the average predicted using ARIMA with DWT is 3437673018 kWh. The predicted average is therefore within about 2.4% of the actual average.

Table 5.9 Actual and predicted electricity consumption using ARIMA with DWT for year 2016

Base year 2016



Jan-16 2567913778 2618313239

Feb-16 2740311321 2841312743

Mar-16 2942162298 3012963215

Apr-16 2522244918 2624247980

May-16 3253239971 3354231954

Jun-16 4500446696 4710647696

Jul-16 5018252044 5177523519

Aug-16 4999539492 5039731427

Sep-16 5404195697 5514396712

Oct-16 3958725442 4091795405

Nov-16 3130248433 3291248174

Dec-16 2921627225 3021523257

Table 5.9 shows the actual as well as the predicted electricity consumption values obtained using ARIMA with DWT.

Figure 5.59 shows the actual and predicted electricity consumption using ARIMA with DWT in graph form. In this figure, the X-axis shows the months of 2016, and the Y-axis shows energy consumption (kWh). The blue line shows the actual electricity consumption, and the red line shows the consumption predicted using ARIMA with DWT. The average actual energy consumption is 3663242276 kWh, and the average predicted using ARIMA with DWT is 3774827943 kWh. The predicted average is therefore within about 3.0% of the actual average.


Figure 5.60 shows the actual and predicted electricity consumption using ARIMA with DWT. In this figure, the X-axis shows the months of 2017, and the Y-axis shows energy consumption (kWh). The blue line shows the actual electricity consumption, and the red line shows the consumption predicted using ARIMA with DWT. The average actual energy consumption is 3929770347 kWh, and the average predicted using ARIMA with DWT is 4034206349 kWh, as shown in Table 5.10. The predicted average is therefore within about 2.7% of the actual average.

Table 5.10 Actual and Predicted electricity consumption using ARIMA with DWT for the year 2017

Base year 2017



Jan-17 2902848056 3091431548

Feb-17 2958390111 3028196475

Mar-17 3209855845 3318257943

Apr-17 2782373785 2812353193

May-17 3705371880 3819365814

Jun-17 4450434098 4518914219

Jul-17 5810138389 5943118371

Aug-17 5645879830 5748170817

Sep-17 5273395962 5384295867

Oct-17 4196118613 4385918643

Nov-17 3191950404 3201805432

Dec-17 3030487191 3158647865

Figure 5:60 Electricity consumption original and Predicted using ARIMA with DWT for the Year 2017

5.2.3 Prediction using the Proposed Hybrid Model

In the previous sections, two forecasting approaches based on the ARIMA model were presented. The values predicted using ARIMA alone vary widely from the actual electricity consumption. When DWT is used for decomposition together with ARIMA, the predicted electricity consumption is closer to the actual consumption than with the ARIMA model alone. Following the analysis of these techniques, this sub-section discusses, year by year, the actual and predicted electricity consumption from January 2013 to December 2017. Here, the focus is on forecasting with the integrated mechanism of CSA and ANN together with the Haar wavelet decomposition technique, followed by the ARIMA model. The discussion is given below in both table and graph form.

Table 5.11 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2013

Base year 2013



Jan-13 2402929222 2403030111

Feb-13 2302632737 2302834848

Mar-13 2302632737 2302432649

Apr-13 2315442012 2317842088

May-13 2763422169 2778422786

Jun-13 3409077307 3409098654

Jul-13 4283260815 4283234576

Aug-13 4203919034 4203998778

Sep-13 4320843017 4320897867

Oct-13 3284669791 3284635543

Nov-13 2660633086 2660667643

Dec-13 2510323489 2510368943

Table 5.11 provides the actual electricity consumption for the year 2013 together with the corresponding predicted values. The predicted values are close to the actual values. The average actual electricity consumption was 3063315451 kWh, while the average predicted by the proposed hybrid model is 3064788707 kWh. The actual and predicted values therefore fall in the same range, with a difference of only about 0.05%.

Figure 5:61 Actual and Predicted values of electricity consumption using Proposed Hybrid Model for the Year 2013.

The graphical representation of actual electricity consumption by consumers for the year 2013 is given in Figure 5.61. The blue and red curves in the graph represent the actual consumption and the consumption predicted by the proposed hybrid model respectively. Because there is very little difference between the actual and predicted consumption, the actual curve overlaps the predicted curve. The graph plots consumed electricity in kWh on the vertical axis against the months of the year on the horizontal axis.

Table 5.12 Original and predicted electricity consumption using Proposed Hybrid Model for the year 2014



Jan-14 2406670480 2406632260

Feb-14 2543659240 2543624804

Mar-14 2262755935 2236279532

Apr-14 2229496325 2272945793

May-14 2663094360 2663776436

Jun-14 4135881561 4135897042

Jul-14 5048078068 5048008640

Aug-14 5150618207 5150087643

Sep-14 4184285964 4184295425

Oct-14 3525215760 3525243579

Nov-14 2914127320 2914325679

Dec-14 2691580750 2691535689

Table 5.12 shows the actual and predicted electricity consumption for the year 2014. The average actual consumption is 3312955331 kWh, and the average predicted consumption is 3314387710 kWh. The variation between the actual and predicted values is insignificant (about 0.04%).

Figure 5.62 shows the actual and predicted electricity consumption. The Y-axis represents electricity consumption, and the X-axis represents the months of 2014 from January to December. The blue and red curves in the graph show the actual and predicted values respectively.

Figure 5:62 Actual and Predicted values of electricity consumption using Proposed Hybrid Model for the Year 2014.


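Table 5.13 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2015

Base year 2015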



Jan-15 2425671683 2425635790

Feb-15 2705001417 2700245689

Mar-15 2387522949 2308754670

Apr-15 2253062900 2253035678

May-15 2884345926 2884309875

Jun-15 3461754270 3461733256

Jul-15 4895975328 4859345678

Aug-15 5076894389 5007646789

Sep-15 4867004448 4867009865

Oct-15 3787014752 3770567899

Nov-15 2804034438 2804056789

Dec-15 2719630627 2719634567

Table 5.13 shows the actual and predicted electricity consumption for the year 2015. The average actual electricity consumption is 3355659427 kWh, and the average predicted electricity consumption is 3338498054 kWh. Hence there is no significant difference between the actual and predicted values of electricity consumption (about 0.5%).

Figure 5:63 Electricity consumption Actual and Predicted using Proposed Hybrid Model for the Year 2015

The actual electricity consumed by consumers in the year 2015 is shown in Figure 5.63. The blue curve denotes the actual consumed electricity, and the red curve denotes the predicted value. In the graph, the actual curve overlaps the predicted curve because the difference between the values is very small.


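Table 5.14 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2016

Base year 2016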



Jan-16 2567913778 2567945678

Feb-16 2740311321 2740310963

Mar-16 2942162298 2904213480

Apr-16 2522244918 2522123456

May-16 3253239971 3253234568

Jun-16 4500446696 4500412997

Jul-16 5018252044 5020932568

Aug-16 4999539492 4999500532

Sep-16 5404195697 5400412340

Oct-16 3958725442 3905872349

Nov-16 3130248433 3103020975

Dec-16 2921627225 2921606532

Table 5.14 shows the actual and predicted electricity consumption for the year 2016. The average actual electricity consumption is 3663242276 kWh, and the average predicted electricity consumption is 3672383639 kWh.

Figure 5:64 Electricity consumption Actual and Predicted using Proposed Hybrid Model for the Year 2016

The graphical representation for the year 2016 is given in Figure 5.64. Electricity consumption (kWh) is plotted along the Y-axis against the months of 2016 along the X-axis. The blue curve represents the actual consumed electricity, and the red curve represents the predicted consumed electricity.


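Table 5.15 Actual and predicted electricity consumption using Proposed Hybrid Model for the year 2017

Base year 2017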



Jan-17 2902848056 2902832345

Feb-17 2958390111 2905898637

Mar-17 3209855845 3218576445

Apr-17 2782373785 2723987754

May-17 3705371880 3705344790

Jun-17 4450434098 4404334578

Jul-17 5810138389 5801010875

Aug-17 5645879830 5645898752

Sep-17 5273395962 5273897987

Oct-17 4196118613 4196889768

Nov-17 3191950404 3109112390

Dec-17 3030487191 3004009043

Table 5.15 shows the actual and predicted electricity consumption for the year 2017.

Figure 5:65 Electricity consumption original and Predicted using Proposed Hybrid Model of the Year 2017

The average actual and predicted electricity consumption are 3929770347 kWh and 3907684235 kWh respectively. The actual and predicted values are shown graphically in Figure 5.65. The blue curve depicts the actual consumed electricity, and the red curve represents the predicted consumed electricity.

Figure 5:66 Overall comparison of electricity consumption prediction of ARIMA + DWT, Hybrid proposed model with the original dataset

Figure 5.66 shows that the actual and predicted electricity consumption follow the same trend, with little difference between the original and predicted curves. Prediction using ARIMA with DWT provides better results than ARIMA alone. The average values for the original dataset, ARIMA with DWT, and the proposed hybrid model are 3464988567, 3557198158, and 3455724556 respectively. The average difference from the original is 3% for ARIMA with DWT and 0.39% for the proposed model. Thus, the proposed hybrid model using ARIMA, CSA, DWT and ANN provides better results than both ARIMA alone and ARIMA with DWT.

Finally, to examine the performance of this research work, Mean Absolute Percentage Error (MAPE), Mean Average Precision (MAP), and Accuracy (%) are used as evaluation parameters. Each of them is explained below:

Mean Absolute Percentage Error (MAPE)

MAPE measures how much the dependent series varies from the level predicted by the model. This parameter is independent of units, so it can be used to compare series with distinct units. MAPE expresses the error as a percentage, and it can be written mathematically as:

$$\mathrm{MAPE} = \frac{100}{P} \sum_{t=1}^{P} \left| \frac{A_t - F_t}{A_t} \right|$$

where $A_t$ denotes the actual sequence, $F_t$ denotes the forecasted electricity values, and $P$ represents the number of samples.
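The MAPE definition can be illustrated with a short Python sketch. The function name `mape` and the toy values are assumptions made for illustration; they are not taken from the thesis code or from the PSPCL dataset.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent, over paired samples."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equally long")
    total = 0.0
    for a, f in zip(actual, forecast):
        if a == 0:
            raise ValueError("MAPE is undefined when an actual value is zero")
        # Absolute error relative to the actual value for this sample.
        total += abs((a - f) / a)
    return 100.0 * total / len(actual)


# Hypothetical toy values, not taken from the thesis dataset.
toy_actual = [100.0, 200.0, 400.0]
toy_forecast = [110.0, 190.0, 400.0]
toy_mape = mape(toy_actual, toy_forecast)
print(toy_mape)
```

In this toy case the three percentage errors are 10%, 5% and 0%, so the result is their mean, about 5%. Because each error is divided by the actual value, the measure does not depend on the unit of the series.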

Mean Average Precision (MAP)

Precision describes how close two or more measurements are to each other. The precision value varies with the prediction error. Higher precision indicates consistent measurements, while low precision indicates varying measurements. Higher precision does not, however, always produce a better result. The mathematical expression to compute precision is provided below:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

where TP = true positives and FP = false positives.

Mean average precision for any collection of electricity consumption datasets is defined as the mean of the average precision scores obtained for the corresponding data by applying the different techniques. The mathematical expression of MAP in terms of average precision is provided below:

$$\mathrm{MAP} = \frac{1}{D} \sum_{r=1}^{R} P(r)$$

where $D$ denotes the number of desired samples, $R$ is the number of retrieved samples, and $P(r)$ corresponds to the average of precision at level $r$.
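The two quantities described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the counts and the per-dataset average precision scores are hypothetical, and MAP is computed here simply as the mean of already-available average precision scores.

```python
def precision(true_positives, false_positives):
    """Precision = TP / (TP + FP)."""
    retrieved = true_positives + false_positives
    if retrieved == 0:
        raise ValueError("precision is undefined when nothing is retrieved")
    return true_positives / retrieved


def mean_average_precision(average_precisions):
    """Mean of the per-dataset average precision scores."""
    if not average_precisions:
        raise ValueError("at least one average precision score is required")
    return sum(average_precisions) / len(average_precisions)


# Hypothetical counts and scores, for illustration only.
toy_precision = precision(true_positives=8, false_positives=2)
toy_map = mean_average_precision([0.5, 0.75, 1.0])
print(toy_precision, toy_map)
```

With 8 true positives and 2 false positives the precision is 8 out of 10 retrieved samples, and the mean of the three hypothetical scores is 0.75.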

Accuracy (%)

Accuracy is the ability of the system to measure the correct value; it describes the closeness of the measured value to the true value. The computation of accuracy can be done by using small readings, through which the error is reduced. The accuracy can be defined mathematically as provided below;

5.2.4 Computed parameters

To analyze the prediction performance of the various models, including the proposed hybrid model, and to demonstrate the effectiveness of the proposed model, the results for predicted electricity consumption are presented in Tables 5.16 to 5.18. The parameters MAP (Mean Average Precision), MAPE (Mean Absolute Percentage Error) and accuracy have been used to analyze the performance.

Table 5.16 Computed MAP for ARIMA, ARIMA with DWT and Proposed Hybrid Model

MODEL MAP

ARIMA 0.4623940

ARIMA with DWT 0.894494

Proposed Hybrid Model 0.94521

Table 5.16 lists the techniques used in the proposed work along with the computed MAP values. The graphical representation of the same is shown in Figure 5.67. The MAP of the proposed hybrid model is 5.67% higher than that of the ARIMA with DWT model.

Figure 5:67 Computed MAP for ARIMA, ARIMA with DWT and Proposed Hybrid Model

From the graph, it is clearly seen that the maximum MAP of 0.94521 has been

attained using the hybrid approach.

Table 5.17 Computed MAPE for ARIMA, ARIMA with DWT and Proposed Hybrid Model

MODEL MAPE

ARIMA 44.239359

ARIMA with DWT 26.438503

Proposed Hybrid Model 8.944945

Table 5.17 lists the MAPE corresponding to the techniques used in the proposed work. The Proposed Hybrid Model shows the lowest Mean Absolute Percentage Error of all the techniques.

Figure 5:68 Computed MAPE for ARIMA, ARIMA with DWT and Proposed Hybrid Model

The graphical representation of the MAPE is shown in Figure 5.68. The Y-axis shows the MAPE obtained for ARIMA, ARIMA with DWT, and the Proposed Hybrid Model. A lower MAPE indicates better system performance, and the MAPE of the proposed hybrid model is the lowest. Applying ARIMA with the DWT technique decreases the MAPE by 17.800856 compared with using ARIMA alone, and applying the Proposed Hybrid Model decreases it by a further 17.493558 compared with ARIMA with DWT.

Table 5.18 Computed Accuracy for Different Combinations

Used techniques Accuracy (%)

ARIMA 83.53

ARIMA with Daubechies Wavelet 92.67

ARIMA with HAAR Wavelet 93.76

Proposed Hybrid Model 98.86

Table 5.18 lists the techniques used in this work. ARIMA shows an accuracy of 83.53%, ARIMA with the Daubechies wavelet 92.67%, ARIMA with the HAAR wavelet 93.76%, and the Proposed Hybrid Model 98.86%.

Figure 5:69 Computed Accuracy (%) for ARIMA, ARIMA with DWT, ARIMA with HAAR and Proposed Hybrid Model

Compared with ARIMA alone, applying ARIMA with the DWT technique increases accuracy by 9.14 percentage points (Daubechies) and 10.23 percentage points (Haar), as shown in Figure 5.69. The Proposed Hybrid Model increases accuracy by a further 5.1 percentage points compared with ARIMA with the Haar wavelet.

Thus, in relative terms, the accuracy of the proposed work is higher by 18.35%, 6.68%, and 5.44% than that of the ARIMA, ARIMA with Daubechies, and ARIMA with Haar techniques respectively.
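The gains quoted above can be checked against the reported accuracy values. The following Python sketch computes both the absolute gain (in percentage points) and the relative gain (in percent) of the Proposed Hybrid Model over each baseline; the dictionary and the `gains_over` helper are illustrative assumptions, while the accuracy values are those reported in this section.

```python
# Accuracy values (%) as reported in the accuracy table and accompanying text.
accuracy = {
    "ARIMA": 83.53,
    "ARIMA with Daubechies Wavelet": 92.67,
    "ARIMA with HAAR Wavelet": 93.76,
    "Proposed Hybrid Model": 98.86,
}


def gains_over(baseline, candidate):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = candidate - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 2), round(relative, 2)


hybrid = accuracy["Proposed Hybrid Model"]
for name, value in accuracy.items():
    if name != "Proposed Hybrid Model":
        print(name, gains_over(value, hybrid))
```

The relative gains reproduce the 18.35%, 6.68% and 5.44% figures, which shows that these are ratios of accuracy values rather than differences in percentage points.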

CHAPTER 6: CONCLUSION AND FUTURE SCOPE

6.1 CONCLUSION

This research focuses on the design and development of a novel intelligent technique that can be used to study the future behaviour of electricity consumption on the basis of time series data. Analysing future behaviour in the presence of very sudden changes in the time series of consumed electricity is complex and a major challenge for electricity providers and investors alike. However, the benefits associated with accurate forecasting have prompted researchers to develop new and advanced models.

Predicting the next values of a time series has been a major research problem that attracts researchers from numerous fields. In this research, short-term forecasting is studied with one-step-ahead and multi-step-ahead modelling, and three forecasting models are compared, namely ARIMA, ARIMA with DWT, and the proposed hybrid model. Time series from different applications generally consist of both linear and non-linear variations. The linear ARIMA and ARIMA with DWT models cannot accurately model such data on their own. A hybrid model is therefore proposed, which integrates the individual techniques ARIMA, DWT, CSA and ANN. By combining the advantages of all these techniques, a novel model with high prediction accuracy is designed.

The first, basic model is ARIMA with DWT, which was proposed in this research using the statistical features of the ARIMA model. The MA filter has been utilized to decompose the available time series electricity data into two data sets, consisting of the lower and upper levels of the data, which were later used to forecast the data obtained using the hybrid model. This hybrid model is able to predict electricity consumption at an early stage. The designed model was applied to simulate time series data and forecast electricity consumption in Punjab State.

The proposed hybrid model using the Discrete Wavelet Transformation (DWT), the Cuckoo Search (CS) algorithm and an Artificial Neural Network (ANN) was the final one; it is used to estimate and forecast electricity demand/consumption using a stochastic process. In the proposed electricity forecasting model, the wavelet transform has been applied to reduce the white noise present in the original dataset taken from PSPCL for the years 2013 to 2017, which yields a more stable dataset than the original. DWT is applied as the wavelet transformation; it decomposes a signal into a set of orthogonal basis functions of different frequencies. The main feature of DWT is that it is a lossless transformation: the original signal can be recovered by applying the inverse DWT. The electricity data is non-stationary, as consumption varies continuously over time, so DWT is well suited to representing this type of data. This allows a simple technique, the ARIMA model, to forecast highly accurate values. The work has been performed using two wavelets, Daubechies and Haar. After analysis, it has been observed that prediction using the Haar wavelet provides better results than the Daubechies approach. Therefore, Haar has been selected as the wavelet decomposition approach, and ARIMA and CS with ANN have then been applied to enhance the prediction performance of the proposed work.
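The lossless property of the DWT mentioned above can be illustrated with a single-level Haar transform and its inverse. This is a minimal pure-Python sketch, not the MATLAB implementation used in this work; the function names and the toy series are assumptions made for illustration.

```python
import math


def haar_dwt(signal):
    """Single-level Haar DWT of an even-length sequence.

    Returns the (approximation, detail) coefficient lists.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    # Pairwise scaled sums give the low-frequency (approximation) band.
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    # Pairwise scaled differences give the high-frequency (detail) band.
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from its coefficients."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / scale)
        signal.append((a - d) / scale)
    return signal


# Hypothetical toy series, not taken from the PSPCL dataset.
series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_dwt(series)
rebuilt = haar_idwt(approx, detail)
max_error = max(abs(x - y) for x, y in zip(series, rebuilt))
print(max_error)
```

The reconstruction error is zero up to floating-point rounding, which is what makes it possible to forecast on the decomposed bands and then map the result back to the original scale.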

The predicted energy consumption values as well as the MAP, MAPE, and Accuracy (%) of the proposed model have been compared with those of the traditional approaches. The experimental values show that the values predicted by the proposed model are highly correlated with the original dataset, which indicates that the designed model predicts electricity consumption efficiently and accurately. The accuracy of the proposed model is higher by 15.33 percentage points than that of ARIMA, by 6.19 percentage points than that of ARIMA with Daubechies, and by 5.1 percentage points than that of ARIMA with Haar.

6.2 FUTURE SCOPE

In future, this work can be extended using other classifiers such as Support Vector Machine (SVM), Fuzzy Logic, and Convolutional Neural Networks. For data optimization, other techniques such as the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and the Firefly algorithm can also be used.

Here, the MA filter is used as the pre-processing scheme with the ARIMA model. Other existing pre-processing methods presented in the literature can also be experimentally validated for better prediction accuracy. An appropriate pre-processing method can also be designed to suit the forecasting model and increase forecast accuracy.

The model can also be applied to other statistical analyses to predict various time series data, e.g. livestock products, agricultural yield, health expenditure, currency exchange rates and many more.

RESEARCH PUBLICATIONS

1. Kaur, H. and Ahuja, S. (2017). Time series analysis and prediction of electricity consumption of health care institution using ARIMA model. Advances in Intelligent Systems and Computing, vol 547. Springer, Singapore.

2. Kaur, H. and Ahuja, S. (2019). A Hybrid Arima and Discrete Wavelet Transform Model for Predicting the Electricity Consumption of Punjab. International Journal of Innovative Technology and Exploring Engineering, 8(11), pp.1915-1919

3. Kaur, H. and Ahuja, S. (2019). SARIMA Modelling for Forecasting the Electricity Consumption of a Health Care Building. International Journal of Innovative Technology and Exploring Engineering, 8(12), pp.2795-2799.

REFERENCES

[1] Agnetis, A., De Pascale, G., Detti, P., &Vicino, A. (2013). Load scheduling for household energy consumption optimization. IEEE Transactions on Smart Grid, 4(4), 2364-2373.

[2] Pérez-Lombard, L., Ortiz, J., & Pout, C. (2008). A review on buildings energy consumption information. Energy and buildings, 40(3), 394-398.

[3] Taylor, J. W., McSharry, P. E., &Buizza, R. (2009). Wind power density forecasting using ensemble predictions and time series models. IEEE Transactions on Energy Conversion, 24(3), 775-782.

[4] Fu, T. C. (2011). A review on Time Series Data mining. Engineering Applications of Artificial Intelligence, 24(1), 164-181.

[5] Butcher, J. B., Verstraeten, D., Schrauwen, B., Day, C. R., & Haycock, P. W. (2013). Reservoir computing and extreme learning machines for non-linear time-series data analysis. Neural networks, 38, 76-89.

[6] Turchin, P. (1993). Chaos and stability in rodent population dynamics: evidence from non-linear time-series analysis. Oikos, 167-172.

[7] Brahim-Belhouari, S., &Bermak, A. (2004). Gaussian process for nonstationary time series prediction. Computational Statistics & Data Analysis, 47(4), 705-712.

[8] Fan, S., & Hyndman, R. J. (2010, December). Forecast short-term electricity demand using semi-parametric additive model. In 2010 20th Australasian Universities Power Engineering Conference (pp. 1-6).IEEE.

[9] Willis, H., &Aanstoos, J. (1979).Some unique signal processing applications in power system planning. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27(6), 685-697

[10] Reisen, V. A., & Lopes, S. (1999). Some simulations and applications of forecasting long-memory time-series models. Journal of Statistical Planning and Inference, 80(1-2), 269-287.

[11] Landassuri-Moreno, V. M., Bustillo-Hernández, C. L., Carbajal-Hernández, J. J., & Fernández, L. P. S. (2013, November). Single-step-ahead and multi-step-ahead prediction with evolutionary artificial neural networks. In Iberoamerican Congress on Pattern Recognition (pp. 65-72). Springer, Berlin, Heidelberg.

[12] Hansen, J. V., & Nelson, R. D. (2003). Forecasting and recombining time-series components by using neural networks. Journal of the Operational Research Society, 54(3), 307-317.

[13] Ho, S. L., Xie, M., & Goh, T. N. (2002). A comparative study of neural network and Box-Jenkins ARIMA modeling in time series prediction. Computers & Industrial Engineering, 42(2-4), 371-375.

[14] Fister, I., Yang, X. S., &Fister, D. (2014). Cuckoo search: a brief literature review. In Cuckoo search and firefly algorithm (pp. 49-62). Springer, Cham.

[15] Roy, S., & Chaudhuri, S. S. (2013). Cuckoo search algorithm using Lévy flight: a review. international journal of Modern Education and Computer Science, 5(12), 10.

[16] Mareli, M., & Twala, B. (2018). An adaptive Cuckoo search algorithm for optimisation. Applied computing and informatics, 14(2), 107-115.

[17] Gandomi, A. H., Yang, X. S., & Alavi, A. H. (2013). Cuckoo search algorithm: a metaheuristic approach to solve structural optimization problems. Engineering with computers, 29(1), 17-35.

[18] Jiang, P., Liu, F., Wang, J., & Song, Y. (2016). Cuckoo search-designated fractal interpolation functions with winner combination for estimating missing values in time series. Applied Mathematical Modelling, 40(23-24), 9692-9718.

[19] Kim, M. K. (2015). Short-term price forecasting of Nordic power market by combination Levenberg–Marquardt and Cuckoo search algorithms. IET Generation, Transmission & Distribution, 9(13), 1553-1563.

[20] Ong, P., & Zainuddin, Z. (2019). Optimizing wavelet neural networks using modified cuckoo search for multi-step ahead chaotic time series prediction. Applied Soft Computing, 80, 374-386.

[21] Awan, S. M., Aslam, M., Khan, Z. A., & Saeed, H. (2014). An efficient model based on artificial bee colony optimization algorithm with Neural Networks for electric load forecasting. Neural Computing and Applications, 25(7-8), 1967-1978.

[22] Tealab, A., Hefny, H., & Badr, A. (2017). Forecasting of nonlinear time series using ANN. Future Computing and Informatics Journal, 2(1), 39-47.

[23] Yao, X. (1999). Evolving artificial neural networks. Proceedings of the IEEE, 87(9), 1423-1447.

[24] Dawson, C. W., & Wilby, R. L. (2001). Hydrological modelling using artificial neural networks. Progress in physical Geography, 25(1), 80-108.

[25] Hippert, H. S., Pedreira, C. E., & Souza, R. C. (2001). Neural networks for short-term load forecasting: A review and evaluation. IEEE Transactions on power systems, 16(1), 44-55.

[26] Paliwal, M., & Kumar, U. A. (2009). Neural networks and statistical techniques: A review of applications. Expert systems with applications, 36(1), 2-17.

[27] Zhang, G., Hu, M. Y., Patuwo, B. E., & Indro, D. C. (1999). Artificial neural networks in bankruptcy prediction: General framework and cross-validation analysis. European journal of operational research, 116(1), 16-32.

[28] Qian, Z., Pei, Y., Zareipour, H., & Chen, N. (2019). A review and discussion of decomposition-based hybrid models for wind energy forecasting applications. Applied Energy, 235, 939-953.

[29] Zarnowitz, V., & Ozyildirim, A. (2006). Time series decomposition and measurement of business cycles, trends and growth cycles. Journal of Monetary Economics, 53(7), 1717-1739.

[30] Sang, Y. F. (2013). A review on the applications of wavelet transform in hydrology time series analysis. Atmospheric research, 122, 8-15.

[31] Rhif, M., Ben Abbes, A., Farah, I. R., Martínez, B., & Sang, Y. (2019). Wavelet transform application for/in non-stationary time-series analysis: a review. Applied Sciences, 9(7), 1345.

[32] Conejo, A. J., Plazas, M. A., Espinola, R., & Molina, A. B. (2005). Day-ahead electricity price forecasting using the wavelet transform and ARIMA models. IEEE transactions on power systems, 20(2), 1035-1042.

[33] Kiplangat, D. C., Asokan, K., & Kumar, K. S. (2016). Improved week-ahead predictions of wind speed using simple linear models with wavelet decomposition. Renewable Energy, 93, 38-44.

[34] Nourani, V., Baghanam, A. H., Adamowski, J., & Kisi, O. (2014). Applications of hybrid wavelet–artificial intelligence models in hydrology: a review. Journal of Hydrology, 514, 358-377.

[35] Hou, Z., Makarov, Y. V., Samaan, N. A., & Etingov, P. V. (2013, January). Standardized Software for Wind Load Forecast Error Analyses and Predictions Based on Wavelet-ARIMA Models--Applications at Multiple Geographically Distributed Wind Farms. In 2013 46th Hawaii International Conference on System Sciences (pp. 5005-5011). IEEE.

[36] Nandanwar, L., & Mamulkar, K. (2015). Supervised, semi-supervised and unsupervised WSD approaches: An overview. International Journal of Science and Research (IJSR), 4(2), 1684-1688.

[37] Caruana, R., & Niculescu-Mizil, A. (2006, June). An empirical comparison of supervised learning algorithms. In Proceedings of the 23rd international conference on Machine learning (pp. 161-168).

[38] Ghahramani, Z. (2003, February). Unsupervised learning. In Summer School on Machine Learning (pp. 72-112). Springer, Berlin, Heidelberg.

[39] Huang, G., Song, S., Gupta, J. N., & Wu, C. (2014). Semi-supervised and unsupervised extreme learning machines. IEEE transactions on cybernetics, 44(12), 2405-2417.

[40] Chapelle, O., Scholkopf, B., & Zien, A. (2009). Semi-supervised learning (Chapelle, O. et al., eds.; 2006) [book reviews]. IEEE Transactions on Neural Networks, 20(3), 542-542.

[41] Zhu, X., & Goldberg, A. B. (2009). Introduction to semi-supervised learning. Synthesis lectures on artificial intelligence and machine learning, 3(1), 1-130.

[42] Morgan, M. G., & Talukdar, S. N. (1979). Electric power load management: Some technical, economic, regulatory and social issues. Proceedings of the IEEE, 67(2), 241-312.

[43] Mishra, S., & Singh, S. N. (2015, December). Indian electricity market: Present status and future directions. In 2015 IEEE UP Section Conference on Electrical Computer and Electronics (UPCON) (pp. 1-7). IEEE.

[44] Rallapalli, S. R., & Ghosh, S. (2012). Forecasting monthly peak demand of electricity in India—A critique. Energy policy, 45, 516-520.

[45] Bhargava, N., & Gupta, S. (2006). The Punjab state electricity board: past, present and future. Panjab University research Journal (Arts), 33(2), 93-104.

[46] Willis, H., & Aanstoos, J. (1979). Some unique signal processing applications in power system planning. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27(6), 685-697.

[47] Fan, S., & Hyndman, R. J. (2010, December). Forecast short-term electricity demand using semi-parametric additive model. In 2010 20th Australasian Universities Power Engineering Conference (pp. 1-6). IEEE.

[48] Hong, T., Hsiang, S. M., & Xu, L. (2009, July). Human-machine co-construct intelligence on horizon year load in long term spatial load forecasting. In 2009 IEEE Power & Energy Society General Meeting (pp. 1-6). IEEE.

[49] Topalli, A. K., & Erkmen, I. (2003). A hybrid learning for neural networks applied to short term load forecasting. Neurocomputing, 51, 495-500.

[50] Ghiassi, M. D. K. Z., Zimbra, D. K., & Saidane, H. (2006). Medium term system load forecasting with a dynamic artificial neural network model. Electric power systems research, 76(5), 302-316.

[51] Carpinteiro, O. A., Leme, R. C., de Souza, A. C. Z., Pinheiro, C. A., & Moreira, E. M. (2007). Long-term load forecasting via a hierarchical neural model with time integrators. Electric Power Systems Research, 77(3-4), 371-378.

[52] Amjady, N., & Keynia, F. (2008). Mid-term load forecasting of power systems by a new prediction method. Energy Conversion and Management, 49(10), 2678-2687.

[53] Soares, L. J., & Medeiros, M. C. (2008). Modeling and forecasting short-term electricity load: A comparison of methods with an application to Brazilian data. International Journal of Forecasting, 24(4), 630-644.

[54] Pedregal, D. J., & Trapero, J. R. (2010). Mid-term hourly electricity forecasting based on a multi-rate approach. Energy Conversion and Management, 51(1), 105-111.

[55] Darbellay, G. A., & Slama, M. (2000). Forecasting the short-term demand for electricity: Do neural networks stand a better chance?. International Journal of Forecasting, 16(1), 71-83.

[56] El-Telbany, M., & El-Karmi, F. (2008). Short-term forecasting of Jordanian electricity demand using particle swarm optimization. Electric Power Systems Research, 78(3), 425-433.

[57] Kandil, N., Wamkeue, R., Saad, M., & Georges, S. (2006, July). An efficient approach for shorterm load forecasting using artificial neural networks. In 2006 IEEE International Symposium on Industrial Electronics (Vol. 3, pp. 1928-1932). IEEE.

[58] Xiao, Z., Ye, S. J., Zhong, B., & Sun, C. X. (2009). BP neural network with rough set for short term load forecasting. Expert Systems with Applications, 36(1), 273-279.

[59] Catalão, J. P. D. S., Mariano, S. J. P. S., Mendes, V. M. F., & Ferreira, L. A. F. M. (2007). Short-term electricity prices forecasting in a competitive market: A neural network approach. Electric Power Systems Research, 77(10), 1297-1304.

[60] Goude, Y., Nedellec, R., & Kong, N. (2013). Local short and middle term electricity load forecasting with semi-parametric additive models. IEEE transactions on smart grid, 5(1), 440-446.

[61] Minaye, E., & Matewose, M. (2013). Long term load forecasting of Jimma town for sustainable energy supply. International Journal of Science and Research, 5(2), 1500-1504.

[62] Willis, H. L., & Romero, J. (2007). Spatial electric load forecasting methods for electric utilities. Quanta Technology.

[63] Hong, T., & Fan, S. (2016). Probabilistic electric load forecasting: A tutorial review. International Journal of Forecasting, 32(3), 914-938.

[64] Salvó, G., & Piacquadio, M. N. (2017). Multifractal analysis of electricity demand as a tool for spatial forecasting. Energy for Sustainable Development, 38, 67-76.

[65] Temraz, H. K., Salama, M. M. A., & Chikhani, A. Y. (1997, May). Review of electric load forecasting methods. In CCECE'97. Canadian Conference on Electrical and Computer Engineering. Engineering Innovation: Voyage of Discovery. Conference Proceedings (Vol. 1, pp. 289-292). IEEE.

[66] Al-Hamadi, H. M. (2011, September). Long-term electric power load forecasting using fuzzy linear regression technique. In 2011 IEEE Power Engineering and Automation Conference (Vol. 3, pp. 96-99). IEEE.

[67] AlRashidi, M. R., & El-Naggar, K. M. (2010). Long term electric load forecasting based on particle swarm optimization. Applied Energy, 87(1), 320-326.

[68] Al-Saba, T., & El-Amin, I. (1999). Artificial neural networks as applied to long-term demand forecasting. Artificial Intelligence in Engineering, 13(2), 189-197.

[69] Chatfield, C. (2001). Prediction intervals for time-series forecasting. In Principles of forecasting (pp. 475-494). Springer, Boston, MA.

[70] Khan, M. M. H., Muhammad, N. S., & El-Shafie, A. (2020). Wavelet based hybrid ANN-ARIMA models for meteorological drought forecasting. Journal of Hydrology, 125380.

[71] Dong, B., Li, Z., Rahman, S. M., & Vega, R. (2016). A hybrid model approach for forecasting future residential electricity consumption. Energy and Buildings, 117, 341-351.

[72] David, M., Ramahatana, F., Trombe, P. J., & Lauret, P. (2016). Probabilistic forecasting of the solar irradiance with recursive ARMA and GARCH models. Solar Energy, 133, 55-72.

[73] Alsharif, M. H., Younes, M. K., & Kim, J. (2019). Time series ARIMA model for prediction of daily and monthly average global solar radiation: The case study of Seoul, South Korea. Symmetry, 11(2), 240.

[74] Al-Musaylh, M. S., Deo, R. C., Adamowski, J. F., & Li, Y. (2018). Short-term electricity demand forecasting with MARS, SVR and ARIMA models using aggregated demand data in Queensland, Australia. Advanced Engineering Informatics, 35, 1-16.

[75] Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159-175.

[76] Bedi, J., & Toshniwal, D. (2019). Deep learning framework to forecast electricity demand. Applied energy, 238, 1312-1326.

[77] Khashei, M., & Bijari, M. (2010). An artificial neural network (p, d, q) model for timeseries forecasting. Expert Systems with applications, 37(1), 479-489.

[78] Babu, C. N., & Reddy, B. E. (2014). A moving-average filter based hybrid ARIMA–ANN model for forecasting Time Series Data. Applied Soft Computing, 23, 27-38.

[79] Khandelwal, I., Adhikari, R., & Verma, G. (2015). Time series forecasting using hybrid ARIMA and ANN models based on DWT decomposition. Procedia Computer Science, 48, 173-179.

[80] Lee, W. J., & Hong, J. (2015). A hybrid dynamic and fuzzy time series model for mid-term power load forecasting. International Journal of Electrical Power & Energy Systems, 64, 1057-1062.

[81] Rana, M., & Koprinska, I. (2016). Forecasting electricity load with advanced wavelet neural networks. Neurocomputing, 182, 118-132.

[82] Dudek, G. (2016). Pattern-based local linear regression models for short-term load forecasting. Electric Power Systems Research, 130, 139-147.

[83] Barak, S., & Sadegh, S. S. (2016). Forecasting energy consumption using ensemble ARIMA–ANFIS hybrid algorithm. International Journal of Electrical Power & Energy Systems, 82, 92-104.

[84] Zhou, H. C., Peng, Y., & Liang, G. H. (2008). The research of monthly discharge predictor-corrector model based on wavelet decomposition. Water resources management, 22(2), 217-227.

[85] Sun, T., Zhang, T., Teng, Y., Chen, Z., & Fang, J. (2019). Monthly Electricity Consumption Forecasting Method Based on X12 and STL Decomposition Model in an Integrated Energy System. Mathematical Problems in Engineering, 2019.

[86] Nury, A. H., Hasan, K., & Alam, M. J. B. (2017). Comparative study of wavelet-ARIMA and wavelet-ANN models for temperature Time Series Data in north eastern Bangladesh. Journal of King Saud University-Science, 29(1), 47-61.

[87] Pannakkong, W., & Huynh, V. N. (2017, October). A Hybrid Model of ARIMA and ANN with Discrete Wavelet Transform for Time Series Forecasting. In International Conference on Modeling Decisions for Artificial Intelligence (pp. 159-169). Springer, Cham.

[88] Vasilakis, G. A., Theofilatos, K. A., Georgopoulos, E. F., Karathanasopoulos, A., & Likothanassis, S. D. (2013). A genetic programming approach for EUR/USD exchange rate forecasting and trading. Computational economics, 42(4), 415-431.

[89] Hajirahimi, Z., & Khashei, M. (2019). Hybrid structures in time series modeling and forecasting: A review. Engineering Applications of Artificial Intelligence, 86, 83-106.

[90] Saab, S., Badr, E., & Nasr, G. (2001). Univariate modeling and forecasting of energy consumption: the case of electricity in Lebanon. Energy, 26(1), 1-14.

[91] de Oliveira, E. M., & Oliveira, F. L. C. (2018). Forecasting mid-long term electric energy consumption through bagging ARIMA and exponential smoothing methods. Energy, 144, 776-788.

[92] Nakanishi, I., Nishiguchi, N., Itoh, Y., & Fukui, Y. (2005). On‐line signature verification based on subband decomposition by DWT and adaptive signal processing. Electronics and Communications in Japan (Part III: Fundamental Electronic Science), 88(6), 1-11.

[93] van der Meer, D. W., Shepero, M., Svensson, A., Widén, J., & Munkhammar, J. (2018). Probabilistic forecasting of electricity consumption, photovoltaic power generation and net demand of an individual building using Gaussian Processes. Applied energy, 213, 195-207.

[94] Xiao, L., Shao, W., Yu, M., Ma, J., & Jin, C. (2017). Research and application of a hybrid wavelet neural network model with the improved cuckoo search algorithm for electrical power system forecasting. Applied Energy, 198, 203-222.

[95] Tealab, A., Hefny, H., & Badr, A. (2017). Forecasting of nonlinear time series using ANN. Future Computing and Informatics Journal, 2(1), 39-47.

[96] Suganthi, L., & Samuel, A. A. (2012). Energy models for demand forecasting—A review. Renewable and sustainable energy reviews, 16(2), 1223-1240.

[97] Feinberg, E. A., & Genethliou, D. (2005). Load forecasting. In Applied mathematics for restructured electric power systems (pp. 269-285). Springer, Boston, MA.

[98] Zhang, H., Zhang, S., Wang, P., Qin, Y., & Wang, H. (2017). Forecasting of particulate matter time series using wavelet analysis and wavelet-ARMA/ARIMA model in Taiyuan, China. Journal of the Air & Waste Management Association, 67(7), 776-788.

[99] Kriechbaumer, T., Angus, A., Parsons, D., & Casado, M. R. (2014). An improved wavelet–ARIMA approach for forecasting metal prices. Resources Policy, 39, 32-41.

[100] Li, D. (2018). Transforming time series for efficient and accurate classification (Doctoral dissertation, University of Luxembourg, Luxembourg, Luxembourg).

[101] Rafiei, M., Niknam, T., Aghaei, J., Shafie-Khah, M., & Catalão, J. P. (2018). Probabilistic load forecasting using an improved wavelet neural network trained by generalized extreme learning machine. IEEE Transactions on Smart Grid, 9(6), 6961-6971.

[102] Khandelwal, I., Satija, U., & Adhikari, R. (2015, July). Efficient financial time series forecasting model using DWT decomposition. In 2015 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT) (pp. 1-5). IEEE.

[103] Shumway, R. H., & Stoffer, D. S. (2017). Time series analysis and its applications: with R examples. Springer.

[104] Taieb, S. B., Huser, R., Hyndman, R. J., & Genton, M. G. (2016). Forecasting uncertainty in electricity smart meter data by boosting additive quantile regression. IEEE Transactions on Smart Grid, 7(5), 2448-2455.

[105] Weron, R. (2007). Modeling and forecasting electricity loads and prices: A statistical approach (Vol. 403). John Wiley & Sons.

[106] Jin, J., & Kim, J. (2015). Forecasting natural gas prices using wavelets, time series, and artificial neural networks. PloS one, 10(11), e0142064.

[107] Yousefi, S., Weinreich, I., & Reinarz, D. (2005). Wavelet-based prediction of oil prices. Chaos, Solitons & Fractals, 25(2), 265-275.

[108] Tan, Z., Zhang, J., Wang, J., & Xu, J. (2010). Day-ahead electricity price forecasting using wavelet transform combined with ARIMA and GARCH models. Applied energy, 87(11), 3606-3610.

[109] Moazzami, M., Khodabakhshian, A., & Hooshmand, R. (2013). A new hybrid day-ahead peak load forecasting method for Iran’s National Grid. Applied Energy, 101, 489-501.

[110] Mellit, A., Benghanem, M., & Kalogirou, S. A. (2006). An adaptive wavelet-network model for forecasting daily total solar-radiation. Applied Energy, 83(7), 705-722.

[111] Liu, H., Tian, H. Q., Pan, D. F., & Li, Y. F. (2013). Forecasting models for wind speed using wavelet, wavelet packet, time series and Artificial Neural Networks. Applied Energy, 107, 191-208.

[112] Soltani, S. (2002). On the use of the wavelet decomposition for time series prediction. Neurocomputing, 48(1-4), 267-277.

[113] Ahmad, S., Popoola, A., & Ahmad, K. (2005). Wavelet-based multiresolution forecasting. University of Surrey, Technical Report.

[114] Shafie-Khah, M., Moghaddam, M. P., & Sheikh-El-Eslami, M. K. (2011). Price forecasting of day-ahead electricity markets using a hybrid forecast method. Energy Conversion and Management, 52(5), 2165-2169.

[115] Pindoriya, N. M., Singh, S. N., & Singh, S. K. (2008). An adaptive wavelet neural network-based energy price forecasting in electricity markets. IEEE Transactions on Power Systems, 23(3), 1423-1432.

[116] Bianchi, F. M., De Santis, E., Rizzi, A., & Sadeghian, A. (2015). Short-term electric load forecasting using echo state networks and PCA decomposition. IEEE Access, 3, 1931-1943.

[117] Gholipour Khajeh, M., Maleki, A., Rosen, M. A., & Ahmadi, M. H. (2018). Electricity price forecasting using neural networks with an improved iterative training algorithm. International Journal of Ambient Energy, 39(2), 147-158.

[118] Li, H., Li, Y., & Dong, H. (2017). A Comprehensive Learning-Based Model for Power Load Forecasting in Smart Grid. Computing and Informatics, 36(2), 470-492.

[119] Hernández, L., Baladrón, C., Aguiar, J. M., Calavia, L., Carro, B., Sánchez-Esguevillas, A., Pérez, F., Fernández, Á., & Lloret, J. (2014). Artificial neural network for short-term load forecasting in distribution systems. Energies, 7(3), 1576-1598.

[120] Hong, T., Wilson, J., & Xie, J. (2014). Long term probabilistic load forecasting and normalization with hourly information. IEEE Transactions on Smart Grid, 5(1), 456-462.

[121] Khairalla, M. A., Ning, X., AL-Jallad, N. T., & El-Faroug, M. O. (2018). Short-Term Forecasting for Energy Consumption through Stacking Heterogeneous Ensemble Learning Model. Energies, 11(6), 1-21.

[122] Khan, G. M., & Arshad, R. (2016). Electricity Peak Load Forecasting using CGP based Neuro Evolutionary Techniques. International Journal of Computational Intelligence Systems, 9(2), 376-395.
