Research Article: Electricity Theft Detection in Power Grids with Deep Learning and Random Forests

Shuan Li,1 Yinghua Han,1 Xu Yao,1 Song Yingchen,2 Jinkuan Wang,3 and Qiang Zhao2

1School of Computer and Communication Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China
2School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China
3College of Information Science and Engineering, Northeastern University, Shenyang 110819, China

Correspondence should be addressed to Yinghua Han; yhhan@neuq.edu.cn

Received 19 March 2019; Revised 30 July 2019; Accepted 14 August 2019; Published 3 October 2019

Academic Editor: Ping-Feng Pai

Copyright © 2019 Shuan Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
As one of the major factors behind nontechnical losses (NTLs) in distribution networks, electricity theft causes significant harm to power grids: it degrades power supply quality and reduces operating profits. To help utility companies address inefficient electricity inspection and irregular power consumption, this paper presents a novel hybrid convolutional neural network-random forest (CNN-RF) model for automatic electricity theft detection. In this model, a convolutional neural network (CNN) is first designed to learn the features between different hours of the day and different days from massive and varying smart meter data through convolution and downsampling operations. A dropout layer is added to reduce the risk of overfitting, and the backpropagation algorithm is applied to update the network parameters in the training phase. A random forest (RF) is then trained on the extracted features to detect whether a consumer steals electricity. To build the RF in the hybrid model, the grid search algorithm is adopted to determine the optimal parameters. Finally, experiments conducted on real energy consumption data show that the proposed detection model outperforms other methods in terms of accuracy and efficiency.
1 Introduction
The loss of energy in electricity transmission and distribution is an important problem faced by power companies all over the world. Energy losses are usually classified into technical losses (TLs) and nontechnical losses (NTLs) [1]. The TL is inherent to the transportation of electricity and is caused by internal actions in power system components such as transmission lines and transformers [2]; the NTL is defined as the difference between total losses and TLs and is primarily caused by electricity theft. In practice, electricity theft occurs mostly through physical attacks such as line tapping, meter breaking, or meter reading tampering [3]. These fraudulent behaviours cause revenue losses for power companies. For example, the losses caused by electricity theft in the United States (US) are estimated at about $4.5 billion every year [4], and utility companies worldwide are estimated to lose more than $20 billion every year to electricity theft [5]. In addition, electricity theft can also affect power system safety: the heavy loading of electrical systems caused by electricity theft may lead to fires, which threaten public safety. Therefore, accurate electricity theft detection is crucial for power grid safety and stability.
With the implementation of the advanced metering infrastructure (AMI) in smart grids, power utilities obtain massive amounts of electricity consumption data from smart meters at a high frequency, which is helpful for detecting electricity theft [6, 7]. However, every coin has two sides: the AMI network opens the door to new kinds of electricity theft attacks, which can be launched by various means such as digital tools and cyber attacks. The traditional means of electricity theft detection include manually examining unauthorized line diversions, comparing malicious meter records with benign ones, and checking problematic equipment or hardware. However, these methods are extremely time-consuming and costly when all meters in a system must be verified.
Hindawi Journal of Electrical and Computer Engineering, Volume 2019, Article ID 4136874, 12 pages. https://doi.org/10.1155/2019/4136874
Besides, these manual approaches cannot counter cyber attacks. To solve the problems mentioned above, many approaches have been put forward in the past years. These methods are mainly categorized into state-based, game-theory-based, and artificial-intelligence-based models [8].
The key idea of state-based detection [9–11] is to rely on special devices such as wireless sensors and distribution transformers [12]. These methods can detect electricity theft but depend on real-time acquisition of the system topology and additional physical measurements, which is sometimes unattainable. Game-based detection schemes formulate a game between the electricity utility and thieves; different distributions of normal and abnormal behaviours can then be derived from the game equilibrium. As detailed in [13], they can achieve a low cost and reasonable results for reducing energy theft. Yet formulating the utility function of all players (e.g., distributors, regulators, and thieves) remains a challenge. Artificial-intelligence-based methods include machine learning and deep learning methods. Existing machine learning solutions can be further categorized into classification and clustering models, as presented in [14–17]. Although the aforementioned machine learning detection methods are innovative and remarkable, their performance is still not satisfactory enough for practice. For example, most of these approaches require manual feature extraction, which partly results from their limited ability to handle high-dimensional data. Traditional hand-designed features include the mean, standard deviation, maximum, and minimum of consumption data. The process of manual feature extraction is a tedious and time-consuming task and cannot capture 2D features from smart meter data.
Deep learning techniques for electricity theft detection are studied in [18], where the authors compare different deep learning architectures such as convolutional neural networks (CNNs), long short-term memory (LSTM) recurrent neural networks (RNNs), and stacked autoencoders. However, the performance of the detectors is investigated using synthetic data, which does not allow a reliable assessment of the detectors' performance compared with shallow architectures. Moreover, the authors in [19] proposed a deep neural network- (DNN-) based customer-specific detector that can efficiently thwart such cyber attacks. In recent years, the CNN has been applied to generate useful and discriminative features from raw data and has found wide application in different areas [20–22]. These applications motivate applying the CNN to feature extraction from high-resolution smart meter data for electricity theft detection. In [23], a wide and deep convolutional neural network (CNN) model was developed and applied to analyse electricity theft in smart grids.
In a plain CNN, the softmax classifier layer is the same as a general single hidden layer feedforward neural network (SLFN) and is trained through the backpropagation algorithm [24]. On the one hand, the SLFN is likely to be overtrained when it uses the backpropagation algorithm, leading to degraded generalization performance. On the other hand, the backpropagation algorithm is based on empirical risk minimization, which is sensitive to local minima of the training error. Because of these shortcomings of the softmax classifier, the CNN is not always optimal for classification, although it has shown great advantages in feature extraction. It is therefore desirable to find a better classifier that not only matches the ability of the softmax classifier but also makes full use of the obtained features. Among common classifiers, the random forest (RF) takes advantage of two powerful machine learning techniques, bagging and random feature selection, which can overcome the limitations of the softmax classifier. Inspired by these works, a novel convolutional neural network-random forest (CNN-RF) model is adopted for electricity theft detection. The CNN is proposed to automatically capture various features of customers' consumption behaviour from smart meter data, which is one of the key factors in the success of the electricity theft detection model. To improve detection performance, the RF replaces the softmax classifier to detect the patterns of consumers based on the extracted features. This model has been trained and tested with real data from the customers of electricity utilities in Ireland and London.
2 Overview Flow
The main aim of the methodology described in this paper is to provide utilities with a ranked list of their customers according to the probability of an anomaly in their electricity meter.

As shown in Figure 1, the electricity theft detection system is divided into three main stages, as follows:
(i) Data analysis and preprocessing: to explain the rationale for applying a CNN for feature extraction, we first analyse the factors that affect the behaviour of electricity consumers. For preprocessing, we consider several tasks, such as data cleaning (resolving outliers), missing value imputation, and data transformation (normalization).
(ii) Generation of train and test datasets: to evaluate the performance of the methodology described in this paper, the preprocessed dataset is split into a train dataset and a test dataset by the cross-validation algorithm. The train dataset is used to train the parameters of our model, whilst the test dataset is used to assess how well the model generalizes to new, unseen customer samples. Given that nonfraudulent consumers remarkably outnumber electricity thieves, the imbalanced nature of the dataset can have a major negative impact on the performance of supervised machine learning methods. To reduce this bias, the synthetic minority oversampling technique (SMOTE) algorithm is used to make the numbers of electricity thieves and nonfraudulent consumers equal in the train dataset.
(iii) Classification using the CNN-RF model: in the proposed CNN-RF model, the CNN is first designed to learn the features between different hours of the day and different days from massive and varying smart meter data through convolution and downsampling operations. The RF classifier is then trained on the obtained features to detect whether a consumer steals electricity. Finally, the confusion matrix and receiver-operating characteristic (ROC) curves are used to evaluate the accuracy of the CNN-RF model on the test dataset.
3 Data Analysis and Preprocessing
The dataset was collected by Electric Ireland and the Sustainable Energy Authority of Ireland (SEAI) in January 2012 [25] and consists of electricity usage data from over 5000 residential households and businesses. The smart meters recorded electricity consumption at a resolution of 30 min during 2009 and 2010. Customers who participated in the trial had a smart meter installed in their homes and agreed to take part in the research. Therefore, it is a reasonable assumption that all samples belong to honest users. However, malicious samples cannot be obtained, since energy theft might never or rarely happen for a given customer. To address this issue, we generated 1200 malicious customers as described in [26]. The large number and variety of customers with a long period of measurements make this dataset an excellent source for research in the area of smart meter data analysis. For each customer i, there is a half-hourly meter reading over a 525-day period; we reduced the sampling rate to one reading per hour for each customer. This section first analyses the smart meter data to describe the rationale of applying a CNN for feature extraction. It then describes the three main procedures used to preprocess the original energy consumption data: data cleaning, missing value imputation, and data normalization.
3.1 Data Analysis. To introduce the rationale for applying a CNN for feature extraction rather than other machine learning techniques, this section analyses smart meter data. Figure 2 shows the daily load curves of four different users (residential and nonresidential); it can be seen that the daily electricity consumption of nonresidential users is significantly higher than that of residential users.
In addition, Figures 3 and 4 show the daily load profiles of a consumer during working days and holidays. The load profiles are highly similar but slightly shifted. In a CNN, the filter weights are uniform for different regions; thus, the features calculated in the convolutional layer are invariant to small shifts, which means that relatively stable features can be obtained from varying load profiles. As described in [27], a deep CNN first automatically extracts features from massive load profiles, and a support vector machine (SVM) then identifies the characteristics of the consumers. To further investigate the load series, we plot the four seasons' load profiles for a customer. It can be seen from Figure 5 that the electricity consumption of the consumer changes with the seasons.
In summary, load consumption differs in both magnitude and time of use and depends on lifestyle, season, weather, and many other uncontrollable factors. Consumer load profiles are therefore affected not only by weather conditions but also by the types of consumers and other factors.
Based on the above analysis, extracting features from smart meter data based on experience is difficult. However, feature extraction is a key factor in the success of the detection system. Conventionally, manual feature extraction requires elaborately designed features for a specific problem, which makes it hard to adapt to other domains. In the CNN, the successive alternating convolutional and pooling layers are designed to learn progressively higher-level features (e.g., trend indicators, sequence standard deviation, and linear slope) from 2D historical electricity consumption data. In addition, highly nonlinear correlations exist between electricity consumption and these influencing factors. Since activation functions are applied in the convolutional and fully connected layers, the CNN is able to model such highly nonlinear correlations. In this paper, the rectified linear unit (ReLU) activation function is used in the proposed CNN-RF model because of its sparsity and its ability to mitigate the vanishing gradient problem.
3.2 Data Preprocessing. x_it and x′_it are defined as the energy consumption of an honest or malicious customer i, respectively, at time interval t. In this section, the main procedures used to preprocess the raw electricity data are described as follows.
Figure 1: Flow of electricity theft detection. The original smart meter dataset passes through data analysis (nonresidential vs. residential users, working days vs. holidays, the change of seasons) and data preprocessing (data cleaning, missing value imputation, data normalization); the cross-validation algorithm splits the data into train and test datasets, and the SMOTE algorithm turns the unbalanced train dataset into a balanced one; the CNN feature extractor and RF classifier are then trained, and the model is evaluated with the confusion matrix and ROC curves.
(1) Data cleaning: raw data contain erroneous values (e.g., outliers), which correspond to peak electricity consumption activities during holidays or special occasions such as birthdays and celebrations. In this paper, the "three-sigma rule of thumb" [28] is used to restore the outliers according to the following formula:
Figure 2: Load profiles for nonresidential and residential users. Four panels (residential users 1 and 2, nonresidential users 1 and 2) plot usage (kW) against time of day (h).
Figure 3: Daily load profiles for a customer on working days (Monday through Friday); usage (kW·h) vs. time (h).
Figure 4: Daily load profiles for a customer during holidays (Saturday, Sunday, New Year, Halloween, and Christmas); usage (kW·h) vs. time (h).
$$F\left(x_{it}\right) = \begin{cases} \operatorname{avg}\left(x_{it}\right) + 2\sigma\left(x_{it}\right), & x_{it} > x_{it}^{*}, \\ x_{it}, & \text{else}, \end{cases} \tag{1}$$

where $x_{it}^{*}$ is computed from the mean $\operatorname{avg}(\cdot)$ and standard deviation $\sigma(\cdot)$ for each time interval comprising the weekday-time pair for each month.
(2) Missing value imputation: for various reasons, such as storage issues and failures of smart meters, there are missing values in the electricity consumption data. Analysis of the original data shows two kinds of missing data: one is a continuous run of multiple missing values, handled by deleting users whose number of missing values exceeds 10; the other is a single missing value, which is processed by formula (2). Missing values can thus be recovered as follows:
$$F\left(x_{it}\right) = \begin{cases} \dfrac{x_{i,t-1} + x_{i,t+1}}{2}, & x_{it} \in \text{NaN}, \\ x_{it}, & \text{else}, \end{cases} \tag{2}$$

where $x_{it}$ stands for the electricity usage of consumer $i$ over a period (e.g., an hour); if $x_{it}$ is null, we represent it as NaN.
(3) Data normalization: finally, the data need to be normalized because the neural network is sensitive to diverse data scales. A common method for this purpose is min-max normalization, computed as
$$F\left(x_{it}\right) = \frac{x_{it} - \min\left(x_{iT}\right)}{\max\left(x_{iT}\right) - \min\left(x_{iT}\right)}, \tag{3}$$

where $\min(\cdot)$ and $\max(\cdot)$ represent the minimum and maximum values over a day, respectively.
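The three preprocessing steps above can be sketched together in NumPy/pandas. This is a minimal illustration, not the authors' code: the function name is invented, and the outlier threshold x* is approximated here by the same avg + 2σ cap used as the replacement value.

```python
import numpy as np
import pandas as pd

def preprocess(df):
    """Sketch of eqs. (1)-(3): cap outliers per time interval,
    impute single missing values from temporal neighbours, and
    min-max normalize each day (row)."""
    x = df.to_numpy(dtype=float)
    # (1) three-sigma rule: readings above avg + 2*sigma (computed
    # column-wise, i.e. per time interval) are restored to the cap.
    avg = np.nanmean(x, axis=0)
    sigma = np.nanstd(x, axis=0)
    cap = avg + 2 * sigma
    x = np.where(x > cap, cap, x)          # NaN > cap is False, so NaNs survive
    # (2) single missing value: average of the two temporal neighbours
    rows, cols = np.where(np.isnan(x))
    for r, c in zip(rows, cols):
        left = x[r, c - 1] if c > 0 else x[r, c + 1]
        right = x[r, c + 1] if c < x.shape[1] - 1 else x[r, c - 1]
        x[r, c] = (left + right) / 2
    # (3) min-max normalization over each day (row)
    mn = x.min(axis=1, keepdims=True)
    mx = x.max(axis=1, keepdims=True)
    return (x - mn) / np.where(mx > mn, mx - mn, 1.0)
```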
4 Generation of Train and Test Datasets
After data processing, the SEAI dataset contains the electricity consumption data of 4737 normal electricity customers and 1200 malicious electricity customers over 525 days (with 24 hourly readings per day).
Our preliminary analysis (refer to Section 3.1) reveals that consumer load profiles are affected not only by weather conditions but also by the types of consumers and other factors. However, it is difficult to extract features based on experience from the 1D electricity consumption data, since the daily electricity consumption fluctuates in a relatively independent way. Motivated by previous work in [29], we design a deep CNN to process the electricity consumption data in a 2D manner. In particular, we transform the 1D electricity consumption data into 2D data by days: the 2D data form a matrix of actual energy consumption values for a specific customer, where the rows represent days D (D = 1, ..., 525) and the columns represent the time periods T (T = 1, ..., 24). It is worth mentioning that the CNN learns the features between different hours of the day and different days from these 2D data.
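The 1D-to-2D transformation is a plain reshape; a sketch with a synthetic hourly series (the array names are illustrative):

```python
import numpy as np

# Hypothetical 1-D hourly consumption series for one customer:
# 525 days * 24 hourly readings per day.
hourly = np.arange(525 * 24, dtype=float)

# Rows index days (D = 1, ..., 525), columns index hours of the day
# (T = 1, ..., 24), so a 3 x 3 convolution sees neighbouring hours
# *and* the same hours on neighbouring days.
matrix = hourly.reshape(525, 24)
```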
Then, we divide the dataset into train and test sets using the cross-validation method, with 80% as the train set and 20% as the test set. Given that nonfraudulent consumers remarkably outnumber electricity thieves, the imbalanced nature of the dataset can have a major negative impact on the performance of supervised machine learning methods. To reduce this bias, the SMOTE algorithm is used to make the numbers of normal and abnormal samples equal in the train set. Finally, the train dataset contains 7484 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1669 customers.
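SMOTE balances the train set by synthesizing new minority-class (theft) samples as interpolations between a real minority sample and one of its nearest minority neighbours. In practice a library implementation such as imbalanced-learn's `SMOTE` would be used; the function below is an illustrative hand-rolled sketch:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE sketch: each synthetic sample lies on the segment
    between a minority sample and one of its k nearest minority
    neighbours. Not the paper's implementation."""
    rng = rng or np.random.default_rng(0)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)          # idx[:, 0] is the sample itself
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = idx[i][rng.integers(1, k + 1)]  # pick a real neighbour, skip self
        lam = rng.random()
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.asarray(out)
```

Applied here, the theft class would be upsampled until it matches the normal class in the train set (the test set is left untouched).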
5 The Novel CNN-RF Algorithm
In this section, the design of the CNN-RF structure is presented in detail, and some techniques are proposed to reduce overfitting. Moreover, the relevant parameters, including the number of filters and the size of each feature map and kernel, are given. Finally, the training process of the proposed CNN-RF algorithm is introduced.
5.1 CNN-RF Architecture. The proposed CNN-RF model is designed by integrating CNN and RF classifiers. As shown in Figure 6, the architecture of the CNN-RF mainly includes an automatic feature extractor and a trainable RF classifier. The feature extractor CNN consists of convolutional layers, downsampling layers, and a fully connected layer. The CNN is a deep supervised learning architecture that usually includes multiple layers and can be trained with the backpropagation algorithm. It can also explore the intricate distribution in smart meter data by performing the
Figure 5: Four seasons' load profiles for a customer; usage (kW·day) vs. time (days), with curves for spring, summer, autumn, and winter.
stochastic gradient method. The RF classifier consists of a combination of tree classifiers, in which each tree contributes a single vote to the assignment of the most frequent class to the input data [30]. In the following, the design of each part of the CNN-RF structure is presented in detail.
5.1.1 Convolutional Layer. The main purpose of convolutional layers is to learn feature representations of the input data and reduce the influence of noise. The convolutional layer is composed of several feature filters, which are used to compute different feature maps. Specifically, to reduce the number of network parameters and lower the complexity of the relevant layers during the generation of feature maps, the weights of each kernel are shared within each feature map. Moreover, each neuron in the convolutional layer connects to a local region of the previous layer, and a ReLU activation function is then applied to the convolved results. Mathematically, the outputs of the convolutional layers can be expressed as
$$y_{\text{conv}}\left(X_{l}^{f_l}\right) = \delta\left(\sum_{f_l=1}^{F_l} W_{l}^{f_l} \ast X_{l}^{f_l} + b_{l}^{f_l}\right), \tag{4}$$

where $\delta$ is the activation function, $\ast$ is the convolution operation, and both $W_{l}^{f_l}$ and $b_{l}^{f_l}$ are the learnable parameters of the $f$-th feature filter.
5.1.2 Downsampling Layer. Downsampling is an important concept of the CNN and is usually placed between two convolutional layers. It reduces the number of parameters and achieves dimensionality reduction. In particular, each feature map of the pooling layer is connected to the corresponding feature map of the previous convolutional layer, and the size of the output feature maps is reduced without reducing their number. In the literature, downsampling operations mainly include max-pooling and average-pooling. The max-pooling operation transforms a small window into a single value by taking the maximum, as expressed in formula (5), whereas average-pooling returns the average of the activations in a small window. Experiments in [31] show that max-pooling performs better than average-pooling; therefore, the max-pooling operation is implemented in the downsampling layer:
$$y_{\text{pool}}\left(X_{l}^{f_l}\right) = \max_{m \in M} \left\{ X_{l}^{m} \right\}, \tag{5}$$

where $M$ is the set of activations in the pooling window and $m$ is the index of an activation in the pooling window.
5.1.3 Fully Connected Layer. With the features extracted by successive convolution and pooling, a fully connected layer is applied to flatten the feature maps into one vector as follows:
$$y_{\text{fl}}\left(X_{l}\right) = \delta\left(W_{l} \cdot X_{l} + b_{l}\right), \tag{6}$$

where $W_{l}$ is the weight of the $l$-th layer and $b_{l}$ is the bias of the $l$-th layer.
5.1.4 RF Classifier Layer. Traditionally, the softmax classifier is used as the last output layer of the CNN, but the proposed model uses the RF classifier to predict the class based on the obtained features. The RF classifier layer can therefore be defined as
$$y_{\text{out}}\left(X_{L}\right) = \operatorname{sigm}\left(W_{\text{rf}} \cdot X_{l} + b_{l}\right), \tag{7}$$

where $\operatorname{sigm}(\cdot)$ is the sigmoid function, which maps abnormal values to 0 and normal values to 1. The parameter set $W_{\text{rf}}$ of the RF layer includes the number of decision trees and the maximum depth of the trees, which are obtained by the grid search algorithm.
5.2 Techniques for Selecting Parameters and Avoiding Overfitting. The detailed parameters of the proposed CNN-RF structure are summarized in Table 1, including the number of filters in each layer, the filter sizes, and the strides. Two factors are considered in determining the CNN structure. The first is the characteristics of consumers' electricity consumption behaviour: since the load profiles are so variable, two convolutional layers with a filter size of 3 × 3 (stride 1) are applied to capture hidden features for detecting electricity theft. Moreover, because the input data size is 525 × 24, similar to what is used in image recognition problems, two max-pooling layers with a filter size of 2 × 2 (stride 2) are used. The second factor is the number of training samples: since the number of samples is limited, to reduce the risk of overfitting we design a fully connected layer with a dropout rate of 0.4, whose main idea
Figure 6: Architecture of the hybrid CNN-RF model: input layer L1 (525 × 24), convolutional layer L2 (32 maps, 525 × 24), downsampling layer L3 (32 maps, 268 × 12), convolutional layer L4 (64 maps, 263 × 12), downsampling layer L5 (64 maps, 132 × 6), fully connected layer L6 (132 × 1), and the random forest layer.
is that randomly dropping units from the neural network during training prevents units from co-adapting too much [32] and keeps any neuron from relying on the presence of other specific neurons. Applying an appropriate training method is also useful for reducing overfitting; it is essentially a regularization that adds a penalty to the weight update at each iteration. We also use binary cross entropy as the loss function. Finally, the grid search algorithm is used to optimize the RF classifier parameters, such as the maximum number of decision trees and features.
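The structure described above can be sketched in Keras (the paper builds its CNN on TensorFlow). This is a sketch under stated assumptions, not the authors' code: "same" padding and the 40-unit fully connected layer are read off the text (Section 6.2 mentions forty feature values), so the pooled map sizes differ by a few rows from the figures quoted in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(input_shape=(525, 24, 1)):
    """CNN feature extractor per Table 1: two 3x3 conv layers (32 and
    64 filters, stride 1), two 2x2 max-pooling layers (stride 2), a
    fully connected layer with dropout 0.4, and a sigmoid head used
    only to pre-train the network with binary cross entropy."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2), strides=2),
        layers.Conv2D(64, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2), strides=2),
        layers.Flatten(),
        layers.Dense(40, activation="relu", name="features"),
        layers.Dropout(0.4),
        layers.Dense(1, activation="sigmoid"),
    ])

cnn = build_cnn()
cnn.compile(optimizer="adam", loss="binary_crossentropy")
# After pre-training, the "features" activations are fed to the RF stage.
feature_extractor = models.Model(cnn.inputs, cnn.get_layer("features").output)
```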
5.3 Training Process of CNN-RF. The CNN-RF structure clearly requires training the CNN first, before the RF classifier is invoked. After that, the RF classifier is trained on the features learned by the CNN.
5.3.1 Training Process of CNN. With a batch size (the number of training samples in an iteration) of 100, the CNN is trained using the forward propagation and backpropagation algorithms. As shown in Figure 6, the data in the input layer are first passed through the convolutional layers, pooling layers, and a fully connected layer, and the predicted value is obtained. As shown in Figure 7, the backpropagation phase then updates the parameters if the difference between the output value and the target value exceeds a certain threshold. The difference between the actual output of the neural network and the expected output determines the adjustment direction and step size of the weights between layers in the network.
Given a sample set $D = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ with $m$ samples in total, the prediction $\tilde{y}$ is first obtained via the feedforward process. Over all samples, the mean difference between the actual output $y_i$ and the expected output $\tilde{y}_{w,b}(x_i)$ can be expressed as

$$J(W, b) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left\| \tilde{y}_{w,b}\left(x_i\right) - y_i \right\|^2, \tag{8}$$

where $w$ represents the connection weights between layers of the network and $b$ is the corresponding bias.
Then, each parameter is initialized with a random value drawn from a normal distribution with mean 0 and variance $\vartheta$, and the parameters are updated with the gradient descent method. The corresponding expressions are as follows:
$$W_{ij}^{(l)} = W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W, b), \tag{9}$$

$$b_{i}^{(l)} = b_{i}^{(l)} - \alpha \frac{\partial}{\partial b_{i}^{(l)}} J(W, b), \tag{10}$$

where $\alpha$ represents the learning rate, $W_{ij}^{(l)}$ represents the connection weight between the $i$-th neuron in the $l$-th layer and the $j$-th neuron in the $(l+1)$-th layer, and $b_{i}^{(l)}$ represents the bias of the $i$-th neuron in the $l$-th layer.

Finally, all parameters $W$ and $b$ in the network are updated according to formulas (9) and (10), and these steps are repeated to reduce the objective function $J(W, b)$.
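A toy numerical illustration of the update rules (9) and (10) for a single linear unit, with hypothetical random data; the actual model applies the same rule to every layer's weights via backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((8, 3))            # 8 samples, 3 inputs
y = rng.random(8)                 # targets
W = rng.normal(0.0, 0.1, 3)       # weights initialized from N(0, sigma^2)
b = 0.0
alpha = 0.05                      # learning rate

pred = X @ W + b                              # feed-forward pass
old_loss = np.mean(0.5 * (pred - y) ** 2)     # J(W, b) as in eq. (8)
grad_W = (pred - y) @ X / len(y)              # dJ/dW
grad_b = np.mean(pred - y)                    # dJ/db
W = W - alpha * grad_W                        # eq. (9)
b = b - alpha * grad_b                        # eq. (10)
new_loss = np.mean(0.5 * ((X @ W + b) - y) ** 2)
```

One step along the negative gradient lowers the objective, which is what repeating the updates exploits.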
5.3.2 Training Process of RF Classifier. The RF classifier takes advantage of two powerful machine learning techniques: bagging and random feature selection. Bagging randomly generates a bootstrap sample from the learnt features for growing each tree. Instead of using all features, the RF randomly selects a subset of features to determine the best splitting points, which ensures that the leaf nodes of the decision trees split completely into single classes. The whole process of RF classification is given as Algorithm 1.
6 Experiments and Result Analysis
All the experiments are implemented using Python 3.6 on a standard PC with an Intel Core i5-7500MQ CPU running at 3.40 GHz and with 8.0 GB of RAM. The CNN architecture is constructed with TensorFlow [33], and the interface between the CNN and the RF is programmed using scikit-learn [34].
6.1 Performance Metrics. In this paper, the problem of electricity theft detection is considered a discrete two-class classification task, which assigns each consumer to one of the predefined classes (abnormal or normal).
In particular, the result of classifier validation is often presented as a confusion matrix. Table 2 shows the confusion matrix applied in electricity theft detection, where TP, FN, FP, and TN respectively represent the numbers of consumers classified correctly as normal, falsely as abnormal, falsely as normal, and correctly as abnormal. Although these four indices contain all of the information about the performance of the classifier, more meaningful performance measures can be extracted from them, such as TPR = TP/(TP + FN), FPR = FP/(FP + TN), precision = TP/(TP + FP), recall = TP/(TP + FN), and F1 score = 2(precision × recall)/(precision + recall), where the true positive rate (TPR), also referred to as sensitivity, is the ability of the classifier to correctly identify a consumer as normal, and the false positive rate (FPR) is the risk of positively identifying an abnormal consumer. Precision is the ratio of correctly detected consumers to the total number of consumers detected as normal. Recall is the ratio of the number of correctly detected consumers to the actual number of such consumers.
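These measures follow directly from the four counts. As an illustration, using the CNN-RF counts reported in Section 6.2 (TP = 1041, FN = 28, FP = 23, TN = 577):

```python
# Confusion-matrix counts reported later for the CNN-RF model.
TP, FN, FP, TN = 1041, 28, 23, 577

TPR = TP / (TP + FN)                     # sensitivity / recall
FPR = FP / (FP + TN)                     # false positive rate
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (TP + TN) / (TP + TN + FP + FN)
print(f"TPR={TPR:.4f} FPR={FPR:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} F1={f1:.4f} accuracy={accuracy:.4f}")
```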
In addition, the ROC curve is introduced by plotting all TPR values on the y-axis against their equivalent FPR values on the x-axis for all available thresholds. A ROC curve closer to the upper left indicates better detection, since the same TPR is obtained at a lower FPR. The AUC indicates the
Table 1: Parameters of the proposed CNN-RF model.

Layer        | Size of the filter | Number of filters | Stride
Convolution1 | 3 × 3              | 32                | 1
MaxPooling1  | 2 × 2              | —                 | 2
Convolution2 | 3 × 3              | 64                | 1
MaxPooling2  | 2 × 2              | —                 | 2
quality of the classifier, and a larger AUC represents better performance.
6.2 Experiment Based on the CNN-RF Model. The experiment is conducted on the normalized consumption data. With a batch size of 100, the training procedure is stopped after 300 epochs. As shown in Figure 8, the training and testing losses settle down smoothly, which means that the proposed model learned the train dataset with only a small bias on the test dataset. The RF classifier then serves as the last layer of the CNN to predict the labels of input consumers: forty values from the fully connected layer of the trained CNN are used as a new feature vector to detect abnormal users and are fed to the RF for learning and testing. To build the RF in the hybrid model, the optimal parameters are determined by applying the grid search algorithm on the
training dataset. The grid searching range of each parameter is given as follows: numTrees ∈ [50, 60, ..., 100] and maxDepth ∈ [10, 20, 30, 40], so 6 × 4 = 24 are tried with
Algorithm 1: Random forest algorithm.

Require:
(1) Original training dataset S = {(x_i, y_i), i = 1, 2, 3, ..., m} = (X, Y) ∈ R^e × R
(2) Test samples x_j ∈ R^m
for d = 1 to Ntree do
    (1) Draw a bootstrap sample S_d from the original training data
    (2) Grow an unpruned tree h_d using data S_d:
        (a) Randomly select a new feature set M_try from the original feature set e
        (b) Select the best feature from the feature set M_try based on the Gini indicator at each node
        (c) Split until each tree grows to its maximum
end for
Ensure:
(1) The collection of trees {h_d, d = 1, 2, ..., Ntree}
(2) For x_j, the output of a decision tree is h_d(x_j)
f(x_j) = majority vote {h_d(x_j)}_{d=1}^{Ntree}
return f(x_j)
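Algorithm 1 can be illustrated compactly with NumPy. The toy implementation below (ours, using depth-1 decision stumps instead of fully grown trees to stay short) shows the three key ingredients: bootstrap sampling, random feature subsets, and Gini-based splitting with a final majority vote:

```python
import numpy as np

def gini(y):
    """Gini impurity of an integer label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_stump(X, y, feats):
    """Best (feature, threshold) over the random subset `feats`,
    chosen by weighted Gini; each side predicts its majority class."""
    best = None
    for f in feats:
        for t in np.unique(X[:, f]):
            left, right = y[X[:, f] <= t], y[X[:, f] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t,
                        np.bincount(left).argmax(), np.bincount(right).argmax())
    if best is None:  # degenerate bootstrap: fall back to a constant prediction
        m = np.bincount(y).argmax()
        return (int(feats[0]), -np.inf, m, m)
    return best[1:]

def fit_forest(X, y, n_trees=15, mtry=1, seed=0):
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(y), len(y))                # bootstrap S_d
        feats = rng.choice(X.shape[1], mtry, replace=False)  # random subset M_try
        forest.append(best_stump(X[idx], y[idx], feats))
    return forest

def predict(forest, x):
    votes = [l if x[f] <= t else r for f, t, l, r in forest]
    return int(np.bincount(votes).argmax())                  # majority vote
```

On a toy separable dataset (e.g., one feature, small values labelled 0 and large values labelled 1), the majority vote recovers the correct class for points near the training data.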
Figure 7: Process of forward propagation (a) and backpropagation (b) through the input x, the convolution stage (W_conv, b_conv), and the pooling stage (W_pool, b_pool) with respect to the cost J(W, b).
Figure 8: Loss curves of training and testing over 300 epochs.
Table 2: Confusion matrix applied in electricity theft detection.

Actual\Detected   Normal               Abnormal
Normal            TP (true positive)   FN (false negative)
Abnormal          FP (false positive)  TN (true negative)
different combinations. The best result is achieved when numTrees = 100 and maxDepth = 30. These parameters are then used to train the hybrid model.
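The exhaustive search over the two RF hyperparameters can be sketched as a plain loop; the scoring function below is a hypothetical stand-in for the cross-validated accuracy (a real run would train and validate a CNN-RF for each pair):

```python
from itertools import product

num_trees_grid = [50, 60, 70, 80, 90, 100]   # 6 candidate values
max_depth_grid = [10, 20, 30, 40]            # 4 candidate values

def cv_score(num_trees, max_depth):
    """Hypothetical stand-in: a real implementation would return the
    cross-validated accuracy of an RF with these parameters fitted on
    the CNN features of the training dataset."""
    return 0.90 + 0.0005 * num_trees - 0.0002 * abs(max_depth - 30)

grid = list(product(num_trees_grid, max_depth_grid))   # 6 x 4 = 24 pairs
best_num_trees, best_max_depth = max(grid, key=lambda p: cv_score(*p))
```

The stub is shaped so that the pair reported in the paper (numTrees = 100, maxDepth = 30) wins here as well.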
The classification results of the detection model proposed in this paper are as follows: the numbers of TPs, FNs, FPs, and TNs are 1041, 28, 23, and 577, respectively. Therefore, the precision, recall, and F1 score can be calculated as shown in Table 3, where Class 0 is an abnormal pattern and Class 1 is a normal pattern. Also, the AUC value of the CNN-RF algorithm is calculated, which is 0.98 and far better than the value of the baseline model (AUC = 0.5). This means the proposed algorithm is able to classify both classes accurately, as shown in Figure 9.
6.3. Comparative Experiments and the Analysis of Results. To evaluate the accuracy of the CNN-RF model, comparative experiments are conducted based on handcrafted features with nondeep learning methods, including SVM, RF, gradient boosting decision tree (GBDT), and logistic regression (LR). Moreover, we compared the classification results obtained by various supervised classifiers, CNN features with the SVM classifier (CNN-SVM) and CNN features with the GBDT classifier (CNN-GBDT), to those obtained by the previous classification work. In the following, the five methods are introduced respectively, and then the results are analysed.
(1) Logistic regression: it is a basic model in binary classification that is equivalent to one layer of neural network with a sigmoid activation function. The sigmoid function obtains a value ranging from 0 to 1 based on linear regression. Then any value larger than 0.5 will be classified as a normal pattern, and any value less than 0.5 will be classified as an abnormal pattern.
(2) Support vector machine: this classifier with kernel functions can transform a nonlinearly separable problem into a linearly separable problem by projecting data into the feature space and then finding the optimal separating hyperplane. It is applied to predict the patterns of consumers based on the aforementioned handcrafted features.
(3) Random forest: it is essentially an integration of multiple decision trees that can achieve better performance while maintaining effective control of overfitting in comparison with a single decision tree. The RF classifier can also handle high-dimensional data while maintaining high computational efficiency.
(4) Gradient boosting decision tree: it is an iterative decision tree algorithm which consists of multiple decision trees. For the final output, all results or weights are accumulated in the GBDT, while random forests use majority voting.
(5) Deep learning methods: the plain CNN with a softmax output layer, CNN-GBDT, and CNN-SVM are evaluated in comparison with the proposed method.
Table 4 summarizes the parameters of the comparative methods. Figure 10 shows the results of the different methods and the proposed hybrid CNN-RF method, and it can be seen that the AUC values of CNN-RF, CNN-GBDT, CNN-SVM, CNN, SVM, RF, LR, and GBDT are 0.99, 0.97, 0.98, 0.93, 0.77, 0.91, 0.63, and 0.77, respectively. Figure 11 presents the results of all comparative experiments in terms of precision, recall, and F1 score. Among the performances of the eight different electricity theft detection algorithms, the deep learning methods (such as CNN, CNN-RF, CNN-SVM, and CNN-GBDT) show better performance than the machine learning methods (e.g., LR, SVM, GBDT, and RF). The reason for this result is that the CNN can learn features from a large amount of electricity consumption data; thus, the accuracy of electricity theft detection is greatly improved. In addition, the proposed CNN-RF model shows the best performance compared with the other methods.
Table 3: Classification score summary for CNN-RF.

Class          Precision   Recall   F1 score
Class 0        0.97        0.96     0.96
Class 1        0.98        0.98     0.98
Average/total  0.97        0.97     0.97
Figure 9: ROC curve (TPR vs. FPR) for CNN-RF; AUC (CNN-RF) = 0.99.
Table 4: Experimental parameters for comparative methods.

Methods   Input data      Parameters
LR        Features (1D)   Penalty: L2; inverse of regularization strength: 1.0
SVM       Features (1D)   Penalty parameter of the error term: 10^5; parameter of the kernel function (RBF): 0.0005
GBDT      Features (1D)   Number of estimators: 200
RF        Features (1D)   Number of trees in the forest: 100; max depth of each tree: 30
CNN       Raw data (2D)   The same as the CNN component of the proposed approach
As detecting electricity theft consumers is achieved by discovering the anomalous consumption behaviours of suspicious customers, we can also regard it essentially as an anomaly detection problem. Since CNN-RF has shown superior performance on the SEAI dataset, it would be interesting to test its generalization ability on other datasets. In particular, we have further tested the performance of CNN-RF together with the above-mentioned models on another dataset, the Low Carbon London (LCL) dataset [35]. The LCL dataset is composed of over 5500 honest electricity consumers in total, each of which consists of 525 days of measurements (the sampling rate is one per hour). Among the 5500 customers, we randomly chose 1200 as malicious customers and modified their intraday load profiles according to fraudulent sample generation methods similar to those used for the SEAI dataset.

After the preprocessing of the LCL dataset (similar to the SEAI dataset), the train dataset contains 7584 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1700 customers together with corresponding labels indicating anomaly or not. Then the training of the proposed CNN-RF model and the comparative models is all done following the aforementioned procedure
Figure 11: Comparison with other algorithms based on the SEAI dataset (precision, recall, and F1 score).
Figure 10: ROC curves for eight different algorithms based on the SEAI dataset. AUC (CNN) = 0.93, AUC (CNN-RF) = 0.99, AUC (CNN-GBDT) = 0.97, AUC (CNN-SVM) = 0.98, AUC (GBDT) = 0.77, AUC (LR) = 0.63, AUC (RF) = 0.91, AUC (SVM) = 0.77.
Figure 12: ROC curves for eight different algorithms based on the LCL dataset. AUC (CNN) = 0.95, AUC (CNN-RF) = 0.97, AUC (CNN-GBDT) = 0.94, AUC (CNN-SVM) = 0.96, AUC (GBDT) = 0.74, AUC (LR) = 0.68, AUC (RF) = 0.77, AUC (SVM) = 0.71.
Figure 13: Comparison with other algorithms based on the LCL dataset (precision, recall, and F1 score).
based on the train dataset. The performances of all the models on the test dataset are shown in Figures 12 and 13.

Figures 12 and 13 show that the performances of CNN-SVM, CNN-GBDT, and CNN-RF are better than those of the other models on the whole. Since they are deep learning models, distinguishable features can be learned directly from raw smart meter data by the CNN, which has equipped them with better generalization ability. In addition, the CNN-RF model has achieved the best performance, since the RF can handle high-dimensional data while maintaining high computational efficiency.
7. Conclusions
In this paper, a novel CNN-RF model is presented to detect electricity theft. In this model, the CNN acts as an automatic feature extractor for smart meter data, and the RF is the output classifier. Because a large number of parameters must be optimized, which increases the risk of overfitting, a fully connected layer with a dropout rate of 0.4 is designed during the training phase. In addition, the SMOT algorithm is adopted to overcome the problem of data imbalance. Some machine learning and deep learning methods, such as SVM, RF, GBDT, and LR, are applied to the same problem as benchmarks, and all these methods have been evaluated on the SEAI and LCL datasets. The results indicate that the proposed CNN-RF model is quite a promising classification method in the electricity theft detection field because of two properties. The first is that features can be automatically extracted by the hybrid model, while the success of most other traditional classifiers relies largely on the retrieval of good hand-designed features, which is a laborious and time-consuming task. The second lies in that the hybrid model combines the advantages of the RF and CNN, as both are among the most popular and successful classifiers in the electricity theft detection field.
Since the detection of electricity theft affects the privacy of consumers, future work will focus on investigating how the granularity and duration of smart meter data might affect this privacy. Extending the proposed hybrid CNN-RF model to other applications (e.g., load forecasting) is also a task worth investigating.
Data Availability
Previously reported SEAI and LCL datasets were used to support this study and are available at http://www.ucd.ie/issda/data and https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7857, respectively. These prior studies (and datasets) are cited at relevant places within the text as references [19, 26].
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Acknowledgments
This work was supported by the National Key Research and Development Program of China (2016YFB0901900), the Natural Science Foundation of Hebei Province of China (F2017501107), the Open Research Fund from the State Key Laboratory of Rolling and Automation, Northeastern University (2017RALKFKT003), the Foundation of Northeastern University at Qinhuangdao (XNB201803), and the Fundamental Research Funds for the Central Universities (N182303037).
References
[1] S. S. S. R. Depuru, L. Wang, and V. Devabhaktuni, "Electricity theft: overview, issues, prevention and a smart meter based approach to control theft," Energy Policy, vol. 39, no. 2, pp. 1007–1015, 2011.
[2] J. P. Navani, N. K. Sharma, and S. Sapra, "Technical and non-technical losses in power system and its economic consequence in Indian economy," International Journal of Electronics and Computer Science Engineering, vol. 1, no. 2, pp. 757–761, 2012.
[3] S. McLaughlin, B. Holbert, A. Fawaz, R. Berthier, and S. Zonouz, "A multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[4] P. McDaniel and S. McLaughlin, "Security and privacy challenges in the smart grid," IEEE Security & Privacy Magazine, vol. 7, no. 3, pp. 75–77, 2009.
[5] T. B. Smith, "Electricity theft: a comparative analysis," Energy Policy, vol. 32, no. 1, pp. 2067–2076, 2004.
[6] J. I. Guerrero, C. Leon, I. Monedero, F. Biscarri, and J. Biscarri, "Improving knowledge-based systems with statistical techniques, text mining, and neural networks for non-technical loss detection," Knowledge-Based Systems, vol. 71, no. 4, pp. 376–388, 2014.
[7] C. C. O. Ramos, A. N. Souza, G. Chiachia, A. X. Falcão, and J. P. Papa, "A novel algorithm for feature selection using harmony search and its application for non-technical losses detection," Computers & Electrical Engineering, vol. 37, no. 6, pp. 886–894, 2011.
[8] P. Glauner, J. A. Meira, P. Valtchev, R. State, and F. Bettinger, "The challenge of non-technical loss detection using artificial intelligence: a survey," International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 760–775, 2017.
[9] S.-C. Huang, Y.-L. Lo, and C.-N. Lu, "Non-technical loss detection using state estimation and analysis of variance," IEEE Transactions on Power Systems, vol. 28, no. 3, pp. 2959–2966, 2013.
[10] O. Rahmati, H. R. Pourghasemi, and A. M. Melesse, "Application of GIS-based data driven random forest and maximum entropy models for groundwater potential mapping: a case study at Mehran region, Iran," CATENA, vol. 137, pp. 360–372, 2016.
[11] N. Edison, A. C. Aranha, and J. Coelho, "Probabilistic methodology for technical and non-technical losses estimation in distribution system," Electric Power Systems Research, vol. 97, no. 11, pp. 93–99, 2013.
[12] J. B. Leite and J. R. S. Mantovani, "Detecting and locating non-technical losses in modern distribution networks," IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 1023–1032, 2018.
[13] S. Amin, G. A. Schwartz, A. A. Cardenas, and S. S. Sastry, "Game-theoretic models of electricity theft detection in smart utility networks: providing new capabilities with advanced metering infrastructure," IEEE Control Systems Magazine, vol. 35, no. 1, pp. 66–81, 2015.
[14] A. H. Nizar, Z. Y. Dong, Y. Wang, and A. N. Souza, "Power utility nontechnical loss analysis with extreme learning machine method," IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 946–955, 2008.
[15] J. Nagi, K. S. Yap, S. K. Tiong, S. K. Ahmed, and M. Mohamad, "Nontechnical loss detection for metered customers in power utility using support vector machines," IEEE Transactions on Power Delivery, vol. 25, no. 2, pp. 1162–1171, 2010.
[16] E. W. S. Angelos, O. R. Saavedra, O. A. C. Cortes, and A. N. de Souza, "Detection and identification of abnormalities in customer consumptions in power distribution systems," IEEE Transactions on Power Delivery, vol. 26, no. 4, pp. 2436–2442, 2011.
[17] K. A. P. Costa, L. A. M. Pereira, R. Y. M. Nakamura, C. R. Pereira, J. P. Papa, and A. Xavier Falcão, "A nature-inspired approach to speed up optimum-path forest clustering and its application to intrusion detection in computer networks," Information Sciences, vol. 294, pp. 95–108, 2015.
[18] R. R. Bhat, R. D. Trevizan, X. Li, and A. Bretas, "Identifying nontechnical power loss via spatial and temporal deep learning," in Proceedings of the IEEE International Conference on Machine Learning and Applications, Anaheim, CA, USA, December 2016.
[19] M. Ismail, M. Shahin, M. F. Shaaban, E. Serpedin, and K. Qaraqe, "Efficient detection of electricity theft cyber attacks in AMI networks," in Proceedings of the IEEE Wireless Communications and Networking Conference, Barcelona, Spain, April 2018.
[20] R. Mehrizi, X. Peng, X. Xu, S. Zhang, D. Metaxas, and K. Li, "A computer vision based method for 3D posture estimation of symmetrical lifting," Journal of Biomechanics, vol. 69, no. 1, pp. 40–46, 2018.
[21] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: surpassing human-level performance on ImageNet classification," in Proceedings of the IEEE Conference on Computer for Sustainable Global Development, pp. 1026–1034, New Delhi, India, March 2015.
[22] A. Sharif Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, "CNN features off-the-shelf: an astounding baseline for recognition," Astrophysical Journal Supplement Series, vol. 221, no. 1, pp. 806–813, 2014.
[23] Z. Zheng, Y. Yang, X. Niu, H.-N. Dai, and Y. Zhou, "Wide and deep convolutional neural networks for electricity-theft detection to secure smart grids," IEEE Transactions on Industrial Informatics, vol. 14, no. 4, pp. 1606–1615, 2018.
[24] D. Svozil, V. Kvasnicka, and J. Pospichal, "Introduction to multi-layer feed-forward neural networks," Chemometrics and Intelligent Laboratory Systems, vol. 39, no. 1, pp. 43–62, 1997.
[25] Irish Social Science Data Archive, http://www.ucd.ie/issda/data/commissionforenergyregulationcer.
[26] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[27] Y. Wang, Q. Chen, D. Gan, J. Yang, D. S. Kirschen, and C. Kang, "Deep learning-based socio-demographic information identification from smart meter data," IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2593–2602, 2019.
[28] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: a survey," ACM Computing Surveys, vol. 41, no. 3, 2009.
[29] H.-T. Cheng, L. Koc, J. Harmsen et al., "Wide & deep learning for recommender systems," in Proceedings of the 1st Workshop on Deep Learning for Recommender Systems—DLRS 2016, pp. 7–10, Boston, MA, USA, September 2016.
[30] E. Feczko, N. M. Balba, O. Miranda-Dominguez et al., "Subtyping cognitive profiles in autism spectrum disorder using a functional random forest algorithm," NeuroImage, vol. 172, pp. 674–688, 2018.
[31] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A theoretical analysis of feature pooling in visual recognition," in Proceedings of the 27th International Conference on Machine Learning, pp. 111–118, Haifa, Israel, June 2010.
[32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
[33] M. Abadi, A. Agarwal, P. Barham et al., "TensorFlow: large-scale machine learning on heterogeneous distributed systems," 2016, https://arxiv.org/abs/1603.04467.
[34] F. Pedregosa, G. Varoquaux, A. Gramfort et al., "Scikit-learn: machine learning in Python," Journal of Machine Learning Research, vol. 12, no. 4, pp. 2825–2830, 2011.
[35] J. R. Schofield, "Low Carbon London project: data from the dynamic time-of-use electricity pricing trial," 2017, https://discover.ukdataservice.ac.uk/catalogue?sn=7857.
Besides, these manual approaches cannot avoid cyber attacks. In order to solve the problems mentioned above, many approaches have been put forward in the past years. These methods are mainly categorized into state-based, game-theory-based, and artificial-intelligence-based models [8].
The key idea of state-based detection [9–11] is to rely on special devices such as wireless sensors and distribution transformers [12]. These methods can detect electricity theft but rely on the real-time acquisition of system topology and additional physical measurements, which is sometimes unattainable. Game-based detection schemes formulate a game between the electricity utility and the thief, and then different distributions of normal and abnormal behaviours can be derived from the game equilibrium. As detailed in [13], they can achieve a low cost and reasonable result for reducing energy theft. Yet formulating the utility function of all players (e.g., distributors, regulators, and thieves) is still a challenge. Artificial-intelligence-based methods include machine learning and deep learning methods. Existing machine learning solutions can be further categorized into classification and clustering models, as presented in [14–17]. Although the aforementioned machine learning detection methods are innovative and remarkable, their performances are still not satisfactory enough for practice. For example, most of these approaches require manual feature extraction, which partly results from their limited ability to handle high-dimensional data. Indeed, traditional hand-designed features include the mean, standard deviation, maximum, and minimum of consumption data. The process of manual feature extraction is a tedious and time-consuming task and cannot capture the 2D features from smart meter data.
Deep learning techniques for electricity theft detection are studied in [18], where the authors present a comparison between different deep learning architectures such as convolutional neural networks (CNNs), long short-term memory (LSTM) recurrent neural networks (RNNs), and stacked autoencoders. However, the performance of the detectors is investigated using synthetic data, which does not allow a reliable assessment of the detectors' performance compared with shallow architectures. Moreover, the authors in [19] proposed a deep neural network- (DNN-) based customer-specific detector that can efficiently thwart such cyber attacks. In recent years, the CNN has been applied to generate useful and discriminative features from raw data and has wide applications in different areas [20–22]. These applications motivate applying the CNN for feature extraction from high-resolution smart meter data in electricity theft detection. In [23], a wide and deep convolutional neural network (CNN) model was developed and applied to analyse electricity theft in smart grids.
In a plain CNN, the softmax classifier layer is the same as a general single hidden layer feedforward neural network (SLFN) and is trained through the backpropagation algorithm [24]. On the one hand, the SLFN is likely to be overtrained when it performs the backpropagation algorithm, leading to degradation of its generalization performance. On the other hand, the backpropagation algorithm is based on empirical risk minimization, which is sensitive to local minima of training errors. As mentioned above, because of the shortcomings of the softmax classifier, the CNN is not always optimal for classification, although it has shown great advantages in the feature extraction process. Therefore, it is desirable to find a better classifier which not only owns a similar ability to the softmax classifier but also can make full use of the obtained features. Among most classifiers, the random forest (RF) classifier takes advantage of two powerful machine learning techniques, bagging and random feature selection, which can overcome the limitations of the softmax classifier. Inspired by these particular works, a novel convolutional neural network-random forest (CNN-RF) model is adopted for electricity theft detection. The CNN is proposed to automatically capture various features of customers' consumption behaviours from smart meter data, which is one of the key factors in the success of the electricity theft detection model. To improve detection performance, the RF is used to replace the softmax classifier, detecting the patterns of consumers based on the extracted features. This model has been trained and tested with real data from customers of electricity utilities in Ireland and London.
2. Overview Flow
The main aim of the methodology described in this paper is to provide the utilities with a ranked list of their customers according to their probability of having an anomaly in their electricity meter.
As shown in Figure 1, the electricity theft detection system is divided into three main stages as follows:
(i) Data analysis and preprocess: to explain the reason for applying a CNN for feature extraction, we first analyse the factors that affect the behaviours of electricity consumers. For the data preprocess, we consider several tasks such as data cleaning (resolving outliers), missing value imputation, and data transformation (normalization).
(ii) Generation of train and test datasets: to evaluate the performance of the methodology described in this paper, the preprocessed dataset is split into a train dataset and a test dataset by the cross-validation algorithm. The train dataset is used to train the parameters of our model, whilst the test dataset is used to assess how well the model generalizes to new, unseen customer samples. Given that nonfraudulent consumers remarkably outnumber electricity theft ones, the imbalanced nature of the dataset can have a major negative impact on the performance of supervised machine learning methods. To reduce this bias, the synthetic minority oversampling technique (SMOT) algorithm is used to make the numbers of electricity theft and nonfraudulent consumers equal in the train dataset.
(iii) Classification using the CNN-RF model: in the proposed CNN-RF model, the CNN is first designed to learn the features between different hours of the day and different days from massive and varying smart meter data by the operations of convolution and downsampling, and then the RF classifier is trained based on the obtained features to detect whether the consumer steals electricity. Finally, the confusion matrix and receiver-operating characteristic (ROC) curves are used to evaluate the accuracy of the CNN-RF model on the test dataset.
3. Data Analysis and Preprocess
The dataset was collected by Electric Ireland and the Sustainable Energy Authority of Ireland (SEAI) in January 2012 [25] and consists of electricity usage data from over 5000 residential households and businesses. The smart meters recorded electricity consumption at a resolution of 30 min during 2009 and 2010. Customers who participated in the trial had a smart meter installed in their homes and agreed to take part in the research; therefore, it is a reasonable assumption that all samples belong to honest users. However, malicious samples cannot be obtained, since energy theft might never or rarely happen for a given customer. To address this issue, we generated 1200 malicious customers as described in [26]. The large number and variety of customers with a long period of measurements make this dataset an excellent source for research in the area of smart meter data analysis. For each customer i, there is a half-hourly meter reading for a 525-day period. We reduced the sampling rate to one per hour for each customer. This section first analyses smart meter data to describe the rationale of applying a CNN for feature extraction. Then it describes the three main procedures utilized for preprocessing the original energy consumption data: data cleaning, missing value imputation, and data normalization.
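The downsampling from 30 min readings to hourly values can be done by summing consecutive pairs of readings; aggregating by sum preserves total energy, while averaging would be the alternative (the paper does not state which was used, so summation here is an assumption):

```python
import numpy as np

# Hypothetical half-hourly energy readings for one day: 48 values of 0.25 kWh.
half_hourly = np.full(48, 0.25)

# Pair up consecutive half-hours and sum them -> 24 hourly values.
hourly = half_hourly.reshape(24, 2).sum(axis=1)
```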
3.1. Data Analysis. To introduce the rationale for applying a CNN for feature extraction rather than other machine learning techniques, this section analyses smart meter data. Figure 2 shows the daily load curves of four different users (residential and nonresidential), and it can be seen that the daily electricity consumption of nonresidential users is significantly higher than that of residential users.
In addition, Figures 3 and 4 show the daily load profiles of a consumer during working days and holidays. The load profiles are highly similar but slightly shifted. In a CNN, the filter weights are uniform for different regions; thus, the features calculated in the convolutional layer are invariant to small shifts, which means that relatively stable features can be obtained from varying load profiles. As described in [27], a deep CNN first automatically extracts features from massive load profiles, and a support vector machine (SVM) then identifies the characteristics of the consumers. To further investigate the load series, we plot the four seasons' load profiles for a customer. It can be seen from Figure 5 that the electricity consumption of the consumer changes with the seasons.
In summary, the load consumption differs in both magnitude and time of use and is dependent on lifestyle, seasons, weather, and many other uncontrollable factors. Therefore, consumer load profiles are affected not only by weather conditions but also by the types of consumers and other factors.
Based on the above analysis, extracting features from smart meter data based on experience is difficult. However, feature extraction is a key factor in the success of the detection system. Conventionally, manual feature extraction requires elaborately designed features for a specific problem, which makes it hard to adapt to other domains. In the CNN, the successive alternating convolutional and pooling layers are designed to learn progressively higher-level features (e.g., trend indicators, sequence standard deviation, and linear slope) from 2D historical electricity consumption data. In addition, highly nonlinear correlations exist between electricity consumption and these influencing factors. Since an activation function is applied on the convolutional and fully connected layers, the CNN is able to model highly nonlinear correlations. In this paper, the activation function named "rectified linear unit" (ReLU) is used in the proposed CNN-RF model because of its sparsity and its mitigation of the vanishing gradient problem.
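The two operations the CNN stacks, convolution with ReLU and max pooling, can be illustrated on a small matrix with plain NumPy (a sketch of the operations only, not of the trained network):

```python
import numpy as np

def conv2d_relu(x, k):
    """Single-channel 'valid' convolution (cross-correlation, as in CNNs)
    followed by the ReLU activation max(0, .)."""
    H, W = x.shape
    h, w = k.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + h, j:j + w] * k)
    return np.maximum(out, 0.0)

def max_pool(x, s=2):
    """s x s max pooling with stride s (the downsampling step)."""
    H, W = x.shape
    x = x[:H - H % s, :W - W % s]                 # drop any ragged border
    return x.reshape(H // s, s, W // s, s).max(axis=(1, 3))
```

Applied to a (days × hours) matrix, the convolution responds to local patterns across adjacent hours and days, and the pooling halves the resolution while keeping the strongest responses.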
3.2. Data Preprocess. x_{i,t} and x′_{i,t} are defined as the energy consumption for an honest or malicious customer i at time interval t, respectively. In this section, the main procedures utilized for preprocessing raw electricity data are described as follows.
Figure 1: Flow of electricity theft detection (data analysis and preprocess, generation of train and test datasets, and classification and evaluation with the CNN-RF model).
(1) Data cleaning: there are erroneous values (e.g., outliers) in the raw data, which correspond to peak electricity consumption activities during holidays or special occasions such as birthdays and celebrations. In this paper, the "three-sigma rule of thumb" [28] is used to restore the outliers according to the following formula:
Figure 2: Load profiles for nonresidential and residential users.
Figure 3: Daily load profiles for a customer in working days.
Figure 4: Daily load profiles for a customer during holidays.
F(x_{i,t}) =
\begin{cases}
\mathrm{avg}(x_{i,t}) + 2\sigma(x_{i,t}), & x_{i,t} > x^{*}_{i,t} \\
x_{i,t}, & \text{else}
\end{cases}
\quad (1)

where x^{*}_{i,t} is computed by the mean avg(·) and standard deviation σ(·) for each time interval comprising the weekday/time pair for each month.
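Formula (1) can be sketched in NumPy; for brevity this version computes a single threshold over the whole series rather than per weekday/time pair per month, and (as an assumption consistent with (1)) uses avg + 2σ itself as the cap x*:

```python
import numpy as np

def restore_outliers(x):
    """Replace readings above avg(x) + 2*sigma(x) with that cap,
    following formula (1); the per-slot grouping is omitted here."""
    cap = x.mean() + 2 * x.std()
    return np.where(x > cap, cap, x)
```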
(2) Missing value imputation: because of various reasons such as storage issues and failure of smart meters, there are missing values in the electricity consumption data. Through the analysis of the original data, it is found that there are two kinds of missing data: one is the continuous missing of multiple values, and the solution is to delete the users when the number of missing values exceeds 10; the other is a missing single value, which is processed by formula (2). Thus, missing values can be recovered as follows:
F(x_{i,t}) =
\begin{cases}
\dfrac{x_{i,t-1} + x_{i,t+1}}{2}, & x_{i,t} \in \mathrm{NaN} \\
x_{i,t}, & \text{else}
\end{cases}
\quad (2)

where x_{i,t} stands for the electricity usage of consumer i over a period (e.g., an hour); if x_{i,t} is null, we represent it as NaN.
(3) Data normalization: finally, the data need to be normalized because the neural network is sensitive to diverse data scales. One of the common methods for this purpose is min-max normalization, which is computed as
F(x_{it}) = \frac{x_{it} - \min(x_{iT})}{\max(x_{iT}) - \min(x_{iT})},
\qquad (3)

where min(·) and max(·) return the minimum and maximum values over a day, respectively.
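Steps (2) and (3) can likewise be sketched in NumPy (function names are hypothetical; a real pipeline would iterate over all customers and days):

```python
import numpy as np

def impute_single_gaps(day):
    """Fill an isolated NaN reading with the mean of its two
    neighbouring hours, per formula (2)."""
    day = np.asarray(day, dtype=float)
    for t in range(1, len(day) - 1):
        if np.isnan(day[t]) and not np.isnan(day[t - 1]) and not np.isnan(day[t + 1]):
            day[t] = (day[t - 1] + day[t + 1]) / 2
    return day

def minmax_normalize(day):
    """Scale one day's readings into [0, 1], per formula (3)."""
    lo, hi = day.min(), day.max()
    return (day - lo) / (hi - lo) if hi > lo else np.zeros_like(day)
```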
4 Generation of Train and Test Datasets
After data processing, the SEAI dataset contains the electricity consumption data of 4737 normal electricity customers and 1200 malicious electricity customers over 525 days (with 24 hourly readings per day).
Our preliminary analytical results (refer to Section 3.1) reveal that consumer load profiles are affected not only by weather conditions but also by the types of consumers and other factors. However, it is difficult to extract features based on experience from the 1D electricity consumption data, since the consumption on each day fluctuates in a relatively independent way. Motivated by the previous work in [29], we design a deep CNN to process the electricity consumption data in a 2D manner. In particular, we transform the 1D electricity consumption data into 2D data by days. The 2D data form a matrix of actual energy consumption values for a specific customer, where the rows represent days D (D = 1, ..., 525) and the columns represent time periods T (T = 1, ..., 24). It is worth mentioning that the CNN learns the features both between different hours of the day and between different days from the 2D data.
Then we divide the dataset into train and test sets using the cross-validation method, in which 80% form the train set and 20% form the test set. Given that normal consumers remarkably outnumber fraudulent ones, the imbalanced nature of the dataset can have a major negative impact on the performance of supervised machine learning methods. To reduce this bias, the SMOTE algorithm is used to make the numbers of normal and abnormal samples equal in the train set. Finally, the train dataset contains 7484 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1669 customers.
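The paper does not name its SMOTE implementation; as a minimal sketch of the idea (interpolating between a minority sample and one of its nearest minority neighbours), assuming a hypothetical helper name:

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=np.random.default_rng(0)):
    """Minimal SMOTE sketch: synthesize n_new minority-class samples by
    interpolating between a random minority sample and one of its k
    nearest minority neighbours."""
    X_min = np.asarray(X_min, dtype=float)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Distances from sample i to all other minority samples.
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]      # skip the sample itself
        j = rng.choice(neighbours)
        gap = rng.random()                        # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.vstack(synthetic)
```

Each synthetic sample lies on the line segment between two real minority samples, which is the core of the SMOTE technique.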
5 The Novel CNN-RF Algorithm
In this section, the design of the CNN-RF structure is presented in detail, and some techniques are proposed to reduce overfitting. Moreover, correlative parameters, including the number of filters and the sizes of each feature map and kernel, are given. Finally, the training process of the proposed CNN-RF algorithm is introduced.
5.1. CNN-RF Architecture. The proposed CNN-RF model is designed by integrating CNN and RF classifiers. As shown in Figure 6, the architecture of the CNN-RF mainly includes an automatic feature extractor and a trainable RF classifier. The feature extractor CNN consists of convolutional layers, downsampling layers, and a fully connected layer. The CNN is a deep supervised learning architecture, which usually includes multiple layers and can be trained with the backpropagation algorithm; it can also explore the intricate distribution in smart meter data by the stochastic gradient method. The RF classifier consists of a combination of tree classifiers, in which each tree contributes a single vote for the assignment of the most frequent class to the input data [30]. In the following, the design of each part of the CNN-RF structure is presented in detail.

Figure 5: Four seasons' load profiles for a customer (curves: spring, summer, autumn, winter; x-axis: time (days); y-axis: usage (kW·day)).
5.1.1. Convolutional Layer. The main purpose of convolutional layers is to learn feature representations of the input data and reduce the influence of noise. The convolutional layer is composed of several feature filters, which are used to compute different feature maps. Specially, to reduce the number of network parameters and lower the complexity of the relevant layers during the generation of feature maps, the weights of each kernel in a feature map are shared. Moreover, each neuron in the convolutional layer connects to a local region of the previous layer, and then a ReLU activation function is applied to the convolved results. Mathematically, the outputs of the convolutional layers can be expressed as
y_{\mathrm{conv}}\left(X_{l}^{f_l}\right) = \delta\left(\sum_{f_l=1}^{F_l} W_{l}^{f_l} \ast X_{l}^{f_l} + b_{l}^{f_l}\right),
\qquad (4)

where δ is the activation function, ∗ is the convolution operation, and both W_{l}^{f_l} and b_{l}^{f_l} are the learnable parameters of the f-th feature filter in the l-th layer.
5.1.2. Downsampling Layer. Downsampling is an important concept of the CNN, and a downsampling layer is usually placed between two convolutional layers. It reduces the number of parameters and achieves dimensionality reduction. In particular, each feature map of the pooling layer is connected to the corresponding feature map of the previous convolutional layer, and the size of the output feature maps is reduced without reducing their number. In the literature, downsampling operations mainly include max-pooling and average-pooling. The max-pooling operation transforms small windows into single values by taking the maximum, as expressed in formula (5), while average-pooling returns the average value of the activations in a small window. In [31], experiments show that max-pooling performs better than average-pooling; therefore, the max-pooling operation is usually implemented in the downsampling layer:
y_{\mathrm{pool}}\left(X_{l}^{f_l}\right) = \max_{m \in M} \left\{ X_{l,m} \right\},
\qquad (5)

where M is the set of activations in the pooling window and m is the index of an activation in the pooling window.
5.1.3. Fully Connected Layer. With the features extracted by sequential convolution and pooling, a fully connected layer is applied to flatten the feature maps into one vector, as follows:
y_{\mathrm{fl}}(X_l) = \delta(W_l \cdot X_l + b_l),
\qquad (6)

where W_l is the weight and b_l the bias of the l-th layer.
5.1.4. RF Classifier Layer. Traditionally, the softmax classifier is used for the last output layer of a CNN, but the proposed model uses the RF classifier to predict the class based on the obtained features. Therefore, the RF classifier layer can be defined as
y_{\mathrm{out}}(X_L) = \operatorname{sigm}(W_{\mathrm{rf}} \cdot X_L + b_L),
\qquad (7)

where sigm(·) is the sigmoid function, which maps abnormal values to 0 and normal values to 1. The parameter set W_rf of the RF layer includes the number of decision trees and the maximum depth of each tree, which are obtained by the grid search algorithm.
5.2. Techniques for Selecting Parameters and Avoiding Overfitting. The detailed parameters of the proposed CNN-RF structure are summarized in Table 1, including the number of filters in each layer, the filter sizes, and the strides. Two factors are considered in determining the CNN structure. The first factor is the characteristics of consumers' electricity consumption behaviours: since the load profiles are highly variable, two convolutional layers with a filter size of 3 × 3 (stride 1) are applied to capture hidden features for detecting electricity theft. Moreover, because the input dimension of 525 × 24 is similar to what is used in image recognition problems, two max-pooling layers with a filter size of 2 × 2 (stride 2) are used. The second factor is the number of training samples: since the number of samples is limited, to reduce the risk of overfitting we design a fully connected layer with a dropout rate of 0.4, whose main idea is that dropping units randomly from the neural network during training can prevent units from coadapting too much [32] and make a neuron not rely on the presence of other specific neurons. Applying an appropriate training method is also useful for reducing overfitting; it is essentially a regularization that adds a penalty to the weight update at each iteration. We also use binary cross entropy as the loss function. Finally, the grid search algorithm is used to optimize the RF classifier parameters, such as the maximum number of decision trees and features.

Figure 6: Architecture of the hybrid CNN-RF model (input layer 525 × 24; two convolutional layers with 32 and 64 filters; two downsampling layers; a fully connected layer of size 132 × 1; and a random forest layer).
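Under the parameters of Table 1, the CNN feature extractor could be sketched in tf.keras as follows; the 132-unit fully connected layer is read from Figure 6, while the "same" padding, the sigmoid pretraining head, and the optimizer choice are assumptions not stated in the paper:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def build_feature_extractor():
    """Sketch per Table 1: two 3x3 conv layers (32 and 64 filters, stride 1),
    two 2x2 max-pooling layers (stride 2), then a fully connected layer
    with dropout 0.4. Layer widths beyond Table 1 are assumptions."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(525, 24, 1)),
        layers.Conv2D(32, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2), strides=2),
        layers.Conv2D(64, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2), strides=2),
        layers.Flatten(),
        layers.Dense(132, activation="relu", name="features"),
        layers.Dropout(0.4),
        layers.Dense(1, activation="sigmoid"),  # head used only to pretrain the CNN
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model
```

After pretraining, the activations of the "features" layer would be taken as the input vector for the RF classifier.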
5.3. Training Process of CNN-RF. The CNN-RF structure needs to train the CNN first, before the RF classifier is invoked. After that, the RF classifier is trained based on the features learned by the CNN.
5.3.1. Training Process of CNN. With a batch size (the number of training samples in an iteration) of 100, the CNN is trained using the forward propagation and backpropagation algorithms. As shown in Figure 6, the data in the input layer are first passed through the convolutional layers, pooling layers, and a fully connected layer, and the predicted value is obtained. As shown in Figure 7, the backpropagation phase then updates the parameters if the difference between the output value and the target value exceeds a certain threshold. The difference between the actual output of the neural network and the expected output determines the adjustment direction and step size of the weights among the layers of the network.
Given a sample set D = {(x_1, y_1), ..., (x_m, y_m)} with m samples in total, the prediction ỹ is first obtained by the feed-forward process. Over all samples, the mean difference between the actual output y_i and the expected output ỹ_{w,b}(x_i) can be expressed by
J(W, b) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left\| \tilde{y}_{w,b}(x_i) - y_i \right\|^{2},
\qquad (8)

where W represents the connection weights between the layers of the network and b the corresponding biases.
Then each parameter is initialized with a random value drawn from a normal distribution with mean 0 and variance ϑ, and the parameters are updated with the gradient descent method. The corresponding expressions are as follows:
W_{ij}^{(l)} = W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W, b),
\qquad (9)

b_{i}^{(l)} = b_{i}^{(l)} - \alpha \frac{\partial}{\partial b_{i}^{(l)}} J(W, b),
\qquad (10)
where α represents the learning rate, W_{ij}^{(l)} represents the connection weight between the i-th neuron in the l-th layer and the j-th neuron in the (l + 1)-th layer, and b_{i}^{(l)} represents the bias of the i-th neuron in the l-th layer. Finally, all parameters W and b of the network are updated according to formulas (9) and (10), and these steps are repeated to reduce the objective function J(W, b).
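Formulas (8)–(10) can be illustrated on a one-parameter toy regression (pure NumPy; the data, learning rate, and iteration count are invented for the example):

```python
import numpy as np

# Toy data: one input feature, target y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 * x + 1

w, b = np.random.default_rng(0).normal(0, 0.1, size=2)  # random normal init, per the text
alpha = 0.05                                            # learning rate

for _ in range(2000):
    y_hat = w * x + b                  # feed-forward pass
    # Gradients of J(W, b) = (1/m) * sum( (1/2) * (y_hat - y)^2 ), per formula (8)
    grad_w = np.mean((y_hat - y) * x)
    grad_b = np.mean(y_hat - y)
    w -= alpha * grad_w                # formula (9)
    b -= alpha * grad_b                # formula (10)
```

Repeating the update drives J(W, b) down, and the parameters converge toward w ≈ 2, b ≈ 1.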
5.3.2. Training Process of RF Classifier. The RF classifier takes advantage of two powerful machine learning techniques: bagging and random feature selection. Bagging generates a bootstrap sample randomly from the learnt features for growing each tree. Instead of using all features, the RF randomly selects a subset of features to determine the best splitting points, which ensures complete splitting into one class at the leaf nodes of the decision trees. Specifically, the whole process of RF classification is given as Algorithm 1.
6 Experiments and Result Analysis
All the experiments are implemented using Python 3.6 on a standard PC with an Intel Core i5-7500MQ CPU running at 3.40 GHz and with 8.0 GB of RAM. The CNN architecture is constructed based on TensorFlow [33], and the interface between the CNN and the RF is programmed using scikit-learn [34].
6.1. Performance Metrics. In this paper, the problem of electricity theft detection is considered a discrete two-class classification task, which assigns each consumer to one of the predefined classes (abnormal or normal).
In particular, the result of classifier validation is often presented as a confusion matrix. Table 2 shows the confusion matrix applied in electricity theft detection, where TP, FN, FP, and TN respectively represent the number of consumers classified correctly as normal, classified falsely as abnormal, classified falsely as normal, and classified correctly as abnormal. Although these four indices contain all of the information about the performance of the classifier, more meaningful performance measures can be extracted from them to illustrate certain performance criteria, such as TPR = TP/(TP + FN), FPR = FP/(FP + TN), precision = TP/(TP + FP), recall = TP/(TP + FN), and F1 score = 2 × (precision × recall)/(precision + recall), where the true positive rate (TPR), also referred to as sensitivity, is the ability of the classifier to correctly identify a consumer as normal, and the false positive rate (FPR) is the risk of falsely identifying an abnormal consumer as normal. Precision is the ratio of correctly detected normal consumers to the total number of consumers detected as normal. Recall is the ratio of the number of correctly detected normal consumers to the actual number of normal consumers.
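The definitions above reduce to a few lines of code (a sketch; the counts in the test come from the results reported in Section 6.2):

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute the measures defined above from confusion-matrix counts."""
    tpr = recall = tp / (tp + fn)
    fpr = fp / (fp + tn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"TPR": tpr, "FPR": fpr, "precision": precision,
            "recall": recall, "F1": f1}
```

With the counts reported later (TP = 1041, FN = 28, FP = 23, TN = 577), this gives precision, recall, and F1 all near 0.97, consistent with Table 3.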
In addition, the ROC curve is obtained by plotting all TPR values on the y-axis against their equivalent FPR values on the x-axis for all available thresholds. A ROC curve closer to the upper left indicates better detection, since a lower FPR is achieved at the same TPR. The AUC indicates the quality of the classifier, and a larger AUC represents better performance.

Table 1: Parameters of the proposed CNN-RF model.

Layer          Size of the filter   Number of filters   Stride
Convolution1   3 × 3                32                  1
MaxPooling1    2 × 2                —                   2
Convolution2   3 × 3                64                  1
MaxPooling2    2 × 2                —                   2
6.2. Experiment Based on the CNN-RF Model. The experiment is conducted on the normalized consumption data. With a batch size of 100, the training procedure is stopped after 300 epochs. As shown in Figure 8, the training and testing losses settle down smoothly, which means that the proposed model has learned the train dataset with only a small bias on the test dataset as well. The RF classifier then acts as the last layer of the CNN to predict the labels of input consumers: forty values from the fully connected layer of the trained CNN are used as a new feature vector to detect abnormal users and are fed to the RF for learning and testing. To build the RF in the hybrid model, the optimal parameters are determined by applying the grid search algorithm on the training dataset. The grid search range of each parameter is given as follows: numTrees ∈ {50, 60, ..., 100} and maxDepth ∈ {10, 20, 30, 40}; in total, 6 × 4 = 24
Require:
(1) Original training dataset S = {(x_i, y_i), i = 1, 2, 3, ..., m}, (X, Y) ∈ R^e × R
(2) Test samples x_j ∈ R^m
for d = 1 to Ntree do
    (1) Draw a bootstrap sample S_d from the original training data
    (2) Grow an unpruned tree h_d using data S_d:
        (a) Randomly select a new feature set M_try from the original feature set e
        (b) Select the best feature from the feature set M_try based on the Gini indicator at each node
        (c) Split until each tree grows to its maximum
end for
Ensure:
(1) The collection of trees {h_d, d = 1, 2, ..., Ntree}
(2) For x_j, the output of a decision tree is h_d(x_j)
f(x_j) = majority vote {h_d(x_j)}, d = 1, ..., Ntree
return f(x_j)

Algorithm 1: Random forest algorithm.
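Algorithm 1 can be sketched with scikit-learn decision trees; note that this simplification draws the random feature subset once per tree, whereas the algorithm above reselects features at each node (the function name and defaults are hypothetical):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def random_forest_fit_predict(X, y, X_test, n_trees=25, seed=0):
    """Minimal sketch of Algorithm 1: bootstrap sampling plus a random
    feature subset per tree, with majority voting over binary labels."""
    rng = np.random.default_rng(seed)
    n, e = X.shape
    m_try = max(1, int(np.sqrt(e)))            # size of the random feature subset
    votes = np.zeros((len(X_test), n_trees), dtype=int)
    for d in range(n_trees):
        idx = rng.integers(0, n, size=n)        # bootstrap sample S_d
        feats = rng.choice(e, size=m_try, replace=False)
        tree = DecisionTreeClassifier(criterion="gini", random_state=d)
        tree.fit(X[idx][:, feats], y[idx])      # grow unpruned tree h_d
        votes[:, d] = tree.predict(X_test[:, feats])
    # Majority vote over the Ntree trees (labels are 0/1).
    return (votes.sum(axis=1) > n_trees / 2).astype(int)
```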
Figure 7: Process of forward propagation (a) and backpropagation (b).
Figure 8: Loss curves of training and testing (x-axis: epoch, 0–300; curves: training, testing).
Table 2: Confusion matrix applied in electricity theft detection.

Actual\Detected   Normal               Abnormal
Normal            TP (true positive)   FN (false negative)
Abnormal          FP (false positive)  TN (true negative)
different combinations are tried. The best result is achieved when numTrees = 100 and maxDepth = 30; these parameters are then used to train the hybrid model.
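The grid search step can be sketched with scikit-learn's GridSearchCV over the ranges quoted above (synthetic stand-in features here; in the paper, the inputs are the CNN features):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.random((120, 8))                       # stand-in for CNN feature vectors
y = (X[:, 0] + X[:, 1] > 1).astype(int)        # synthetic labels

param_grid = {
    "n_estimators": [50, 60, 70, 80, 90, 100],  # numTrees range from the text
    "max_depth": [10, 20, 30, 40],              # maxDepth range from the text
}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)
# search.best_params_ holds the combination with the best cross-validation score.
```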
The classification results of the proposed detection model are as follows: the numbers of TPs, FNs, FPs, and TNs are 1041, 28, 23, and 577, respectively. Therefore, the precision, recall, and F1 score can be calculated as shown in Table 3, where Class 0 is the abnormal pattern and Class 1 is the normal pattern. The AUC value of the CNN-RF algorithm is also calculated, which is 0.98 and far better than the value of the baseline model (AUC = 0.5). This means the proposed algorithm is able to classify both classes accurately, as shown in Figure 9.
6.3. Comparative Experiments and the Analysis of Results. To evaluate the accuracy of the CNN-RF model, comparative experiments are conducted based on handcrafted features with non-deep-learning methods, including SVM, RF, gradient boosting decision tree (GBDT), and logistic regression (LR). Moreover, we compare the classification results obtained by other supervised classifiers, namely CNN features with the SVM classifier (CNN-SVM) and CNN features with the GBDT classifier (CNN-GBDT), to those obtained by the previous classification work. In the following, the five kinds of methods are introduced, and then the results are analysed.
(1) Logistic regression: a basic model for binary classification, equivalent to a single-layer neural network with a sigmoid activation function. The sigmoid function maps the output of a linear regression to a value ranging from 0 to 1; any value larger than 0.5 is classified as the normal pattern and any value less than 0.5 as the abnormal pattern.
(2) Support vector machine: this classifier with kernel functions can transform a nonlinearly separable problem into a linearly separable one by projecting the data into a feature space and then finding the optimal separating hyperplane. It is applied to predict the patterns of consumers based on the aforementioned handcrafted features.
(3) Random forest: essentially an ensemble of multiple decision trees, which achieves better performance than a single decision tree while maintaining effective control of overfitting. The RF classifier can also handle high-dimensional data while maintaining high computational efficiency.
(4) Gradient boosting decision tree: an iterative decision tree algorithm consisting of multiple decision trees. For the final output, all results or weights are accumulated in the GBDT, whereas random forests use majority voting.
(5) Deep learning methods: the plain CNN uses softmax in its last layer; CNN-GBDT and CNN-SVM are included for comparison with the proposed method.
Table 4 summarizes the parameters of the comparative methods. Figure 10 shows the results of the different methods and the proposed hybrid CNN-RF method: the AUC values of CNN-RF, CNN-GBDT, CNN-SVM, CNN, SVM, RF, LR, and GBDT are 0.99, 0.97, 0.98, 0.93, 0.77, 0.91, 0.63, and 0.77, respectively. Figure 11 presents the results of all comparative experiments in terms of precision, recall, and F1 score. Among the performances of the eight electricity theft detection algorithms, the deep learning methods (CNN, CNN-RF, CNN-SVM, and CNN-GBDT) perform better than the machine learning methods (LR, SVM, GBDT, and RF). The reason is that the CNN can learn features from a large amount of electricity consumption data, so the accuracy of electricity theft detection is greatly improved. In addition, the proposed CNN-RF model shows the best performance compared with the other methods.
Table 3: Classification score summary for CNN-RF.

                Precision   Recall   F1 score
Class 0         0.97        0.96     0.96
Class 1         0.98        0.98     0.98
Average/total   0.97        0.97     0.97
Figure 9: ROC curve for CNN-RF (AUC (CNN-RF) = 0.99; x-axis: FPR; y-axis: TPR).
Table 4: Experimental parameters for comparative methods.

Methods   Input data      Parameters
LR        Features (1D)   Penalty: L2; inverse of regularization strength: 1.0
SVM       Features (1D)   Penalty parameter of the error term: 105; parameter of the kernel function (RBF): 0.0005
GBDT      Features (1D)   Number of estimators: 200
RF        Features (1D)   Number of trees in the forest: 100; max depth of each tree: 30
CNN       Raw data (2D)   The same as the CNN component of the proposed approach
As detecting electricity theft consumers is achieved by discovering the anomalous consumption behaviours of suspicious customers, it can essentially also be regarded as an anomaly detection problem. Since CNN-RF has shown superior performance on the SEAI dataset, it is interesting to test its generalization ability on other datasets. In particular, we have further tested the performance of CNN-RF, together with the above-mentioned models, on another dataset, the Low Carbon London (LCL) dataset [35]. The LCL dataset is composed of over 5500 honest electricity consumers in total, each of which consists of 525 days of data (the
sampling rate is one reading per hour). Among the 5500 customers, we randomly chose 1200 as malicious customers and modified their intraday load profiles according to fraudulent sample generation methods similar to those used for the SEAI dataset.
After preprocessing the LCL dataset (similarly to the SEAI dataset), the train dataset contains 7584 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1700 customers together with corresponding labels indicating anomaly or not. The training of the proposed CNN-RF model and the comparative models then all follows the aforementioned procedure
Figure 11: Comparison with other algorithms based on the SEAI dataset (precision, recall, and F1 score for CNN, CNN-RF, CNN-GBDT, CNN-SVM, GBDT, LR, RF, and SVM).
Figure 10: ROC curves for eight different algorithms based on the SEAI dataset (AUC: CNN = 0.93, CNN-RF = 0.99, CNN-GBDT = 0.97, CNN-SVM = 0.98, GBDT = 0.77, LR = 0.63, RF = 0.91, SVM = 0.77).
Figure 12: ROC curves for eight different algorithms based on the LCL dataset (AUC: CNN = 0.95, CNN-RF = 0.97, CNN-GBDT = 0.94, CNN-SVM = 0.96, GBDT = 0.74, LR = 0.68, RF = 0.77, SVM = 0.71).
Figure 13: Comparison with other algorithms based on the LCL dataset (precision, recall, and F1 score for CNN, CNN-RF, CNN-GBDT, CNN-SVM, GBDT, SVM, LR, and RF).
based on the train dataset. The performances of all the models on the test dataset are shown in Figures 12 and 13.
Figures 12 and 13 show that the performances of CNN-SVM, CNN-GBDT, and CNN-RF are better than those of the other models on the whole. Since they are deep learning models, distinguishable features can be learned directly from the raw smart meter data by the CNN, which equips them with better generalization ability. In addition, the CNN-RF model achieves the best performance, since the RF can handle high-dimensional data while maintaining high computational efficiency.
7 Conclusions
In this paper, a novel CNN-RF model is presented to detect electricity theft. In this model, the CNN acts as an automatic feature extractor for investigating smart meter data, and the RF is the output classifier. Because a large number of parameters must be optimized, which increases the risk of overfitting, a fully connected layer with a dropout rate of 0.4 is designed for the training phase. In addition, the SMOTE algorithm is adopted to overcome the problem of data imbalance. Several machine learning and deep learning methods, such as SVM, RF, GBDT, and LR, are applied to the same problem as benchmarks, and all of these methods are evaluated on the SEAI and LCL datasets. The results indicate that the proposed CNN-RF model is a quite promising classification method in the electricity theft detection field because of two properties. The first is that the features can be automatically extracted by the hybrid model, whereas the success of most other traditional classifiers relies largely on the retrieval of good hand-designed features, which is a laborious and time-consuming task. The second is that the hybrid model combines the advantages of the RF and the CNN, as both are among the most popular and successful classifiers in the electricity theft detection field.
Since the detection of electricity theft affects the privacy of consumers, future work will focus on investigating how the granularity and duration of smart meter data might affect this privacy. Extending the proposed hybrid CNN-RF model to other applications (e.g., load forecasting) is also a task worth investigating.
Data Availability
Previously reported [SEAI] and [LCL] datasets were used to support this study and are available at http://www.ucd.ie/issda/data and https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7857, respectively. These prior studies (and datasets) are cited at relevant places within the text as references [19, 26].
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Acknowledgments
This work was supported by the National Key Research and Development Program of China (2016YFB0901900), the Natural Science Foundation of Hebei Province of China (F2017501107), the Open Research Fund from the State Key Laboratory of Rolling and Automation, Northeastern University (2017RALKFKT003), the Foundation of Northeastern University at Qinhuangdao (XNB201803), and the Fundamental Research Funds for the Central Universities (N182303037).
References
[1] S. S. S. R. Depuru, L. Wang, and V. Devabhaktuni, "Electricity theft: overview, issues, prevention and a smart meter based approach to control theft," Energy Policy, vol. 39, no. 2, pp. 1007–1015, 2011.
[2] J. P. Navani, N. K. Sharma, and S. Sapra, "Technical and non-technical losses in power system and its economic consequence in Indian economy," International Journal of Electronics and Computer Science Engineering, vol. 1, no. 2, pp. 757–761, 2012.
[3] S. McLaughlin, B. Holbert, A. Fawaz, R. Berthier, and S. Zonouz, "A multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[4] P. McDaniel and S. McLaughlin, "Security and privacy challenges in the smart grid," IEEE Security & Privacy Magazine, vol. 7, no. 3, pp. 75–77, 2009.
[5] T. B. Smith, "Electricity theft: a comparative analysis," Energy Policy, vol. 32, no. 1, pp. 2067–2076, 2004.
[6] J. I. Guerrero, C. Leon, I. Monedero, F. Biscarri, and J. Biscarri, "Improving knowledge-based systems with statistical techniques, text mining, and neural networks for non-technical loss detection," Knowledge-Based Systems, vol. 71, no. 4, pp. 376–388, 2014.
[7] C. C. O. Ramos, A. N. Souza, G. Chiachia, A. X. Falcão, and J. P. Papa, "A novel algorithm for feature selection using harmony search and its application for non-technical losses detection," Computers & Electrical Engineering, vol. 37, no. 6, pp. 886–894, 2011.
[8] P. Glauner, J. A. Meira, P. Valtchev, R. State, and F. Bettinger, "The challenge of non-technical loss detection using artificial intelligence: a survey," International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 760–775, 2017.
[9] S.-C. Huang, Y.-L. Lo, and C.-N. Lu, "Non-technical loss detection using state estimation and analysis of variance," IEEE Transactions on Power Systems, vol. 28, no. 3, pp. 2959–2966, 2013.
[10] O. Rahmati, H. R. Pourghasemi, and A. M. Melesse, "Application of GIS-based data driven random forest and maximum entropy models for groundwater potential mapping: a case study at Mehran region, Iran," CATENA, vol. 137, pp. 360–372, 2016.
[11] N. Edison, A. C. Aranha, and J. Coelho, "Probabilistic methodology for technical and non-technical losses estimation in distribution system," Electric Power Systems Research, vol. 97, no. 11, pp. 93–99, 2013.
[12] J. B. Leite and J. R. S. Mantovani, "Detecting and locating non-technical losses in modern distribution networks," IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 1023–1032, 2018.
[13] S. Amin, G. A. Schwartz, A. A. Cardenas, and S. S. Sastry, "Game-theoretic models of electricity theft detection in smart utility networks: providing new capabilities with advanced metering infrastructure," IEEE Control Systems Magazine, vol. 35, no. 1, pp. 66–81, 2015.
[14] A. H. Nizar, Z. Y. Dong, Y. Wang, and A. N. Souza, "Power utility nontechnical loss analysis with extreme learning machine method," IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 946–955, 2008.
[15] J. Nagi, K. S. Yap, S. K. Tiong, S. K. Ahmed, and M. Mohamad, "Nontechnical loss detection for metered customers in power utility using support vector machines," IEEE Transactions on Power Delivery, vol. 25, no. 2, pp. 1162–1171, 2010.
[16] E. W. S. Angelos, O. R. Saavedra, O. A. C. Cortes, and A. N. de Souza, "Detection and identification of abnormalities in customer consumptions in power distribution systems," IEEE Transactions on Power Delivery, vol. 26, no. 4, pp. 2436–2442, 2011.
[17] K. A. P. Costa, L. A. M. Pereira, R. Y. M. Nakamura, C. R. Pereira, J. P. Papa, and A. Xavier Falcão, "A nature-inspired approach to speed up optimum-path forest clustering and its application to intrusion detection in computer networks," Information Sciences, vol. 294, pp. 95–108, 2015.
[18] R. R. Bhat, R. D. Trevizan, X. Li, and A. Bretas, "Identifying nontechnical power loss via spatial and temporal deep learning," in Proceedings of the IEEE International Conference on Machine Learning and Applications, Anaheim, CA, USA, December 2016.
[19] M. Ismail, M. Shahin, M. F. Shaaban, E. Serpedin, and K. Qaraqe, "Efficient detection of electricity theft cyber attacks in AMI networks," in Proceedings of the IEEE Wireless Communications and Networking Conference, Barcelona, Spain, April 2018.
[20] R. Mehrizi, X. Peng, X. Xu, S. Zhang, D. Metaxas, and K. Li, "A computer vision based method for 3D posture estimation of symmetrical lifting," Journal of Biomechanics, vol. 69, no. 1, pp. 40–46, 2018.
[21] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: surpassing human-level performance on ImageNet classification," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034, Santiago, Chile, December 2015.
[22] A. Sharif Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, "CNN features off-the-shelf: an astounding baseline for recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 806–813, Columbus, OH, USA, June 2014.
[23] Z. Zheng, Y. Yang, X. Niu, H.-N. Dai, and Y. Zhou, "Wide and deep convolutional neural networks for electricity-theft detection to secure smart grids," IEEE Transactions on Industrial Informatics, vol. 14, no. 4, pp. 1606–1615, 2018.
[24] D. Svozil, V. Kvasnicka, and J. Pospichal, "Introduction to multi-layer feed-forward neural networks," Chemometrics and Intelligent Laboratory Systems, vol. 39, no. 1, pp. 43–62, 1997.
[25] Irish Social Science Data Archive, http://www.ucd.ie/issda/data/commissionforenergyregulationcer.
[26] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[27] Y. Wang, Q. Chen, D. Gan, J. Yang, D. S. Kirschen, and C. Kang, "Deep learning-based socio-demographic information identification from smart meter data," IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2593–2602, 2019.
[28] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: a survey," ACM Computing Surveys, vol. 41, no. 3, 2009.
[29] H.-T. Cheng, L. Koc, J. Harmsen et al., "Wide & deep learning for recommender systems," in Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (DLRS 2016), pp. 7–10, Boston, MA, USA, September 2016.
[30] E. Feczko, N. M. Balba, O. Miranda-Dominguez et al., "Subtyping cognitive profiles in autism spectrum disorder using a functional random forest algorithm," NeuroImage, vol. 172, pp. 674–688, 2018.
[31] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A theoretical analysis of feature pooling in visual recognition," in Proceedings of the 27th International Conference on Machine Learning, pp. 111–118, Haifa, Israel, June 2010.
[32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
[33] M. Abadi, A. Agarwal, P. Barham et al., "TensorFlow: large-scale machine learning on heterogeneous distributed systems," 2016, https://arxiv.org/abs/1603.04467.
[34] F. Pedregosa, G. Varoquaux, A. Gramfort et al., "Scikit-learn: machine learning in Python," Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[35] J. R. Schofield, "Low Carbon London project: data from the dynamic time-of-use electricity pricing trial," 2017, https://discover.ukdataservice.ac.uk/catalogue/?sn=7857.
12 Journal of Electrical and Computer Engineering
convolution and downsampling. And then the RF classifier is trained based on the obtained features to detect whether the consumer steals electricity. Finally, the confusion matrix and receiver-operating characteristic (ROC) curves are used to evaluate the accuracy of the CNN-RF model on the test dataset.
3. Data Analysis and Preprocessing
The dataset was collected by Electric Ireland and the Sustainable Energy Authority of Ireland (SEAI) in January 2012 [25] and consists of electricity usage data from over 5000 residential households and businesses. The smart meters recorded electricity consumption at a resolution of 30 min during 2009 and 2010. Customers who participated in the trial had a smart meter installed in their homes and agreed to take part in the research. Therefore, it is a reasonable assumption that all samples belong to honest users. However, malicious samples cannot be obtained, since energy theft might never or only rarely happen for a given customer. To address this issue, we generated 1200 malicious customers as described in [26]. The large number and variety of customers, together with the long measurement period, make this dataset an excellent source for research in the area of smart meter data analysis. For each customer i, there is a half-hourly meter reading over a 525-day period; we reduced the sampling rate to one reading per hour for each customer. This section first analyses the smart meter data to explain the rationale for applying a CNN for feature extraction. It then describes the three main procedures used to preprocess the original energy consumption data: data cleaning, missing value imputation, and data normalization.
3.1. Data Analysis. To introduce the rationale for applying a CNN for feature extraction rather than other machine learning techniques, this section analyses the smart meter data. Figure 2 shows the daily load curves of four different users (residential and nonresidential); it can be seen that the daily electricity consumption of nonresidential users is significantly higher than that of residential users.
In addition, Figures 3 and 4 show the daily load profiles of a consumer during working days and holidays. The load profiles are highly similar but slightly shifted. In a CNN, the filter weights are uniform across different regions; thus, the features calculated in the convolutional layer are invariant to small shifts, which means that relatively stable features can be obtained from varying load profiles. As described in [27], a deep CNN first automatically extracts features from massive load profiles, and a support vector machine (SVM) then identifies the characteristics of the consumers. To further investigate the load series, we plot the four seasons' load profiles for a customer. It can be seen from Figure 5 that the electricity consumption of the consumer changes with the seasons.
In summary, the load consumption differs in both magnitude and time of use and depends on lifestyle, season, weather, and many other uncontrollable factors. Therefore, consumer load profiles are affected not only by weather conditions but also by the type of consumer and other factors.
Based on the above analysis, extracting features from smart meter data based on experience alone is difficult. However, feature extraction is a key factor in the success of the detection system. Conventionally, manual feature extraction requires elaborately designed features for a specific problem, which makes it hard to adapt to other domains. In the CNN, the successive alternating convolutional and pooling layers are designed to learn progressively higher-level features (e.g., trend indicators, sequence standard deviation, and linear slope) from 2D historical electricity consumption data. In addition, highly nonlinear correlations exist between electricity consumption and these influencing factors. Since an activation function is applied in the convolutional and fully connected layers, the CNN is able to model such highly nonlinear correlations. In this paper, the activation function named "rectified linear unit" (ReLU) is used in the proposed CNN-RF model because of its sparsity and its ability to mitigate the vanishing gradient problem.
3.2. Data Preprocessing. $x_i^t$ and $x_i^{\prime t}$ are defined as the energy consumption of an honest or malicious customer $i$, respectively, at time interval $t$. In this section, the main procedures used to preprocess the raw electricity data are described as follows.
[Figure 1: Flow of electricity theft detection. The original dataset passes through data analysis and preprocessing (data cleaning, missing value imputation, data normalization), generation of train and test datasets (SMOTE balancing of the unbalanced train set, cross-validation split), the CNN-RF model (CNN feature extractor followed by the trained RF classifier), and model evaluation (confusion matrix and ROC curves).]
(1) Data cleaning: there are erroneous values (e.g., outliers) in the raw data, which correspond to peak electricity consumption activities during holidays or special occasions such as birthdays and celebrations. In this paper, the "three-sigma rule of thumb" [28] is used to restore the outliers according to the following formula:
[Figure 2: Load profiles for nonresidential and residential users. Four panels (Residential user 1, Residential user 2, Nonresidential user 1, Nonresidential user 2) plot usage (kW) against time (h).]
[Figure 3: Daily load profiles for a customer on working days (Monday to Friday); usage (kW·h) against time (h).]
[Figure 4: Daily load profiles for a customer during holidays (Saturday, Sunday, New Year, Halloween, Christmas); usage (kW·h) against time (h).]
$$
F\left(x_i^t\right)=
\begin{cases}
\operatorname{avg}\left(x_i^t\right)+2\sigma\left(x_i^t\right), & x_i^t > x_i^{t\ast},\\[4pt]
x_i^t, & \text{otherwise},
\end{cases}
\quad (1)
$$

where $x_i^{t\ast}$ is computed from the mean $\operatorname{avg}(\cdot)$ and standard deviation $\sigma(\cdot)$ for each time interval, grouped by weekday-time pair for each month.
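As an illustration only (not the authors' code), the restoration rule of formula (1) can be sketched in Python. The grouping by weekday-time pair and month described above is omitted for brevity, and the toy readings are invented:

```python
import numpy as np

def restore_outliers(x):
    """Cap readings above avg + 2*sigma at avg + 2*sigma, as in formula (1)."""
    x = np.asarray(x, dtype=float)
    threshold = x.mean() + 2 * x.std(ddof=1)
    return np.minimum(x, threshold)

# toy readings for one (weekday, hour, month) group; the last one is a spike
readings = [0.5] * 20 + [9.0]
cleaned = restore_outliers(readings)
```

Normal readings pass through unchanged; the spike is pulled down to the avg + 2σ threshold.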
(2) Missing value imputation: because of various reasons such as storage issues and failure of smart meters, there are missing values in the electricity consumption data. Analysis of the original data reveals two kinds of missing data. One is a continuous run of multiple missing values, in which case the user is deleted when the number of missing values exceeds 10; the other is a single missing value, which is recovered by formula (2):

$$
F\left(x_i^t\right)=
\begin{cases}
\dfrac{x_i^{t-1}+x_i^{t+1}}{2}, & x_i^t \in \mathrm{NaN},\\[6pt]
x_i^t, & \text{otherwise},
\end{cases}
\quad (2)
$$

where $x_i^t$ stands for the electricity usage of consumer $i$ over a period (e.g., an hour); if $x_i^t$ is null, it is represented as NaN.
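A minimal sketch of the single-value imputation in formula (2), assuming isolated NaN entries with valid neighbours; the function name and data are illustrative:

```python
import numpy as np

def impute_missing(x):
    """Fill an isolated NaN with the mean of its two neighbours, formula (2)."""
    x = np.asarray(x, dtype=float)
    for t in range(1, len(x) - 1):
        if np.isnan(x[t]) and not np.isnan(x[t - 1]) and not np.isnan(x[t + 1]):
            x[t] = (x[t - 1] + x[t + 1]) / 2
    return x

filled = impute_missing([0.4, np.nan, 0.8, 0.6])  # NaN becomes (0.4 + 0.8) / 2
```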
(3) Data normalization: finally, the data need to be normalized, because the neural network is sensitive to diverse data scales. One common method for this purpose is min-max normalization, which is computed as

$$
F\left(x_i^t\right)=\frac{x_i^t-\min\left(x_i^T\right)}{\max\left(x_i^T\right)-\min\left(x_i^T\right)},
\quad (3)
$$

where $\min(\cdot)$ and $\max(\cdot)$ return the minimum and maximum values over a day, respectively.
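The min-max normalization of formula (3) over one day's readings can be sketched as follows (illustrative values; the guard against a constant day is an added assumption):

```python
import numpy as np

def minmax_daily(day):
    """Scale one day's readings into [0, 1], as in formula (3)."""
    day = np.asarray(day, dtype=float)
    lo, hi = day.min(), day.max()
    # guard against a flat day to avoid division by zero (assumption)
    return (day - lo) / (hi - lo) if hi > lo else np.zeros_like(day)

norm = minmax_daily([0.2, 0.6, 1.0])  # approximately [0.0, 0.5, 1.0]
```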
4. Generation of Train and Test Datasets
After data preprocessing, the SEAI dataset contains the electricity consumption data of 4737 normal electricity customers and 1200 malicious electricity customers over 525 days (each with 24 hourly readings).
Our preliminary analysis (see Section 3.1) reveals that consumer load profiles are affected not only by weather conditions but also by the type of consumer and other factors. However, it is difficult to extract features based on experience from the 1D electricity consumption data, since each day's consumption fluctuates in a relatively independent way. Motivated by the previous work in [29], we design a deep CNN to process the electricity consumption data in a 2D manner. In particular, we transform the 1D electricity consumption data into 2D data by days. We define the 2D data as a matrix of actual energy consumption values for a specific customer, where the rows represent days $D$ ($D = 1, \ldots, 525$) and the columns represent time periods $T$ ($T = 1, \ldots, 24$). It is worth mentioning that the CNN learns the features between different hours of the day and between different days from the 2D data.
Then we divide the dataset into train and test sets using the cross-validation method, with 80% as the train set and 20% as the test set. Given that honest consumers remarkably outnumber electricity theft consumers, the imbalanced nature of the dataset can have a major negative impact on the performance of supervised machine learning methods. To reduce this bias, the SMOTE algorithm is used to make the numbers of normal and abnormal samples equal in the train set. Finally, the train dataset contains 7484 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1669 customers.
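For intuition, a simplified SMOTE-style oversampling step can be sketched as below. This is a toy re-implementation of the interpolation idea, not the authors' pipeline; in practice a library implementation such as imbalanced-learn's SMOTE would typically be used, and the minority-class feature vectors here are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

def smote_oversample(X_min, n_new):
    """Simplified SMOTE: synthesize n_new samples by interpolating between
    a randomly chosen minority sample and its nearest minority neighbour."""
    X_min = np.asarray(X_min, dtype=float)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        d[i] = np.inf                      # exclude the sample itself
        j = int(np.argmin(d))              # nearest minority neighbour
        gap = rng.random()                 # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.vstack([X_min, synthetic])

X_theft = np.array([[1.0, 2.0], [1.2, 2.1], [0.9, 1.8]])  # minority class
balanced = smote_oversample(X_theft, n_new=3)             # now 6 samples
```

Each synthetic point lies on the segment between two real minority samples, so the oversampled set stays inside the minority-class region.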
5. The Novel CNN-RF Algorithm
In this section, the design of the CNN-RF structure is presented in detail, and some techniques are proposed to reduce overfitting. Moreover, the relevant parameters, including the number of filters and the sizes of each feature map and kernel, are given. Finally, the training process of the proposed CNN-RF algorithm is introduced.
5.1. CNN-RF Architecture. The proposed CNN-RF model is designed by integrating CNN and RF classifiers. As shown in Figure 6, the architecture of the CNN-RF mainly includes an automatic feature extractor and a trainable RF classifier. The feature extractor CNN consists of convolutional layers, downsampling layers, and a fully connected layer. The CNN is a deep supervised learning architecture, which usually includes multiple layers and can be trained with the backpropagation algorithm. It can also explore the intricate distribution in smart meter data by performing the
[Figure 5: Four seasons' load profiles for a customer (spring, summer, autumn, winter); usage (kW·day) against time (days, 0 to 100).]
stochastic gradient method. The RF classifier consists of a combination of tree classifiers, in which each tree contributes a single vote to the assignment of the most frequent class to the input data [30]. In the following, the design of each part of the CNN-RF structure is presented in detail.
5.1.1. Convolutional Layer. The main purpose of convolutional layers is to learn feature representations of the input data and reduce the influence of noise. The convolutional layer is composed of several feature filters, which are used to compute different feature maps. Specially, to reduce the number of network parameters and lower the complexity of the relevant layers during the generation of feature maps, the weights of each kernel in each feature map are shared. Moreover, each neuron in the convolutional layer connects to a local region of the preceding layer, and a ReLU activation function is then applied to the convolved results. Mathematically, the outputs of the convolutional layers can be expressed as

$$
y_{\mathrm{conv}}\left(X_l^{f_l}\right)=\delta\left(\sum_{f_l=1}^{F_l} W_l^{f_l}\ast X_l^{f_l}+b_l^{f_l}\right),
\quad (4)
$$

where $\delta$ is the activation function, $\ast$ is the convolution operation, and both $W_l^{f_l}$ and $b_l^{f_l}$ are the learnable parameters of the $f$-th feature filter.
5.1.2. Downsampling Layer. Downsampling is an important concept of the CNN, and a downsampling layer is usually placed between two convolutional layers. It can reduce the number of parameters and achieve dimensionality reduction. In particular, each feature map of the pooling layer is connected to its corresponding feature map of the previous convolutional layer, and the size of the output feature maps is reduced without reducing their number. In the literature, downsampling operations mainly include max-pooling and average-pooling. The max-pooling operation transforms small windows into single values by taking the maximum, as expressed in formula (5), whereas average-pooling returns the average of the activations in a small window. In [31], experiments show that the max-pooling operation performs better than average-pooling; therefore, the max-pooling operation is usually implemented in the downsampling layer:

$$
y_{\mathrm{pool}}\left(X_l^{f_l}\right)=\max_{m\in M}\left\{X_l^m\right\},
\quad (5)
$$

where $M$ is the set of activations in the pooling window and $m$ is the index of an activation in the pooling window.
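A small NumPy sketch of the 2 × 2, stride-2 max-pooling of formula (5); the toy feature map is invented for illustration:

```python
import numpy as np

def max_pool_2x2(fmap):
    """2x2 max-pooling with stride 2, formula (5): keep each window's maximum."""
    h, w = fmap.shape
    # trim odd edges, group into 2x2 windows, take the max of each window
    return fmap[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1., 2., 5., 0.],
                 [3., 4., 1., 2.],
                 [0., 1., 7., 8.],
                 [2., 1., 3., 6.]])
pooled = max_pool_2x2(fmap)  # -> [[4., 5.], [2., 8.]]
```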
5.1.3. Fully Connected Layer. With the features extracted by sequential convolution and pooling, a fully connected layer is applied to flatten the feature maps into one vector as follows:

$$
y_{\mathrm{fl}}\left(X_l\right)=\delta\left(W_l\cdot X_l+b_l\right),
\quad (6)
$$

where $W_l$ is the weight of the $l$-th layer and $b_l$ is the bias of the $l$-th layer.
5.1.4. RF Classifier Layer. Traditionally, a softmax classifier is used as the last output layer of the CNN, but the proposed model uses the RF classifier to predict the class based on the obtained features. The RF classifier layer can therefore be defined as

$$
y_{\mathrm{out}}\left(X_L\right)=\operatorname{sigm}\left(W_{\mathrm{rf}}\cdot X_l+b_l\right),
\quad (7)
$$

where $\operatorname{sigm}(\cdot)$ is the sigmoid function, which maps abnormal values to 0 and normal values to 1. The parameter set $W_{\mathrm{rf}}$ of the RF layer includes the number of decision trees and the maximum depth of each tree, which are obtained by the grid search algorithm.
5.2. Techniques for Selecting Parameters and Avoiding Overfitting. The detailed parameters of the proposed CNN-RF structure are summarized in Table 1, including the number of filters in each layer, the filter sizes, and the strides. Two factors are considered in determining the CNN structure. The first is the characteristics of consumers' electricity consumption behaviours: since the load profiles are so variable, two convolutional layers with a filter size of 3 × 3 (stride 1) are applied to capture hidden features for detecting electricity theft. Moreover, because the input data size is 525 × 24, which is similar to what is used in image recognition problems, two max-pooling layers with a filter size of 2 × 2 (stride 2) are used. The second factor is the number of training samples: since the number of samples is limited, to reduce the risk of overfitting we design a fully connected layer with a dropout rate of 0.4, whose main idea
[Figure 6: Architecture of the hybrid CNN-RF model. Input data (L1: 525 × 24) passes through a convolutional layer (L2: 32@525 × 24), a downsampling layer (L3: 32@263 × 12), a second convolutional layer (L4: 64@263 × 12), a second downsampling layer (L5: 64@132 × 6), a fully connected layer (L6: 132 × 1), and the random forest layer.]
is that dropping units randomly from the neural network during training can prevent units from coadapting too much [32] and makes a neuron not rely on the presence of other specific neurons. Applying an appropriate training method is also useful for reducing overfitting: it is essentially a regularization that adds a penalty to the weight update at each iteration. We also use binary cross entropy as the loss function. Finally, the grid search algorithm is used to optimize the RF classifier parameters, such as the maximum number of decision trees and features.
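As a quick check (not the authors' code), the feature-map sizes of Figure 6 can be reproduced with simple shape bookkeeping, assuming "same" padding for the 3 × 3 convolutions and ceil-mode 2 × 2 pooling (padding mode is not stated in the paper, so these are assumptions that match the reported sizes):

```python
def conv_same(size):
    # 3x3 convolution, stride 1, 'same' padding: spatial size is unchanged
    return size

def pool_half(size):
    # 2x2 max-pooling, stride 2, ceil mode: size is halved, rounding up
    return (size + 1) // 2

h, w = 525, 24                      # L1: input, 525 days x 24 hourly readings
h, w = conv_same(h), conv_same(w)   # L2: Conv1, 32 filters, 3x3, stride 1
h, w = pool_half(h), pool_half(w)   # L3: Pool1, 2x2, stride 2
h, w = conv_same(h), conv_same(w)   # L4: Conv2, 64 filters, 3x3, stride 1
h, w = pool_half(h), pool_half(w)   # L5: Pool2, 2x2, stride 2
```

Under these assumptions the final feature maps are 132 × 6, matching layer L5 in Figure 6.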
5.3. Training Process of CNN-RF. The CNN-RF structure needs to train the CNN first, before the RF classifier is invoked. After that, the RF classifier is trained based on the features learned by the CNN.
5.3.1. Training Process of CNN. With a batch size (the number of training samples in an iteration) of 100, the CNN is trained using the forward propagation and backpropagation algorithms. As shown in Figure 6, the data in the input layer are first passed through the convolutional layers, pooling layers, and a fully connected layer, and the predicted value is obtained. As shown in Figure 7, the backpropagation phase begins to update parameters if the difference between the output value and the target value exceeds a certain threshold. The difference between the actual output of the neural network and the expected output determines the adjustment direction and step size of the weights among the layers of the network.
Given a sample set $D=\{(x_1,y_1),\ldots,(x_m,y_m)\}$ with $m$ samples in total, $\tilde{y}$ is obtained first by the feed-forward process. Over all samples, the mean difference between the actual output $y_i$ and the expected output $\tilde{y}_{w,b}(x_i)$ can be expressed as

$$
J(W,b)=\frac{1}{m}\sum_{i=1}^{m}\frac{1}{2}\left\|\tilde{y}_{w,b}\left(x_i\right)-y_i\right\|^{2},
\quad (8)
$$

where $W$ represents the connection weights between the layers of the network and $b$ is the corresponding bias.
Each parameter is then initialized with a random value drawn from a normal distribution with mean 0 and variance $\vartheta$, and the parameters are updated with the gradient descent method. The corresponding expressions are as follows:

$$
W_{ij}^{(l)}=W_{ij}^{(l)}-\alpha\frac{\partial}{\partial W_{ij}^{(l)}}J(W,b),
\quad (9)
$$

$$
b_{i}^{(l)}=b_{i}^{(l)}-\alpha\frac{\partial}{\partial b_{i}^{(l)}}J(W,b),
\quad (10)
$$

where $\alpha$ is the learning rate, $W_{ij}^{(l)}$ is the connection weight between the $i$-th neuron in the $l$-th layer and the $j$-th neuron in the $(l+1)$-th layer, and $b_i^{(l)}$ is the bias of the $i$-th neuron in the $l$-th layer.

Finally, all parameters $W$ and $b$ in the network are updated according to formulas (9) and (10), and these steps are repeated to reduce the objective function $J(W,b)$.
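To make the update rules (9) and (10) concrete, here is a toy gradient descent on a single linear unit with the squared-error cost of formula (8); the data and learning rate are invented for illustration, and the random initialization is replaced by zeros for determinism:

```python
import numpy as np

# one linear unit y_hat = w*x + b, squared-error cost as in formula (8)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0                      # targets generated with w = 2, b = 1
w, b, alpha = 0.0, 0.0, 0.1            # deterministic init, learning rate

for _ in range(500):
    y_hat = w * x + b
    grad_w = np.mean((y_hat - y) * x)  # dJ/dw
    grad_b = np.mean(y_hat - y)        # dJ/db
    w -= alpha * grad_w                # update rule (9)
    b -= alpha * grad_b                # update rule (10)
```

After a few hundred iterations the parameters approach the generating values w = 2, b = 1.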
5.3.2. Training Process of RF Classifier. The RF classifier takes advantage of two powerful machine learning techniques: bagging and random feature selection. Bagging is a method that randomly generates a bootstrap sample from the learnt features for growing each tree. Instead of using all features, the RF randomly selects a subset of features to determine the best splitting points, which ensures complete splitting to one classification at the leaf nodes of the decision trees. Specifically, the whole process of RF classification can be written as Algorithm 1.
6. Experiments and Result Analysis

All the experiments are implemented using Python 3.6 on a standard PC with an Intel Core i5-7500MQ CPU running at 3.40 GHz and with 8.0 GB of RAM. The CNN architecture is constructed based on TensorFlow [33], and the interface between the CNN and the RF is programmed using scikit-learn [34].
6.1. Performance Metrics. In this paper, the problem of electricity theft detection is considered a discrete two-class classification task, which assigns each consumer to one of the predefined classes (abnormal or normal).

In particular, the result of classifier validation is often presented as a confusion matrix. Table 2 shows the confusion matrix applied in electricity theft detection, where TP, FN, FP, and TN respectively represent the numbers of consumers classified correctly as normal, falsely as abnormal, falsely as normal, and correctly as abnormal. Although these four indices convey all of the information about the performance of the classifier, more meaningful performance measures can be extracted from them to illustrate certain performance criteria, such as TPR = TP/(TP + FN), FPR = FP/(FP + TN), precision = TP/(TP + FP), recall = TP/(TP + FN), and F1 score = 2(precision × recall)/(precision + recall), where the true positive rate (TPR), also referred to as sensitivity, is the ability of the classifier to correctly identify a consumer as normal, and the false positive rate (FPR) is the risk of positively identifying an abnormal consumer. Precision is the ratio of consumers detected correctly to the total number of detected normal consumers. Recall is the ratio of the number of consumers correctly detected to the actual number of normal consumers (Figure 7).
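The confusion-matrix counts and derived scores above can be sketched in plain Python (illustrative labels; class 1 = normal is treated as the positive class, as in Table 2):

```python
def metrics(y_true, y_pred):
    """Confusion-matrix counts and the derived scores from Section 6.1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # identical to the TPR
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    return precision, recall, f1, fpr

y_true = [1, 1, 1, 0, 0, 0]            # toy ground-truth labels
y_pred = [1, 1, 0, 0, 0, 1]            # toy classifier outputs
p, r, f1, fpr = metrics(y_true, y_pred)
```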
In addition, the ROC curve is introduced by plotting all TPR values on the y-axis against their equivalent FPR values for all available thresholds on the x-axis. A ROC curve closer to the upper left indicates better detection performance, where lower FPR values are achieved at the same TPR. The AUC indicates the
Table 1: Parameters of the proposed CNN-RF model.

Layer          Size of the filter   Number of filters   Stride
Convolution1   3 × 3                32                  1
MaxPooling1    2 × 2                -                   2
Convolution2   3 × 3                64                  1
MaxPooling2    2 × 2                -                   2
quality of the classifier, and a larger AUC represents better performance.
6.2. Experiment Based on the CNN-RF Model. The experiment is conducted based on the normalized consumption data. With a batch size of 100, the training procedure is stopped after 300 epochs. As shown in Figure 8, the training and testing losses settle down smoothly, which means that the proposed model learned the train dataset and exhibited small bias on the test dataset as well. The RF classifier then serves as the last layer of the CNN to predict the labels of input consumers. Forty values from the fully connected layer of the trained CNN are used as a new feature vector to detect abnormal users and are fed to the RF for learning and testing. To build the RF in the hybrid model, the optimal parameters are determined by applying the grid search algorithm based on the training dataset. The grid searching range of each parameter is given as follows: numTrees ∈ {50, 60, ..., 100} and maxDepth ∈ {10, 20, 30, 40}, so 6 × 4 = 24 combinations are tried with
Algorithm 1: Random forest algorithm.

Require:
(1) Original training dataset S = {(x_i, y_i), i = 1, 2, 3, ..., m}, (X, Y) ∈ R^e × R
(2) Test dataset x_j ∈ R^m
for d = 1 to Ntree do
  (1) Draw a bootstrap sample S_d from the original training data
  (2) Grow an unpruned tree h_d using data S_d:
    (a) Randomly select a new feature set M_try from the original feature set e
    (b) Select the best feature from M_try based on the Gini indicator at each node
    (c) Split until each tree grows to its maximum
end for
Ensure:
(1) The collection of trees {h_d, d = 1, 2, ..., Ntree}
(2) For x_j, the output of a single decision tree is h_d(x_j)
f(x_j) = majority vote {h_d(x_j)} over d = 1, ..., Ntree
return f(x_j)
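A sketch of the grid search over the RF parameters using scikit-learn [34], which the paper names as its implementation library. The feature matrix below is an invented stand-in for the 40-dimensional CNN feature vectors, and the parameter grid is reduced for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.RandomState(0)
# stand-in for the 40-dimensional feature vectors from the CNN's fully
# connected layer (invented data; labels follow a simple rule)
X = rng.rand(120, 40)
y = (X[:, 0] > 0.5).astype(int)

# reduced grid for illustration; the paper searches 6 x 4 = 24 combinations
param_grid = {"n_estimators": [50, 100], "max_depth": [10, 30]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)
best_rf = search.best_estimator_  # refit on the whole train set by default
```

`best_params_` then plays the role of the (numTrees, maxDepth) pair selected in Section 6.2.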
[Figure 7: Process of forward propagation (a) and backpropagation (b).]
[Figure 8: Loss curves of training and testing over 300 epochs.]
Table 2: Confusion matrix applied in electricity theft detection.

Actual \ Detected   Normal               Abnormal
Normal              TP (true positive)   FN (false negative)
Abnormal            FP (false positive)  TN (true negative)
different combinations. The best result is achieved when numTrees = 100 and maxDepth = 30; these parameters are then used to train the hybrid model.

The classification results of the detection model proposed in this paper are as follows: the numbers of TPs, FNs, FPs, and TNs are 1041, 28, 23, and 577, respectively. Therefore, the precision, recall, and F1 score can be calculated as shown in Table 3, where Class 0 is the abnormal pattern and Class 1 is the normal pattern. Also, the AUC value of the CNN-RF algorithm is calculated to be 0.98, far better than the value of the baseline model (AUC = 0.5). This means the proposed algorithm is able to classify both classes accurately, as shown in Figure 9.
6.3. Comparative Experiments and the Analysis of Results. To evaluate the accuracy of the CNN-RF model, comparative experiments are conducted based on handcrafted features with nondeep learning methods, including SVM, RF, gradient boosting decision tree (GBDT), and logistic regression (LR). Moreover, we compared the classification results obtained by various supervised classifiers, namely CNN features with the SVM classifier (CNN-SVM) and CNN features with the GBDT classifier (CNN-GBDT), to those obtained by the previous classification work. In the following, the five kinds of methods are introduced, and then the results are analysed.
(1) Logistic regression: a basic binary classification model that is equivalent to one layer of a neural network with a sigmoid activation function. The sigmoid function obtains a value ranging from 0 to 1 based on linear regression; any value larger than 0.5 is then classified as a normal pattern, and any value less than 0.5 as an abnormal pattern.

(2) Support vector machine: this classifier with kernel functions can transform a nonlinearly separable problem into a linearly separable one by projecting data into the feature space and then finding the optimal separating hyperplane. It is applied to predict the patterns of consumers based on the aforementioned handcrafted features.

(3) Random forest: essentially an ensemble of multiple decision trees that can achieve better performance than a single decision tree while effectively controlling overfitting. The RF classifier can also handle high-dimensional data while maintaining high computational efficiency.

(4) Gradient boosting decision tree: an iterative decision tree algorithm which consists of multiple decision trees. For the final output, all results or weights are accumulated in the GBDT, whereas random forests use majority voting.

(5) Deep learning methods: a CNN with softmax in the last layer, as well as CNN-GBDT and CNN-SVM, are compared with the proposed method.
Table 4 summarizes the parameters of the comparative methods, and all experiments were conducted accordingly. Figure 10 shows the results of the different methods and the proposed hybrid CNN-RF method: the AUC values of CNN-RF, CNN-GBDT, CNN-SVM, CNN, SVM, RF, LR, and GBDT are 0.99, 0.97, 0.98, 0.93, 0.77, 0.91, 0.63, and 0.77, respectively. Figure 11 presents the results of all comparative experiments in terms of precision, recall, and F1 score. Among the performances of the eight different electricity theft detection algorithms, the deep learning methods (CNN, CNN-RF, CNN-SVM, and CNN-GBDT) show better performance than the machine learning methods (LR, SVM, GBDT, and RF). The reason for this result is that the CNN can learn features from a large amount of electricity consumption data; thus, the accuracy of electricity theft detection is greatly improved. In addition, the proposed CNN-RF model shows the best performance compared with the other methods.
Table 3: Classification score summary for CNN-RF.

Class           Precision   Recall   F1 score
Class 0         0.97        0.96     0.96
Class 1         0.98        0.98     0.98
Average/total   0.97        0.97     0.97
[Figure 9: ROC curve for CNN-RF (TPR against FPR); AUC (CNN-RF) = 0.99.]
Table 4: Experimental parameters for the comparative methods.

Methods   Input data      Parameters
LR        Features (1D)   Penalty: L2; inverse of regularization strength: 1.0
SVM       Features (1D)   Penalty parameter of the error term: 105; parameter of the kernel function (RBF): 0.0005
GBDT      Features (1D)   Number of estimators: 200
RF        Features (1D)   Number of trees in the forest: 100; max depth of each tree: 30
CNN       Raw data (2D)   The same as the CNN component of the proposed approach
As detecting electricity theft consumers amounts to discovering the anomalous consumption behaviours of suspicious customers, it can essentially be regarded as an anomaly detection problem. Since CNN-RF has shown superior performance on the SEAI dataset, it is interesting to test its generalization ability on other datasets. In particular, we have further tested the performance of CNN-RF, together with the above-mentioned models, on another dataset: the Low Carbon London (LCL) dataset [35]. The LCL dataset is composed of over 5500 honest electricity consumers in total, each of which consists of 525 days of readings at an hourly sampling rate. Among the 5500 customers, we randomly chose 1200 as malicious customers and modified their intraday load profiles according to the fraudulent sample generation methods, similar to the SEAI dataset.

After preprocessing the LCL dataset (similar to the SEAI dataset), the train dataset contains 7584 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1700 customers together with corresponding labels indicating anomaly or not. The training of the proposed CNN-RF model and the comparative models is then all done following the aforementioned procedure
[Figure 11: Comparison with other algorithms (CNN, CNN-RF, CNN-GBDT, CNN-SVM, GBDT, LR, RF, SVM) based on the SEAI dataset in terms of precision, recall, and F1 score.]
[Figure 10: ROC curves (TPR against FPR) for eight different algorithms based on the SEAI dataset. AUC: CNN = 0.93, CNN-RF = 0.99, CNN-GBDT = 0.97, CNN-SVM = 0.98, GBDT = 0.77, LR = 0.63, RF = 0.91, SVM = 0.77.]
[Figure 12: ROC curves (TPR against FPR) for eight different algorithms based on the LCL dataset. AUC: CNN = 0.95, CNN-RF = 0.97, CNN-GBDT = 0.94, CNN-SVM = 0.96, GBDT = 0.74, LR = 0.68, RF = 0.77, SVM = 0.71.]
[Figure 13: Comparison with other algorithms (CNN, CNN-RF, CNN-GBDT, CNN-SVM, GBDT, SVM, LR, RF) based on the LCL dataset in terms of precision, recall, and F1 score.]
based on the train dataset. The performances of all the models on the test dataset are shown in Figures 12 and 13.

Figures 12 and 13 show that the performances of CNN-SVM, CNN-GBDT, and CNN-RF are better than those of the other models on the whole. Since they are deep learning models, distinguishable features can be learned directly from raw smart meter data by the CNN, which equips them with better generalization ability. In addition, the CNN-RF model achieves the best performance, since the RF can handle high-dimensional data while maintaining high computational efficiency.
7. Conclusions

In this paper, a novel CNN-RF model is presented to detect electricity theft. In this model, the CNN acts as an automatic feature extractor for investigating smart meter data, and the RF is the output classifier. Because a large number of parameters must be optimized, which increases the risk of overfitting, a fully connected layer with a dropout rate of 0.4 is designed for the training phase. In addition, the SMOTE algorithm is adopted to overcome the problem of data imbalance. Several machine learning and deep learning methods, such as SVM, RF, GBDT, and LR, are applied to the same problem as benchmarks, and all of these methods have been evaluated on the SEAI and LCL datasets. The results indicate that the proposed CNN-RF model is a promising classification method in the electricity theft detection field because of two properties. The first is that features can be automatically extracted by the hybrid model, whereas the success of most other traditional classifiers relies largely on the retrieval of good hand-designed features, which is a laborious and time-consuming task. The second is that the hybrid model combines the advantages of the RF and the CNN, as both are among the most popular and successful classifiers in the electricity theft detection field.

Since the detection of electricity theft affects the privacy of consumers, future work will focus on investigating how the granularity and duration of smart meter data might affect this privacy. Extending the proposed hybrid CNN-RF model to other applications (e.g., load forecasting) is also a task worth investigating.
Data Availability
Previously reported SEAI and LCL datasets were used to support this study and are available at http://www.ucd.ie/issda/data and https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7857. These prior studies (and datasets) are cited at relevant places within the text as references [19, 26].
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Acknowledgments
This work was supported by the National Key Research and Development Program of China (2016YFB0901900), the Natural Science Foundation of Hebei Province of China (F2017501107), the Open Research Fund from the State Key Laboratory of Rolling and Automation, Northeastern University (2017RALKFKT003), the Foundation of Northeastern University at Qinhuangdao (XNB201803), and the Fundamental Research Funds for the Central Universities (N182303037).
References
[1] S. S. S. R. Depuru, L. Wang, and V. Devabhaktuni, "Electricity theft: overview, issues, prevention and a smart meter based approach to control theft," Energy Policy, vol. 39, no. 2, pp. 1007–1015, 2011.
[2] J. P. Navani, N. K. Sharma, and S. Sapra, "Technical and non-technical losses in power system and its economic consequence in Indian economy," International Journal of Electronics and Computer Science Engineering, vol. 1, no. 2, pp. 757–761, 2012.
[3] S. McLaughlin, B. Holbert, A. Fawaz, R. Berthier, and S. Zonouz, "A multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[4] P. McDaniel and S. McLaughlin, "Security and privacy challenges in the smart grid," IEEE Security & Privacy Magazine, vol. 7, no. 3, pp. 75–77, 2009.
[5] T. B. Smith, "Electricity theft: a comparative analysis," Energy Policy, vol. 32, no. 1, pp. 2067–2076, 2004.
[6] J. I. Guerrero, C. Leon, I. Monedero, F. Biscarri, and J. Biscarri, "Improving knowledge-based systems with statistical techniques, text mining, and neural networks for non-technical loss detection," Knowledge-Based Systems, vol. 71, no. 4, pp. 376–388, 2014.
[7] C. C. O. Ramos, A. N. Souza, G. Chiachia, A. X. Falcão, and J. P. Papa, "A novel algorithm for feature selection using harmony search and its application for non-technical losses detection," Computers & Electrical Engineering, vol. 37, no. 6, pp. 886–894, 2011.
[8] P. Glauner, J. A. Meira, P. Valtchev, R. State, and F. Bettinger, "The challenge of non-technical loss detection using artificial intelligence: a survey," International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 760–775, 2017.
[9] S.-C. Huang, Y.-L. Lo, and C.-N. Lu, "Non-technical loss detection using state estimation and analysis of variance," IEEE Transactions on Power Systems, vol. 28, no. 3, pp. 2959–2966, 2013.
[10] O. Rahmati, H. R. Pourghasemi, and A. M. Melesse, "Application of GIS-based data driven random forest and maximum entropy models for groundwater potential mapping: a case study at Mehran region, Iran," CATENA, vol. 137, pp. 360–372, 2016.
[11] N. Edison, A. C. Aranha, and J. Coelho, "Probabilistic methodology for technical and non-technical losses estimation in distribution system," Electric Power Systems Research, vol. 97, no. 11, pp. 93–99, 2013.
[12] J. B. Leite and J. R. S. Mantovani, "Detecting and locating non-technical losses in modern distribution networks," IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 1023–1032, 2018.
[13] S. Amin, G. A. Schwartz, A. A. Cardenas, and S. S. Sastry, "Game-theoretic models of electricity theft detection in smart utility networks: providing new capabilities with advanced metering infrastructure," IEEE Control Systems Magazine, vol. 35, no. 1, pp. 66–81, 2015.
[14] A. H. Nizar, Z. Y. Dong, Y. Wang, and A. N. Souza, "Power utility nontechnical loss analysis with extreme learning machine method," IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 946–955, 2008.
[15] J. Nagi, K. S. Yap, S. K. Tiong, S. K. Ahmed, and M. Mohamad, "Nontechnical loss detection for metered customers in power utility using support vector machines," IEEE Transactions on Power Delivery, vol. 25, no. 2, pp. 1162–1171, 2010.
[16] E. W. S. Angelos, O. R. Saavedra, O. A. C. Cortes, and A. N. de Souza, "Detection and identification of abnormalities in customer consumptions in power distribution systems," IEEE Transactions on Power Delivery, vol. 26, no. 4, pp. 2436–2442, 2011.
[17] K. A. P. Costa, L. A. M. Pereira, R. Y. M. Nakamura, C. R. Pereira, J. P. Papa, and A. Xavier Falcão, "A nature-inspired approach to speed up optimum-path forest clustering and its application to intrusion detection in computer networks," Information Sciences, vol. 294, pp. 95–108, 2015.
[18] R. R. Bhat, R. D. Trevizan, X. Li, and A. Bretas, "Identifying nontechnical power loss via spatial and temporal deep learning," in Proceedings of the IEEE International Conference on Machine Learning and Applications, Anaheim, CA, USA, December 2016.
[19] M. Ismail, M. Shahin, M. F. Shaaban, E. Serpedin, and K. Qaraqe, "Efficient detection of electricity theft cyber attacks in AMI networks," in Proceedings of the IEEE Wireless Communications and Networking Conference, Barcelona, Spain, April 2018.
[20] R. Mehrizi, X. Peng, X. Xu, S. Zhang, D. Metaxas, and K. Li, "A computer vision based method for 3D posture estimation of symmetrical lifting," Journal of Biomechanics, vol. 69, no. 1, pp. 40–46, 2018.
[21] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: surpassing human-level performance on ImageNet classification," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034, Santiago, Chile, December 2015.
[22] A. Sharif Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, "CNN features off-the-shelf: an astounding baseline for recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 806–813, Columbus, OH, USA, June 2014.
[23] Z. Zheng, Y. Yang, X. Niu, H.-N. Dai, and Y. Zhou, "Wide and deep convolutional neural networks for electricity-theft detection to secure smart grids," IEEE Transactions on Industrial Informatics, vol. 14, no. 4, pp. 1606–1615, 2018.
[24] D. Svozil, V. Kvasnicka, and J. Pospichal, "Introduction to multi-layer feed-forward neural networks," Chemometrics and Intelligent Laboratory Systems, vol. 39, no. 1, pp. 43–62, 1997.
[25] Irish Social Science Data Archive, http://www.ucd.ie/issda/data/commissionforenergyregulationcer/.
[26] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[27] Y. Wang, Q. Chen, D. Gan, J. Yang, D. S. Kirschen, and C. Kang, "Deep learning-based socio-demographic information identification from smart meter data," IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2593–2602, 2019.
[28] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: a survey," ACM Computing Surveys, vol. 41, no. 3, 2009.
[29] H.-T. Cheng, L. Koc, J. Harmsen et al., "Wide & deep learning for recommender systems," in Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (DLRS 2016), pp. 7–10, Boston, MA, USA, September 2016.
[30] E. Feczko, N. M. Balba, O. Miranda-Dominguez et al., "Subtyping cognitive profiles in autism spectrum disorder using a functional random forest algorithm," NeuroImage, vol. 172, pp. 674–688, 2018.
[31] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A theoretical analysis of feature pooling in visual recognition," in Proceedings of the 27th International Conference on Machine Learning, pp. 111–118, Haifa, Israel, June 2010.
[32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
[33] M. Abadi, A. Agarwal, P. Barham et al., "TensorFlow: large-scale machine learning on heterogeneous distributed systems," 2016, https://arxiv.org/abs/1603.04467.
[34] F. Pedregosa, G. Varoquaux, A. Gramfort et al., "Scikit-learn: machine learning in Python," Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[35] J. R. Schofield, "Low Carbon London project: data from the dynamic time-of-use electricity pricing trial," 2017, https://discover.ukdataservice.ac.uk/catalogue/?sn=7857.
(1) Data cleaning: there are erroneous values (e.g., outliers) in raw data, which correspond to peak electricity consumption activities during holidays or special occasions such as birthdays and celebrations. In this paper, the "three-sigma rule of thumb" [28] is used to restore the outliers according to the following formula:
Figure 2: Load profiles for nonresidential and residential users (four panels: residential users 1–2 and nonresidential users 1–2; usage in kW vs. time of day in h).
Figure 3: Daily load profiles for a customer on working days (Monday–Friday; usage in kW·h vs. time in h).
Figure 4: Daily load profiles for a customer during holidays (Saturday, Sunday, New Year, Halloween, and Christmas; usage in kW·h vs. time in h).
F\left(x_{i,t}\right) =
\begin{cases}
\operatorname{avg}\left(x_{i,t}\right) + 2\sigma\left(x_{i,t}\right), & x_{i,t} > x^{*}_{i,t}, \\
x_{i,t}, & \text{else},
\end{cases} \qquad (1)

where x^{*}_{i,t} is computed from the mean avg(·) and standard deviation σ(·) for each time interval comprising the weekday/time pair for each month.
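As an illustration, rule (1) can be sketched with NumPy. Here the statistics are taken per hourly column over one customer's history, a simplification of the paper's per-weekday, per-month grouping, and the helper name is ours:

```python
import numpy as np

def restore_outliers(x, k=2.0):
    """Cap readings above avg + k*sigma at that threshold (rule (1)).

    x: (days, hours) array for one customer. Statistics are taken per
    hourly column here, a simplification of the per-weekday/month
    grouping described in the paper; the helper name is hypothetical.
    """
    avg = x.mean(axis=0)             # avg(.) per time interval
    sigma = x.std(axis=0)            # sigma(.) per time interval
    threshold = avg + k * sigma      # x* in equation (1)
    return np.minimum(x, threshold)  # outliers are restored to the threshold

x = np.ones((10, 2))
x[9, 0] = 50.0                       # one anomalous spike
cleaned = restore_outliers(x)        # the spike is pulled down to avg + 2*sigma
```

Normal readings are left untouched, since they fall below the per-interval threshold.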
(2) Missing value imputation: because of various reasons such as storage issues and failure of smart meters, there are missing values in electricity consumption data. Through the analysis of the original data, it is found that there are two kinds of data missing: one is the continuous missing of multiple data, and the solution is to delete the users when the number of missing values exceeds 10; the other is missing single data, which is processed by formula (2). Thus, missing values can be recovered as follows:
F\left(x_{i,t}\right) =
\begin{cases}
\dfrac{x_{i,t-1} + x_{i,t+1}}{2}, & x_{i,t} \in \text{NaN}, \\
x_{i,t}, & \text{else},
\end{cases} \qquad (2)

where x_{i,t} stands for the electricity usage of consumer i over a period (e.g., an hour); if x_{i,t} is null, we represent it as NaN.
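A minimal sketch of rule (2) for isolated gaps (the helper name and 1D layout are our assumptions; customers with long runs of missing values are dropped upstream, as described above):

```python
import numpy as np

def impute_single_gaps(x):
    """Fill isolated NaN readings with the mean of their two neighbours,
    as in formula (2).

    x: 1D array of hourly readings for one customer; a hypothetical
    helper, assuming longer NaN runs were handled earlier.
    """
    x = x.copy()
    for t in range(1, len(x) - 1):
        if np.isnan(x[t]) and not np.isnan(x[t - 1]) and not np.isnan(x[t + 1]):
            x[t] = (x[t - 1] + x[t + 1]) / 2.0
    return x

series = np.array([1.0, np.nan, 3.0, 4.0])
filled = impute_single_gaps(series)   # the gap becomes (1.0 + 3.0) / 2 = 2.0
```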
(3) Data normalization: finally, the data need to be normalized because the neural network is sensitive to diverse data. One of the common methods for this scope is min-max normalization, which is computed as

F\left(x_{i,t}\right) = \frac{x_{i,t} - \min\left(x_{i,T}\right)}{\max\left(x_{i,T}\right) - \min\left(x_{i,T}\right)}, \qquad (3)

where min(·) and max(·) represent the minimum and maximum values over a day, respectively.
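Formula (3) amounts to a per-day min-max scaling; a sketch, assuming each row of the array is one day and max > min within every row (the helper name is ours):

```python
import numpy as np

def minmax_per_day(x):
    """Scale each day's readings into [0, 1], as in formula (3).

    x: (days, hours) array; min(.) and max(.) are taken over each row
    (one day). Assumes max > min within every row; a flat day would
    need special handling.
    """
    lo = x.min(axis=1, keepdims=True)
    hi = x.max(axis=1, keepdims=True)
    return (x - lo) / (hi - lo)

day = np.array([[2.0, 4.0, 6.0]])
scaled = minmax_per_day(day)   # [[0.0, 0.5, 1.0]]
```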
4. Generation of Train and Test Datasets

After data processing, the SEAI dataset contains the electricity consumption data of 4737 normal electricity customers and 1200 malicious electricity customers within 525 days (with 24 hourly readings per day).

Our preliminary analytical results (refer to Section 3.1) reveal that consumer load profiles are affected not only by weather conditions but also by the types of consumers and other factors. However, it is difficult to extract features based on experience from the 1D electricity consumption data, since the electricity consumption of each day fluctuates in a relatively independent way. Motivated by the previous work in [29], we design a deep CNN to process the electricity consumption data in a 2D manner. In particular, we transform the 1D electricity consumption data into 2D data according to days. We define the 2D data as a matrix of actual energy consumption values for a specific customer, where the rows represent the days D (D = 1, ..., 525) and the columns represent the time periods T (T = 1, ..., 24). It is worth mentioning that the CNN learns the features between different hours of the day and different days from the 2D data.

Then, we divide the dataset into train and test sets using the cross-validation method, in which 80% are the train set and 20% are the test set. Given that nonfraudulent consumers remarkably outnumber electricity theft ones, the imbalanced nature of the dataset can have a major negative impact on the performance of supervised machine learning methods. To reduce this bias, the SMOTE algorithm is used to make the number of normal and abnormal samples equal in the train set. Finally, the train dataset contains 7484 customers, in which the number of normal and abnormal samples is equal, and the test dataset consists of 1669 customers.
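The 1D-to-2D transformation above can be sketched as a simple reshape (variable names are ours; the subsequent 80/20 split and SMOTE balancing, e.g. via imblearn, are omitted here):

```python
import numpy as np

DAYS, HOURS = 525, 24  # 525 days x 24 hourly readings, as in the paper

def to_matrix(series):
    """Reshape one customer's 1D hourly series into a (days x hours)
    matrix so the CNN can learn correlations across hours and across
    days. Hypothetical helper; assumes a complete, ordered series."""
    series = np.asarray(series)
    assert series.size == DAYS * HOURS
    return series.reshape(DAYS, HOURS)

rng = np.random.default_rng(0)
flat = rng.random(DAYS * HOURS)   # stand-in for one customer's readings
grid = to_matrix(flat)            # row d = day d, column t = hour t
# an 80/20 train/test split over customers would follow, with SMOTE
# (e.g. imblearn.over_sampling.SMOTE) applied to the training set only
```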
5. The Novel CNN-RF Algorithm

In this section, the design of the CNN-RF structure is presented in detail, and some techniques are proposed to reduce overfitting. Moreover, correlative parameters, including the number of filters and the size of each feature map and kernel, are given. Finally, the training process of the proposed CNN-RF algorithm is introduced.

5.1. CNN-RF Architecture. The proposed CNN-RF model is designed by integrating CNN and RF classifiers. As shown in Figure 6, the architecture of the CNN-RF mainly includes an automatic feature extractor and a trainable RF classifier. The feature extractor CNN consists of convolutional layers, downsampling layers, and a fully connected layer. The CNN is a deep supervised learning architecture which usually includes multiple layers and can be trained with the backpropagation algorithm. It can also explore the intricate distribution in smart meter data by performing the
Figure 5: Four seasons' load profiles for a customer (usage in kW·day over 100 days, for spring, summer, autumn, and winter).
stochastic gradient method. The RF classifier consists of a combination of tree classifiers, in which each tree contributes a single vote to the assignment of the most frequent class to the input data [30]. In the following, the design of each part of the CNN-RF structure is presented in detail.
5.1.1. Convolutional Layer. The main purpose of convolutional layers is to learn the feature representation of the input data and reduce the influence of noise. The convolutional layer is composed of several feature filters, which are used to compute different feature maps. Specially, to reduce the number of network parameters and lower the complexity of the relevant layers during the process of generating feature maps, the weights of each kernel in each feature map are shared. Moreover, each neuron in the convolutional layer connects to a local region of the preceding layer, and then a ReLU activation function is applied to the convolved results. Mathematically, the outputs of the convolutional layers can be expressed as

y_{\text{conv}}\left(X_l^{f_l}\right) = \delta\left(\sum_{f_l=1}^{F_l} W_l^{f_l} \ast X_l^{f_l} + b_l^{f_l}\right), \qquad (4)

where δ is the activation function, ∗ is the convolution operation, and both W_l^{f_l} and b_l^{f_l} are the learnable parameters of the f-th feature filter.
5.1.2. Downsampling Layer. Downsampling is an important concept of the CNN, which is usually placed between two convolutional layers. It can reduce the number of parameters and achieve dimensionality reduction. In particular, each feature map of the pooling layer is connected to its corresponding feature map of the previous convolutional layer, and the size of the output feature maps is reduced without reducing their number. In the literature, downsampling operations mainly include max-pooling and average-pooling. The max-pooling operation transforms small windows into single values by taking the maximum, which is expressed as formula (5), while average-pooling returns the average value of the activations in a small window. In [31], experiments show that max-pooling performs better than average-pooling. Therefore, the max-pooling operation is usually implemented in the downsampling layer:

y_{\text{pool}}\left(X_l^{f_l}\right) = \max_{m \in M}\left\{X_{l,m}\right\}, \qquad (5)

where M is the set of activations in the pooling window and m is the index of activations in the pooling window.
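Equation (5) with the 2×2 window and stride 2 used later in this paper can be sketched in NumPy on a single-channel feature map (the helper name is ours):

```python
import numpy as np

def max_pool_2x2(fmap):
    """Max-pooling of equation (5) with a 2x2 window and stride 2: each
    non-overlapping 2x2 block of the feature map is replaced by its
    maximum. fmap: 2D array (one channel); odd trailing rows/columns
    are trimmed. Hypothetical helper for illustration."""
    h2, w2 = fmap.shape[0] // 2, fmap.shape[1] // 2
    trimmed = fmap[: h2 * 2, : w2 * 2]
    # group into (h2, 2, w2, 2) blocks, then take the max of each block
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

fmap = np.array([[1.0, 2.0, 0.0, 1.0],
                 [3.0, 4.0, 1.0, 0.0],
                 [0.0, 1.0, 5.0, 6.0],
                 [2.0, 0.0, 7.0, 8.0]])
pooled = max_pool_2x2(fmap)   # [[4.0, 1.0], [2.0, 8.0]]
```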
5.1.3. Fully Connected Layer. With the features extracted by sequential convolution and pooling, a fully connected layer is applied to flatten the feature maps into one vector as follows:

y_{\text{fl}}\left(X_l\right) = \delta\left(W_l \cdot X_l + b_l\right), \qquad (6)

where W_l is the weight of the l-th layer and b_l is the bias of the l-th layer.
5.1.4. RF Classifier Layer. Traditionally, a softmax classifier is used for the last output layer of the CNN, but the proposed model uses the RF classifier to predict the class based on the obtained features. Therefore, the RF classifier layer can be defined as

y_{\text{out}}\left(X_L\right) = \operatorname{sigm}\left(W_{\text{rf}} \cdot X_l + b_l\right), \qquad (7)

where sigm(·) is the sigmoid function, which maps abnormal values to 0 and normal values to 1. The parameter set W_{\text{rf}} in the RF layer includes the number of decision trees and the maximum depth of the trees, which are obtained by the grid search algorithm.
5.2. Techniques for Selecting Parameters and Avoiding Overfitting. The detailed parameters of the proposed CNN-RF structure are summarized in Table 1, which include the number of filters in each layer, the filter size, and the stride. Two factors are considered in determining the CNN structure. The first factor is the characteristics of consumers' electricity consumption behaviours. Since the load profiles are so variable, two convolutional layers with a filter size of 3 × 3 (stride 1) are applied to capture hidden features for detecting electricity theft. Moreover, because the dimension of the input data is 525 × 24, which is similar to what is used in image recognition problems, two max-pooling layers with a filter size of 2 × 2 (stride 2) are used. The second factor is the number of training samples: since the number of samples is limited, to reduce the risk of overfitting, we design a fully connected layer with a dropout rate of 0.4, whose main idea
Figure 6: Architecture of the hybrid CNN-RF model (input 525 × 24 → convolutional layer → downsampling layer → convolutional layer → downsampling layer → fully connected layer → random forest layer).
is that dropping units randomly from the neural network during training can prevent units from co-adapting too much [32] and make a neuron not rely on the presence of other specific neurons. Applying an appropriate training method is also useful for reducing overfitting: it is essentially a regularization that adds a penalty to the weight update at each iteration. We also use binary cross entropy as the loss function. Finally, the grid search algorithm is used to optimize RF classifier parameters such as the maximum number of decision trees and features.
5.3. Training Process of CNN-RF. There is no doubt that the CNN-RF structure needs to train the CNN first, before the RF classifier is invoked. After that, the RF classifier is trained based on the features learned by the CNN.
5.3.1. Training Process of CNN. With a batch size (the number of training samples in an iteration) of 100, CNNs are trained using the forward propagation and backpropagation algorithms. As shown in Figure 6, firstly, the data in the input layer are transferred to the convolutional layers, pooling layers, and a fully connected layer, and the predicted value is obtained. As shown in Figure 7, the backpropagation phase begins to update parameters if the difference between the output value and the target value is too large and exceeds a certain threshold. The difference between the actual output of the neural network and the expected output determines the adjustment direction and step size of the weights among the layers of the network.
Given a sample set D = \{(x_1, y_1), \ldots, (x_m, y_m)\} with m samples in total, \tilde{y} is obtained at first by using the feed-forward process. Over all samples, the mean of the difference between the actual output y_i and the expected output \tilde{y}_{w,b}(x_i) can be expressed by

J(W, b) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left\| \tilde{y}_{w,b}\left(x_i\right) - y_i \right\|^2, \qquad (8)

where w represents the connection weights between the layers of the network and b is the corresponding bias.
Then, each parameter is initialized with a random value generated by a normal distribution with mean 0 and variance ϑ, and the parameters are updated with the gradient descent method. The corresponding expressions are as follows:

W_{ij}^{(l)} = W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W, b), \qquad (9)

b_{i}^{(l)} = b_{i}^{(l)} - \alpha \frac{\partial}{\partial b_{i}^{(l)}} J(W, b), \qquad (10)

where α represents the learning rate, W_{ij}^{(l)} represents the connection weight between the i-th neuron in the l-th layer and the j-th neuron in the (l + 1)-th layer, and b_i^{(l)} represents the bias of the i-th neuron in the l-th layer.

Finally, all parameters W and b in the network are updated according to formulas (9) and (10), and all these steps are repeated to reduce the objective function J(W, b).
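Updates (9) and (10) can be illustrated on a toy linear neuron trained with plain gradient descent; this is a sketch of the update rule on synthetic data, not the paper's CNN training loop:

```python
import numpy as np

# Toy illustration of updates (9) and (10) on a single linear neuron.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.3                      # known weights and bias to recover

w = rng.normal(scale=0.1, size=3)         # normal random initialisation
b = 0.0
alpha = 0.1                               # learning rate

for _ in range(200):
    err = X @ w + b - y                   # y~_{w,b}(x_i) - y_i
    w -= alpha * (X.T @ err / len(y))     # update (9): w <- w - alpha * dJ/dw
    b -= alpha * err.mean()               # update (10): b <- b - alpha * dJ/db

loss = np.mean(0.5 * (X @ w + b - y) ** 2)  # objective J(W, b), equation (8)
```

After 200 iterations the recovered parameters are close to the true ones and the objective is near zero.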
5.3.2. Training Process of RF Classifier. The RF classifier takes advantage of two powerful machine learning techniques: bagging and random feature selection. Bagging is a method that randomly generates a particular bootstrap sample from the learned features for growing each tree. Instead of using all features, the RF randomly selects a subset of features to determine the best splitting points, which ensures complete splitting down to single-classification leaf nodes of the decision trees. Specifically, the whole process of RF classification can be written as Algorithm 1.
6. Experiments and Result Analysis

All the experiments are implemented using Python 3.6 on a standard PC with an Intel Core i5-7500MQ CPU running at 3.40 GHz and with 8.0 GB of RAM. The CNN architecture is constructed based on TensorFlow [33], and the interface between the CNN and the RF is programmed using scikit-learn [34].
6.1. Performance Metrics. In this paper, the problem of electricity theft is considered a discrete two-class classification task, which should assign each consumer to one of the predefined classes (abnormal or normal).

In particular, the result of classifier validation is often presented as confusion matrices. Table 2 shows a confusion matrix applied in electricity theft detection, where TP, FN, FP, and TN respectively represent the number of consumers that are classified correctly as normal, classified falsely as abnormal, classified falsely as normal, and classified correctly as abnormal. Although these four indices show all of the information about the performance of the classifier, more meaningful performance measures can be extracted from them to illustrate certain performance criteria, such as TPR = TP/(TP + FN), FPR = FP/(FP + TN), precision = TP/(TP + FP), recall = TP/(TP + FN), and F1 score = 2 × (precision × recall)/(precision + recall), where the true positive rate (TPR) is the ability of the classifier to correctly identify a consumer as normal, also referred to as sensitivity, and the false positive rate (FPR) is the risk of positively identifying an abnormal consumer. Precision is the ratio of consumers detected correctly to the total number of detected normal consumers. Recall is the ratio of the number of consumers correctly detected to the actual number of consumers (Figure 7).
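As a worked example, the measures above can be computed directly from a confusion matrix; the counts used here are those reported for the proposed model in Section 6.2 (TP = 1041, FN = 28, FP = 23, TN = 577):

```python
# Worked example: the performance measures computed from the confusion
# counts reported for the proposed model in Section 6.2.
TP, FN, FP, TN = 1041, 28, 23, 577

tpr = TP / (TP + FN)        # sensitivity (equals recall)
fpr = FP / (FP + TN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.978 0.974 0.976
```

These overall values are consistent with the per-class scores in Table 3.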
In addition, the ROC curve is introduced by plotting all TPR values on the y-axis against their equivalent FPR values for all available thresholds on the x-axis. A ROC curve toward the upper left indicates better detection performance, where lower FPR values are obtained at the same TPR. The AUC indicates the
Table 1: Parameters of the proposed CNN-RF model.

Layer          Size of the filter   Number of filters   Stride
Convolution1   3 × 3                32                  1
MaxPooling1    2 × 2                —                   2
Convolution2   3 × 3                64                  1
MaxPooling2    2 × 2                —                   2
quality of the classifier, and a larger AUC represents better performance.
6.2. Experiment Based on CNN-RF Model. The experiment is conducted based on the normalized consumption data. With a batch size of 100, the training procedure is stopped after 300 epochs. As shown in Figure 8, the training and testing losses settled down smoothly. This means that the proposed model learned the train dataset, with only a small bias on the test dataset as well. Then, the RF classifier is used as the last layer of the CNN to predict the labels of input consumers. Forty values from the fully connected layer of the trained CNN are used as a new feature vector to detect abnormal users and are fed to the RF for learning and testing. To build the RF in the hybrid model, the optimal parameters are determined by applying the grid search algorithm based on the training dataset. The grid searching range of each parameter is given as follows: numTrees ∈ [50, 60, ..., 100] and maxDepth ∈ [10, 20, ..., 40]; 6 × 4 = 24 are tried with
Require:
(1) Original training dataset S = {(x_i, y_i), i = 1, 2, 3, ..., m}, (X, Y) ∈ R^e × R
(2) Test dataset x_j ∈ R^m
for d = 1 to Ntree do
    (1) Draw a bootstrap sample S_d from the original training data
    (2) Grow an unpruned tree h_d using the data S_d:
        (a) randomly select a new feature set M_try from the original feature set e
        (b) select the best feature from the set M_try based on the Gini indicator at each node
        (c) split until each tree grows to its maximum
end for
Ensure:
(1) The collection of trees {h_d, d = 1, 2, ..., Ntree}
(2) For x_j, the output of a decision tree is h_d(x_j);
    f(x_j) = majority vote {h_d(x_j)}, d = 1, ..., Ntree
return f(x_j)

Algorithm 1: Random forest algorithm.
Figure 7: Process of forward propagation (a) and backpropagation (b).
Figure 8: Loss curves of training and testing (loss vs. epoch, 0–300).
Table 2: Confusion matrix applied in electricity theft detection.

Actual\Detected   Normal               Abnormal
Normal            TP (true positive)   FN (false negative)
Abnormal          FP (false positive)  TN (true negative)
different combinations. The best result is achieved when numTrees = 100 and maxDepth = 30. These parameters are then used to train the hybrid model.
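This parameter sweep can be reproduced with scikit-learn's GridSearchCV; the sketch below runs on synthetic stand-ins for the 40-dimensional CNN feature vectors (the data and variable names are ours, and the grid is truncated for brevity):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))            # stand-in for the 40-D CNN features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic labels for illustration

param_grid = {
    "n_estimators": [50, 100],   # numTrees (grid truncated for brevity)
    "max_depth": [10, 30],       # maxDepth
}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3, scoring="f1")
search.fit(X, y)
best = search.best_params_       # best (numTrees, maxDepth) combination
```

The real search in the paper sweeps 6 × 4 = 24 combinations over the CNN-extracted training features.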
The classification results of the detection model proposed in this paper are as follows: the numbers of TPs, FNs, FPs, and TNs are 1041, 28, 23, and 577, respectively. Therefore, the precision, recall, and F1 score can be calculated as shown in Table 3, where Class 0 is the abnormal pattern and Class 1 is the normal pattern. Also, the AUC value of the CNN-RF algorithm is calculated, which is 0.98 and far better than the value of the baseline model (AUC = 0.5). This means the proposed algorithm is able to classify both classes accurately, as shown in Figure 9.
6.3. Comparative Experiments and Analysis of Results. To evaluate the accuracy of the CNN-RF model, comparative experiments are conducted based on handcrafted features with non-deep-learning methods, including SVM, RF, gradient boosting decision tree (GBDT), and logistic regression (LR). Moreover, we compare the classification results obtained by various supervised classifiers, CNN features with the SVM classifier (CNN-SVM) and CNN features with the GBDT classifier (CNN-GBDT), to those obtained by the previous classification work. In the following, the five methods are introduced, respectively, and then the results are analysed.
(1) Logistic regression: it is a basic model for binary classification that is equivalent to a one-layer neural network with a sigmoid activation function. The sigmoid function maps the output of a linear regression to a value between 0 and 1; any value larger than 0.5 is then classified as a normal pattern, and any value less than 0.5 as an abnormal pattern.

(2) Support vector machine: this classifier with kernel functions can transform a nonlinearly separable problem into a linearly separable one by projecting the data into the feature space and then finding the optimal separating hyperplane. It is applied to predict the patterns of consumers based on the aforementioned handcrafted features.

(3) Random forest: it is essentially an ensemble of multiple decision trees that can achieve better performance while maintaining effective control of overfitting in comparison with a single decision tree. The RF classifier can also handle high-dimensional data while maintaining high computational efficiency.

(4) Gradient boosting decision tree: it is an iterative decision tree algorithm which consists of multiple decision trees. For the final output, all results or weights are accumulated in the GBDT, whereas random forests use majority voting.

(5) Deep learning methods: the CNN (with softmax used in its last layer), CNN-GBDT, and CNN-SVM are compared with the proposed method.
Table 4 summarizes the parameters of the comparative methods. After all experiments have been run, Figure 10 shows the results of the different methods and the proposed hybrid CNN-RF method; the AUC values of CNN-RF, CNN-GBDT, CNN-SVM, CNN, SVM, RF, LR, and GBDT are 0.99, 0.97, 0.98, 0.93, 0.77, 0.91, 0.63, and 0.77, respectively. Figure 11 presents the results of all comparative experiments in terms of precision, recall, and F1 score. Among the performances of the eight different electricity theft detection algorithms, the deep learning methods (CNN, CNN-RF, CNN-SVM, and CNN-GBDT) show better performance than the machine learning methods (LR, SVM, GBDT, and RF). The reason for this result is that the CNN can learn features from a large amount of electricity consumption data; thus, the accuracy of electricity theft detection is greatly improved. In addition, the proposed CNN-RF model shows the best performance compared with the other methods.
Table 3: Classification score summary for CNN-RF.

                Precision   Recall   F1 score
Class 0         0.97        0.96     0.96
Class 1         0.98        0.98     0.98
Average/total   0.97        0.97     0.97
Figure 9: ROC curve for CNN-RF (AUC = 0.99).
Table 4: Experimental parameters for comparative methods.

Methods   Input data      Parameters
LR        Features (1D)   Penalty: L2; inverse of regularization strength: 10
SVM       Features (1D)   Penalty parameter of the error term: 105; parameter of the kernel function (RBF): 0.0005
GBDT      Features (1D)   Number of estimators: 200; max depth of each tree: 30
RF        Features (1D)   Number of trees in the forest: 100
CNN       Raw data (2D)   Same as the CNN component of the proposed approach
As detecting electricity theft consumers is achieved by discovering the anomalous consumption behaviours of suspicious customers, we can also regard it essentially as an anomaly detection problem. Since the CNN-RF has shown superior performance on the SEAI dataset, it would be interesting to test its generalization ability on other datasets. In particular, we have further tested the performance of the CNN-RF, together with the above-mentioned models, on another dataset: the Low Carbon London (LCL) dataset [35]. The LCL dataset is composed of over 5500 honest electricity consumers in total, each of which consists of 525 days of data (sampled hourly). Among the 5500 customers, we randomly chose 1200 as malicious customers and modified their intraday load profiles according to fraudulent sample generation methods similar to those used for the SEAI dataset.

After preprocessing the LCL dataset (similar to the SEAI dataset), the train dataset contains 7584 customers, in which the number of normal and abnormal samples is equal, and the test dataset consists of 1700 customers together with corresponding labels indicating anomaly or not. The training of the proposed CNN-RF model and the comparative models is then all done following the aforementioned procedure
Figure 11: Comparison with other algorithms based on the SEAI dataset (precision, recall, and F1 score for each algorithm).
Figure 10: ROC curves for eight different algorithms based on the SEAI dataset. AUC values: CNN-RF = 0.99, CNN-SVM = 0.98, CNN-GBDT = 0.97, CNN = 0.93, RF = 0.91, GBDT = 0.77, SVM = 0.77, LR = 0.63.
Figure 12: ROC curves for eight different algorithms based on the LCL dataset. AUC values: CNN-RF = 0.97, CNN-SVM = 0.96, CNN = 0.95, CNN-GBDT = 0.94, RF = 0.77, GBDT = 0.74, SVM = 0.71, LR = 0.68.
Figure 13: Comparison with other algorithms based on the LCL dataset (precision, recall, and F1 score for each algorithm).
based on the train dataset. The performances of all the models on the test dataset are shown in Figures 12 and 13.

Figures 12 and 13 show that the performances of CNN-SVM, CNN-GBDT, and CNN-RF are better than those of the other models on the whole. Since they are deep learning models, distinguishable features can be learned directly from raw smart meter data by the CNN, which equips them with better generalization ability. In addition, the CNN-RF model achieves the best performance, since the RF can handle high-dimensional data while maintaining high computational efficiency.
7. Conclusions

In this paper, a novel CNN-RF model is presented to detect electricity theft. In this model, the CNN acts as an automatic feature extractor investigating the smart meter data, and the RF is the output classifier. Because a large number of parameters must be optimized, which increases the risk of overfitting, a fully connected layer with a dropout rate of 0.4 is designed for the training phase. In addition, the SMOTE algorithm is adopted to overcome the problem of data imbalance. Several machine learning and deep learning methods, such as SVM, RF, GBDT, and LR, are applied to the same problem as benchmarks, and all of these methods have been evaluated on the SEAI and LCL datasets. The results indicate that the proposed CNN-RF model is quite a promising classification method in the electricity theft detection field because of two properties. The first is that features can be automatically extracted by the hybrid model, whereas the success of most other traditional classifiers relies largely on the retrieval of good hand-designed features, which is a laborious and time-consuming task. The second lies in that the hybrid model combines the advantages of the RF and the CNN, as both are among the most popular and successful classifiers in the electricity theft detection field.
Since the detection of electricity theft affects the privacy of consumers, future work will focus on investigating how the granularity and duration of smart meter data might affect this privacy. Extending the proposed hybrid CNN-RF model to other applications (e.g., load forecasting) is also a task worth investigating.
Data Availability
Previously reported [SEAI] and [LCL] datasets were used to support this study and are available at http://www.ucd.ie/issda/data and https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7857. These prior studies (and datasets) are cited at relevant places within the text as references [19, 26].
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Acknowledgments
This work was supported by the National Key Research and Development Program of China (2016YFB0901900), the Natural Science Foundation of Hebei Province of China (F2017501107), the Open Research Fund from the State Key Laboratory of Rolling and Automation, Northeastern University (2017RALKFKT003), the Foundation of Northeastern University at Qinhuangdao (XNB201803), and the Fundamental Research Funds for the Central Universities (N182303037).
References
[1] S. S. S. R. Depuru, L. Wang, and V. Devabhaktuni, "Electricity theft: overview, issues, prevention and a smart meter based approach to control theft," Energy Policy, vol. 39, no. 2, pp. 1007–1015, 2011.
[2] J. P. Navani, N. K. Sharma, and S. Sapra, "Technical and non-technical losses in power system and its economic consequence in Indian economy," International Journal of Electronics and Computer Science Engineering, vol. 1, no. 2, pp. 757–761, 2012.
[3] S. McLaughlin, B. Holbert, A. Fawaz, R. Berthier, and S. Zonouz, "A multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[4] P. McDaniel and S. McLaughlin, "Security and privacy challenges in the smart grid," IEEE Security & Privacy Magazine, vol. 7, no. 3, pp. 75–77, 2009.
[5] T. B. Smith, "Electricity theft: a comparative analysis," Energy Policy, vol. 32, no. 1, pp. 2067–2076, 2004.
[6] J. I. Guerrero, C. Leon, I. Monedero, F. Biscarri, and J. Biscarri, "Improving knowledge-based systems with statistical techniques, text mining, and neural networks for non-technical loss detection," Knowledge-Based Systems, vol. 71, no. 4, pp. 376–388, 2014.
[7] C. C. O. Ramos, A. N. Souza, G. Chiachia, A. X. Falcão, and J. P. Papa, "A novel algorithm for feature selection using harmony search and its application for non-technical losses detection," Computers & Electrical Engineering, vol. 37, no. 6, pp. 886–894, 2011.
[8] P. Glauner, J. A. Meira, P. Valtchev, R. State, and F. Bettinger, "The challenge of non-technical loss detection using artificial intelligence: a survey," International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 760–775, 2017.
[9] S.-C. Huang, Y.-L. Lo, and C.-N. Lu, "Non-technical loss detection using state estimation and analysis of variance," IEEE Transactions on Power Systems, vol. 28, no. 3, pp. 2959–2966, 2013.
[10] O. Rahmati, H. R. Pourghasemi, and A. M. Melesse, "Application of GIS-based data driven random forest and maximum entropy models for groundwater potential mapping: a case study at Mehran region, Iran," CATENA, vol. 137, pp. 360–372, 2016.
[11] N. Edison, A. C. Aranha, and J. Coelho, "Probabilistic methodology for technical and non-technical losses estimation in distribution system," Electric Power Systems Research, vol. 97, no. 11, pp. 93–99, 2013.
[12] J. B. Leite and J. R. S. Mantovani, "Detecting and locating non-technical losses in modern distribution networks," IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 1023–1032, 2018.
[13] S. Amin, G. A. Schwartz, A. A. Cardenas, and S. S. Sastry, "Game-theoretic models of electricity theft detection in smart utility networks: providing new capabilities with advanced metering infrastructure," IEEE Control Systems Magazine, vol. 35, no. 1, pp. 66–81, 2015.
[14] A. H. Nizar, Z. Y. Dong, Y. Wang, and A. N. Souza, "Power utility nontechnical loss analysis with extreme learning machine method," IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 946–955, 2008.
[15] J. Nagi, K. S. Yap, S. K. Tiong, S. K. Ahmed, and M. Mohamad, "Nontechnical loss detection for metered customers in power utility using support vector machines," IEEE Transactions on Power Delivery, vol. 25, no. 2, pp. 1162–1171, 2010.
[16] E. W. S. Angelos, O. R. Saavedra, O. A. C. Cortes, and A. N. de Souza, "Detection and identification of abnormalities in customer consumptions in power distribution systems," IEEE Transactions on Power Delivery, vol. 26, no. 4, pp. 2436–2442, 2011.
[17] K. A. P. Costa, L. A. M. Pereira, R. Y. M. Nakamura, C. R. Pereira, J. P. Papa, and A. X. Falcão, "A nature-inspired approach to speed up optimum-path forest clustering and its application to intrusion detection in computer networks," Information Sciences, vol. 294, pp. 95–108, 2015.
[18] R. R. Bhat, R. D. Trevizan, X. Li, and A. Bretas, "Identifying nontechnical power loss via spatial and temporal deep learning," in Proceedings of the IEEE International Conference on Machine Learning and Applications, Anaheim, CA, USA, December 2016.
[19] M. Ismail, M. Shahin, M. F. Shaaban, E. Serpedin, and K. Qaraqe, "Efficient detection of electricity theft cyber attacks in AMI networks," in Proceedings of the IEEE Wireless Communications and Networking Conference, Barcelona, Spain, April 2018.
[20] R. Mehrizi, X. Peng, X. Xu, S. Zhang, D. Metaxas, and K. Li, "A computer vision based method for 3D posture estimation of symmetrical lifting," Journal of Biomechanics, vol. 69, no. 1, pp. 40–46, 2018.
[21] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: surpassing human-level performance on ImageNet classification," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 1026–1034, Santiago, Chile, December 2015.
[22] A. Sharif Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, "CNN features off-the-shelf: an astounding baseline for recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 806–813, Columbus, OH, USA, June 2014.
[23] Z. Zheng, Y. Yang, X. Niu, H.-N. Dai, and Y. Zhou, "Wide and deep convolutional neural networks for electricity-theft detection to secure smart grids," IEEE Transactions on Industrial Informatics, vol. 14, no. 4, pp. 1606–1615, 2018.
[24] D. Svozil, V. Kvasnicka, and J. Pospichal, "Introduction to multi-layer feed-forward neural networks," Chemometrics and Intelligent Laboratory Systems, vol. 39, no. 1, pp. 43–62, 1997.
[25] Irish Social Science Data Archive, http://www.ucd.ie/issda/data/commissionforenergyregulationcer/.
[26] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[27] Y. Wang, Q. Chen, D. Gan, J. Yang, D. S. Kirschen, and C. Kang, "Deep learning-based socio-demographic information identification from smart meter data," IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2593–2602, 2019.
[28] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: a survey," ACM Computing Surveys, vol. 41, no. 3, 2009.
[29] H.-T. Cheng, L. Koc, J. Harmsen et al., "Wide & deep learning for recommender systems," in Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (DLRS 2016), pp. 7–10, Boston, MA, USA, September 2016.
[30] E. Feczko, N. M. Balba, O. Miranda-Dominguez et al., "Subtyping cognitive profiles in autism spectrum disorder using a functional random forest algorithm," NeuroImage, vol. 172, pp. 674–688, 2018.
[31] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A theoretical analysis of feature pooling in visual recognition," in Proceedings of the 27th International Conference on Machine Learning, pp. 111–118, Haifa, Israel, June 2010.
[32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
[33] M. Abadi, A. Agarwal, P. Barham et al., "TensorFlow: large-scale machine learning on heterogeneous distributed systems," 2016, https://arxiv.org/abs/1603.04467.
[34] F. Pedregosa, G. Varoquaux, A. Gramfort et al., "Scikit-learn: machine learning in Python," Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[35] J. R. Schofield, "Low Carbon London project: data from the dynamic time-of-use electricity pricing trial," 2017, https://discover.ukdataservice.ac.uk/catalogue/?sn=7857.
$$F(x_i^t) = \begin{cases} \operatorname{avg}(x_i^t) + 2\sigma(x_i^t), & x_i^t > x_i^{t\ast} \\ x_i^t, & \text{else} \end{cases} \quad (1)$$

where $x_i^{t\ast}$ is computed from the mean avg(·) and standard deviation σ(·) for each time interval comprising the weekday/time pair for each month.
(2) Missing value imputation: because of various reasons, such as storage issues and failure of smart meters, there are missing values in the electricity consumption data. Through the analysis of the original data, it is found that there are two kinds of missing data. One is a continuous run of multiple missing values, and the solution is to delete a user when the number of missing values exceeds 10%; the other is a single missing value, which is processed by formula (2). Thus, missing values can be recovered as follows:
$$F(x_i^t) = \begin{cases} \dfrac{x_i^{t-1} + x_i^{t+1}}{2}, & x_i^t \in \text{NaN} \\ x_i^t, & \text{else} \end{cases} \quad (2)$$

where $x_i^t$ stands for the electricity usage of consumer $i$ over a period (e.g., an hour); if $x_i^t$ is null, we represent it as NaN.
(3) Data normalization: finally, the data need to be normalized, because the neural network is sensitive to diverse data scales. One of the common methods for this purpose is min-max normalization, which is computed as
$$F(x_i^t) = \frac{x_i^t - \min(x_i^T)}{\max(x_i^T) - \min(x_i^T)} \quad (3)$$

where min(·) and max(·) represent the minimum and maximum values over a day, respectively.
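The imputation rule of formula (2) and the day-wise min-max scaling of formula (3) can be sketched in pure Python. This is an illustrative sketch, not the authors' code: `usage` is a hypothetical list of hourly readings for one consumer, `None` marks a missing value (NaN), and the 10% deletion threshold is the one described in step (2).

```python
def impute(usage, max_missing_ratio=0.10):
    """Formula (2): fill an isolated missing reading with the mean of its
    neighbours; drop the consumer entirely when too much data is missing."""
    if sum(v is None for v in usage) / len(usage) > max_missing_ratio:
        return None  # consumer removed from the dataset (continuous-gap rule)
    filled = list(usage)
    for t in range(1, len(filled) - 1):  # interior single gaps only
        if filled[t] is None:
            filled[t] = (filled[t - 1] + filled[t + 1]) / 2
    return filled

def min_max_day(day):
    """Formula (3): scale one day's readings into [0, 1]."""
    lo, hi = min(day), max(day)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in day]

usage = [2.0, None, 4.0, 3.0, 3.5, 2.5, 3.0, 2.0, 4.0, 3.0, 2.0, 3.0]
print(impute(usage)[1])              # -> 3.0
print(min_max_day([1.0, 2.0, 5.0]))  # -> [0.0, 0.25, 1.0]
```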
4. Generation of Train and Test Datasets

After data preprocessing, the SEAI dataset contains the electricity consumption data of 4737 normal electricity customers and 1200 malicious electricity customers over 525 days (with 24 hourly readings per day).
Our preliminary analytical results (refer to Section 3.1) reveal that consumer load profiles are affected not only by weather conditions but also by the types of consumers and other factors. However, it is difficult to extract features from the 1D electricity consumption data based on experience, since the electricity consumption of each day fluctuates in a relatively independent way. Motivated by the previous work in [29], we design a deep CNN to process the electricity consumption data in a 2D manner. In particular, we transform the 1D electricity consumption data into 2D data according to days. We define the 2D data as a matrix of actual energy consumption values for a specific customer, where the rows represent days, D (D = 1, ..., 525), and the columns represent time periods, T (T = 1, ..., 24). It is worth mentioning that the CNN learns the features between different hours of the day and different days from the 2D data.
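The 1D-to-2D transformation described above can be sketched as a simple fold of the consumption series into a 525 × 24 matrix (a minimal illustration; `to_matrix` is a hypothetical helper, not code from the paper):

```python
DAYS, HOURS = 525, 24

def to_matrix(series, days=DAYS, hours=HOURS):
    """Fold a consumer's 1D consumption series into a days x hours matrix:
    rows are days D = 1..525, columns are hourly periods T = 1..24."""
    assert len(series) == days * hours
    return [series[d * hours:(d + 1) * hours] for d in range(days)]

matrix = to_matrix(list(range(DAYS * HOURS)))
print(len(matrix), len(matrix[0]))  # -> 525 24
```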
Then we divide the dataset into train and test sets using the cross-validation method, in which 80% form the train set and 20% the test set. Given that nonfraudulent consumers remarkably outnumber electricity theft ones, the imbalanced nature of the dataset can have a major negative impact on the performance of supervised machine learning methods. To reduce this bias, the SMOTE algorithm is used to make the numbers of normal and abnormal samples equal in the train set. Finally, the train dataset contains 7484 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1669 customers.
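SMOTE's core idea, interpolating between a minority-class sample and a near neighbour to synthesize new minority samples, can be sketched in pure Python. This is a minimal illustration, not the library implementation the paper may have used; the `theft` points below are hypothetical two-feature samples, not real load profiles:

```python
import random, math

def smote(minority, n_new, seed=0):
    """Minimal SMOTE sketch: create n_new synthetic minority samples by
    interpolating between a random sample and its nearest neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        # nearest neighbour of a (excluding a itself) by Euclidean distance
        nb = min((p for p in minority if p is not a),
                 key=lambda p: math.dist(a, p))
        gap = rng.random()  # random point on the segment between a and nb
        synthetic.append(tuple(ai + gap * (bi - ai) for ai, bi in zip(a, nb)))
    return synthetic

theft = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.85)]
new = smote(theft, n_new=5)
print(len(theft) + len(new))  # -> 8
```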
5. The Novel CNN-RF Algorithm

In this section, the design of the CNN-RF structure is presented in detail, and some techniques are proposed to reduce overfitting. Moreover, correlative parameters, including the number of filters and the size of each feature map and kernel, are given. Finally, the training process of the proposed CNN-RF algorithm is introduced.
5.1. CNN-RF Architecture. The proposed CNN-RF model is designed by integrating CNN and RF classifiers. As shown in Figure 6, the architecture of the CNN-RF mainly includes an automatic feature extractor and a trainable RF classifier. The feature extractor CNN consists of convolutional layers, downsampling layers, and a fully connected layer. The CNN is a deep supervised learning architecture, which usually includes multiple layers and can be trained with the backpropagation algorithm. It can also explore the intricate distribution in smart meter data by performing the
Figure 5: Four seasons' load profiles for a customer (usage in kW·day over 100 days; curves for spring, summer, autumn, and winter).
stochastic gradient method. The RF classifier consists of a combination of tree classifiers, in which each tree contributes a single vote to the assignment of the most frequent class in the input data [30]. In the following, the design of each part of the CNN-RF structure is presented in detail.
5.1.1. Convolutional Layer. The main purpose of convolutional layers is to learn feature representations of the input data and reduce the influence of noise. The convolutional layer is composed of several feature filters, which are used to compute different feature maps. Specially, to reduce the number of network parameters and lower the complexity of the relevant layers during the generation of feature maps, the weights of each kernel in a feature map are shared. Moreover, each neuron in the convolutional layer connects to a local region of the preceding layer, and a ReLU activation function is then applied to the convolved results. Mathematically, the output of a convolutional layer can be expressed as

$$y_{\mathrm{conv}}(X_l^{f_l}) = \delta\Bigl(\sum_{f_l=1}^{F_l} W_l^{f_l} \ast X_l^{f_l} + b_l^{f_l}\Bigr) \quad (4)$$

where $\delta$ is the activation function, $\ast$ is the convolution operation, and both $W_l^{f_l}$ and $b_l^{f_l}$ are the learnable parameters of the $f$-th feature filter.
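A single-filter instance of equation (4), with ReLU as δ, can be sketched in pure Python; the kernel, bias, and input below are illustrative values, not the model's learned parameters:

```python
def conv2d_relu(x, w, b):
    """Single-filter 'valid' convolution followed by ReLU, i.e. one term
    of equation (4) with delta = ReLU."""
    kh, kw = len(w), len(w[0])
    out = []
    for i in range(len(x) - kh + 1):
        row = []
        for j in range(len(x[0]) - kw + 1):
            s = sum(w[p][q] * x[i + p][j + q]
                    for p in range(kh) for q in range(kw)) + b
            row.append(max(0.0, s))  # ReLU activation
        out.append(row)
    return out

x = [[1, 2, 0, 1],
     [0, 1, 3, 1],
     [2, 1, 0, 0],
     [1, 0, 1, 2]]
w = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # 3x3 diagonal kernel (illustrative)
print(conv2d_relu(x, w, b=-1.0))       # -> [[1.0, 4.0], [1.0, 2.0]]
```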
5.1.2. Downsampling Layer. Downsampling is an important concept of the CNN, usually placed between two convolutional layers. It reduces the number of parameters and achieves dimensionality reduction. In particular, each feature map of the pooling layer is connected to its corresponding feature map of the previous convolutional layer, and the size of the output feature maps is reduced without reducing their number. In the literature, downsampling operations mainly include max-pooling and average-pooling. The max-pooling operation transforms small windows into single values by taking the maximum, as expressed in formula (5), whereas average-pooling returns the average of the activations in a small window. In [31], experiments show that max-pooling performs better than average-pooling; therefore, the max-pooling operation is usually implemented in the downsampling layer:

$$y_{\mathrm{pool}}(X_l^{f_l}) = \max_{m \in M}\{X_l^{m}\} \quad (5)$$

where $M$ is the set of activations in the pooling window and $m$ is the index of an activation in the pooling window.
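A minimal sketch of the 2 × 2, stride-2 max-pooling of equation (5), matching the pooling configuration in Table 1 (the feature map below is illustrative):

```python
def max_pool(x, size=2, stride=2):
    """2x2 max-pooling with stride 2 (equation (5)): each window is
    replaced by its maximum, halving both feature-map dimensions."""
    return [[max(x[i + p][j + q] for p in range(size) for q in range(size))
             for j in range(0, len(x[0]) - size + 1, stride)]
            for i in range(0, len(x) - size + 1, stride)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 1, 5, 6],
        [2, 2, 7, 3]]
print(max_pool(fmap))  # -> [[4, 2], [2, 7]]
```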
5.1.3. Fully Connected Layer. With the features extracted by sequential convolution and pooling, a fully connected layer is applied to flatten the feature maps into one vector as follows:

$$y_{\mathrm{fc}}(X_l) = \delta(W_l \cdot X_l + b_l) \quad (6)$$

where $W_l$ is the weight of the $l$-th layer and $b_l$ is the bias of the $l$-th layer.
5.1.4. RF Classifier Layer. Traditionally, a softmax classifier is used for the last output layer of a CNN, but the proposed model uses the RF classifier to predict the class based on the obtained features. Therefore, the RF classifier layer can be defined as

$$y_{\mathrm{out}}(X_L) = \operatorname{sigm}(W_{\mathrm{rf}} \cdot X_l + b_l) \quad (7)$$

where sigm(·) is the sigmoid function, which maps abnormal values to 0 and normal values to 1. The parameter set $W_{\mathrm{rf}}$ of the RF layer includes the number of decision trees and the maximum depth of each tree, which are obtained by the grid search algorithm.
5.2. Techniques for Selecting Parameters and Avoiding Overfitting. The detailed parameters of the proposed CNN-RF structure are summarized in Table 1, including the number of filters in each layer, the filter sizes, and the strides. Two factors are considered in determining the CNN structure. The first factor is the characteristics of consumers' electricity consumption behaviours: since the load profiles are highly variable, two convolutional layers with a filter size of 3 × 3 (stride 1) are applied to capture hidden features for detecting electricity theft. Moreover, because the input data size is 525 × 24, which is similar to what is used in image recognition problems, two max-pooling layers with a filter size of 2 × 2 (stride 2) are used. The second factor is the number of training samples: since the number of samples is limited, to reduce the risk of overfitting we design a fully connected layer with a dropout rate of 0.4, whose main idea
Figure 6: Architecture of the hybrid CNN-RF model (input data, 525 × 24 → convolutional layer → downsampling layer → convolutional layer → downsampling layer → fully connected layer → random forest layer).
is that dropping units randomly from the neural network during training can prevent units from coadapting too much [32] and makes a neuron not rely on the presence of other specific neurons. Applying an appropriate training method is also useful for reducing overfitting: it is essentially a regularization that adds a penalty to the weight update at each iteration. We also use binary cross entropy as the loss function. Finally, the grid search algorithm is used to optimize RF classifier parameters such as the maximum number of decision trees and features.
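The dropout idea can be sketched as follows. This is the standard "inverted dropout" formulation with the 0.4 rate used in the text, written as a stand-alone illustration rather than the authors' implementation:

```python
import random

def dropout(layer, rate=0.4, training=True, seed=0):
    """Inverted dropout: during training, zero each unit with probability
    `rate` and scale the survivors by 1/(1 - rate) so that the expected
    activation is unchanged; at test time the layer passes through as-is."""
    if not training:
        return list(layer)
    rng = random.Random(seed)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in layer]

acts = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dropout(acts, training=False) == acts)  # -> True
```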
5.3. Training Process of CNN-RF. The CNN-RF structure needs to train the CNN first, before the RF classifier is invoked. After that, the RF classifier is trained based on the features learned by the CNN.
5.3.1. Training Process of CNN. With a batch size (the number of training samples in an iteration) of 100, the CNN is trained using the forward propagation and backpropagation algorithms. As shown in Figure 6, the data in the input layer are first passed through the convolutional layers, the pooling layers, and a fully connected layer, and the predicted value is obtained. As shown in Figure 7, the backpropagation phase then updates the parameters if the difference between the output value and the target value exceeds a certain threshold. The difference between the actual output of the neural network and the expected output determines the adjustment direction and step size of the weights among layers in the network.
Given a sample set $D = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ with $m$ samples in total, $\tilde{y}$ is first obtained using the feed-forward process. Over all samples, the mean difference between the actual output $y_i$ and the expected output $\tilde{y}_{w,b}(x_i)$ can be expressed as

$$J(W,b) = \frac{1}{m}\sum_{i=1}^{m}\frac{1}{2}\bigl\|\tilde{y}_{w,b}(x_i) - y_i\bigr\|^{2} \quad (8)$$

where $w$ represents the connection weights between layers of the network and $b$ is the corresponding bias.
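For the scalar-output case, equation (8) is a small arithmetic sketch (the predictions and targets below are illustrative):

```python
def cost(predictions, targets):
    """Equation (8): mean of the per-sample halved squared errors."""
    m = len(targets)
    return sum(0.5 * (p - y) ** 2 for p, y in zip(predictions, targets)) / m

print(round(cost([0.9, 0.2, 0.8], [1.0, 0.0, 1.0]), 3))  # -> 0.015
```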
Then each parameter is initialized with a random value generated by a normal distribution with mean 0 and variance $\vartheta$, and the parameters are updated with the gradient descent method. The corresponding expressions are as follows:

$$W_{ij}^{(l)} = W_{ij}^{(l)} - \alpha\,\frac{\partial}{\partial W_{ij}^{(l)}} J(W,b) \quad (9)$$

$$b_{i}^{(l)} = b_{i}^{(l)} - \alpha\,\frac{\partial}{\partial b_{i}^{(l)}} J(W,b) \quad (10)$$

where $\alpha$ represents the learning rate, $W_{ij}^{(l)}$ represents the connection weight between the $i$-th neuron in the $l$-th layer and the $j$-th neuron in the $(l+1)$-th layer, and $b_i^{(l)}$ represents the bias of the $i$-th neuron in the $l$-th layer.

Finally, all parameters $W$ and $b$ in the network are updated according to formulas (9) and (10), and all these steps are repeated to reduce the objective function $J(W,b)$.
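The update rules (9) and (10) can be illustrated with a minimal numeric sketch: a hypothetical one-dimensional linear model y = w·x + b, where the analytic gradient of the cost in equation (8) stands in for the network's backpropagated gradients (α = 0.1 is an illustrative learning rate):

```python
def gradient_step(w, b, batch, alpha=0.1):
    """One application of formulas (9)-(10) for y = w*x + b, using the
    analytic gradient of the cost J in equation (8)."""
    m = len(batch)
    dw = sum((w * x + b - y) * x for x, y in batch) / m  # dJ/dw
    db = sum((w * x + b - y) for x, y in batch) / m      # dJ/db
    return w - alpha * dw, b - alpha * db

# fit y = 2x + 1 from four noiseless samples
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = 0.0, 0.0
for _ in range(2000):
    w, b = gradient_step(w, b, data)
print(round(w, 3), round(b, 3))  # -> 2.0 1.0
```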
5.3.2. Training Process of RF Classifier. The RF classifier takes advantage of two powerful machine learning techniques: bagging and random feature selection. Bagging is a method that generates a particular bootstrap sample randomly from the learnt features for growing each tree. Instead of using all features, the RF randomly selects a subset of features to determine the best splitting points, which ensures complete splitting to one classification at the leaf nodes of the decision trees. Specifically, the whole process of RF classification can be written as Algorithm 1.
6. Experiments and Result Analysis

All the experiments are implemented using Python 3.6 on a standard PC with an Intel Core i5-7500MQ CPU running at 3.40 GHz and with 8.0 GB of RAM. The CNN architecture is constructed based on TensorFlow [33], and the interface between the CNN and the RF is programmed using scikit-learn [34].
6.1. Performance Metrics. In this paper, the problem of electricity theft detection is considered a discrete two-class classification task, which should assign each consumer to one of the predefined classes (abnormal or normal).

In particular, the result of classifier validation is often presented as a confusion matrix. Table 2 shows the confusion matrix applied in electricity theft detection, where TP, FN, FP, and TN respectively represent the numbers of consumers that are classified correctly as normal, falsely as abnormal, falsely as normal, and correctly as abnormal. Although these four indices contain all of the information about the performance of the classifier, more meaningful performance measures can be extracted from them to illustrate certain performance criteria, such as TPR = TP/(TP + FN), FPR = FP/(FP + TN), precision = TP/(TP + FP), recall = TP/(TP + FN), and F1 score = 2 × (precision × recall)/(precision + recall), where the true positive rate (TPR), also referred to as sensitivity, is the ability of the classifier to correctly identify a consumer as normal, and the false positive rate (FPR) is the risk of positively identifying an abnormal consumer. Precision is the ratio of consumers detected correctly to the total number of detected normal consumers. Recall is the ratio of the number of consumers correctly detected to the actual number of such consumers.
In addition, the ROC curve is introduced by plotting all TPR values on the y-axis against their equivalent FPR values for all available thresholds on the x-axis. A ROC curve toward the upper left indicates better detection, where lower FPR values are achieved at the same TPR. The AUC indicates the
Table 1: Parameters of the proposed CNN-RF model.

Layer          Size of the filter   Number of filters   Stride
Convolution1   3 × 3                32                  1
MaxPooling1    2 × 2                —                   2
Convolution2   3 × 3                64                  1
MaxPooling2    2 × 2                —                   2
quality of the classifier, and a larger AUC represents better performance.
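The measures above follow directly from the four confusion-matrix counts of Table 2; as an illustration, the counts plugged in below are those reported for CNN-RF in Section 6.2 (TP = 1041, FN = 28, FP = 23, TN = 577):

```python
def classification_metrics(tp, fn, fp, tn):
    """Derive the scalar criteria of Section 6.1 from the four
    confusion-matrix counts of Table 2."""
    recall = tp / (tp + fn)            # also the TPR (sensitivity)
    fpr = fp / (fp + tn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"TPR": recall, "FPR": fpr, "precision": precision,
            "recall": recall, "F1": f1}

m = classification_metrics(tp=1041, fn=28, fp=23, tn=577)
print(round(m["precision"], 2), round(m["recall"], 2), round(m["F1"], 2))
# -> 0.98 0.97 0.98
```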
6.2. Experiment Based on the CNN-RF Model. The experiment is conducted based on the normalized consumption data. With a batch size of 100, the training procedure is stopped after 300 epochs. As shown in Figure 8, the training and testing losses settle down smoothly, which means that the proposed model learned the train dataset well, with small bias on the test dataset. The RF classifier then serves as the last layer of the CNN to predict the labels of input consumers. Forty values from the fully connected layer of the trained CNN are used as a new feature vector to detect abnormal users and are fed to the RF for learning and testing. To build the RF in the hybrid model, the optimal parameters are determined by applying the grid search algorithm on the training dataset. The grid searching range of each parameter is given as follows: numTrees ∈ {50, 60, ..., 100} and maxDepth ∈ {10, 20, 30, 40}; 6 × 4 = 24 are tried with
Require:
(1) Original training dataset S = {(x_i, y_i), i = 1, 2, ..., m}
(2) Test samples x_j
for d = 1 to N_tree do
  (1) Draw a bootstrap sample S_d from the original training data
  (2) Grow an unpruned tree h_d using the data S_d:
      (a) Randomly select a new feature set M_try from the original feature set e
      (b) Select the best feature from M_try based on the Gini indicator at each node
      (c) Split until each tree grows to its maximum
end for
Ensure:
(1) The collection of trees {h_d, d = 1, 2, ..., N_tree}
(2) For x_j, the output of each decision tree is h_d(x_j); the final prediction is f(x_j) = majority vote {h_d(x_j)} over d = 1, ..., N_tree
return f(x_j)

Algorithm 1: Random forest algorithm.
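Algorithm 1 can be sketched in miniature with one-level trees (decision stumps) standing in for full decision trees and accuracy standing in for the Gini indicator; the toy data below are illustrative, not smart meter features:

```python
import random

def train_stump(samples, feats):
    """Best one-level split over the candidate features (step 2(b)),
    using accuracy as a simple stand-in for the Gini indicator."""
    best, best_acc = None, -1.0
    for f in feats:
        for x0, _ in samples:
            t = x0[f]
            above = [y for x, y in samples if x[f] > t]
            below = [y for x, y in samples if x[f] <= t]
            for la in (0, 1):       # label assigned to the "above" side
                lb = 1 - la
                acc = (sum(y == la for y in above) +
                       sum(y == lb for y in below)) / len(samples)
                if acc > best_acc:
                    best, best_acc = (f, t, la, lb), acc
    return best

def random_forest(samples, n_trees=25, seed=0):
    """Algorithm 1 in miniature: bootstrap sampling (step 1) plus random
    feature selection (step 2(a)), with a majority vote at prediction time."""
    rng = random.Random(seed)
    n_feats = len(samples[0][0])
    trees = []
    for _ in range(n_trees):
        boot = [rng.choice(samples) for _ in samples]   # bootstrap S_d
        feats = rng.sample(range(n_feats), 2)           # random subset M_try
        trees.append(train_stump(boot, feats))
    def predict(x):
        votes = sum(la if x[f] > t else lb for f, t, la, lb in trees)
        return int(votes > n_trees / 2)                 # majority vote
    return predict

# toy consumers: the first two features separate class 0 from class 1
data = [((0.1, 1.0, 3.0), 0), ((0.2, 2.0, 0.5), 0), ((0.3, 3.0, 2.0), 0),
        ((0.8, 8.0, 1.0), 1), ((0.9, 9.0, 4.0), 1), ((0.7, 7.0, 2.5), 1)]
predict = random_forest(data)
print(predict((0.1, 1.0, 3.0)), predict((0.9, 9.0, 4.0)))
```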
Figure 7: Process of forward propagation (a) and backpropagation (b).
Figure 8: Loss curves of training and testing over 300 epochs.
Table 2: Confusion matrix applied in electricity theft detection.

Actual\Detected   Normal                Abnormal
Normal            TP (true positive)    FN (false negative)
Abnormal          FP (false positive)   TN (true negative)
different combinations. The best result is achieved when numTrees = 100 and maxDepth = 30. These parameters are then used to train the hybrid model.
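The search over the 24 parameter combinations can be sketched as follows; `validation_score` is a hypothetical stand-in for the cross-validated accuracy that the real grid search would evaluate at each setting:

```python
from itertools import product

def validation_score(num_trees, max_depth):
    """Toy stand-in for the model's validation accuracy at each setting,
    shaped so that (100, 30) is the best combination, as in the text."""
    return num_trees / 100 - abs(max_depth - 30) / 100

num_trees_grid = range(50, 101, 10)   # 50, 60, ..., 100  (6 values)
max_depth_grid = range(10, 41, 10)    # 10, 20, 30, 40    (4 values)

grid = list(product(num_trees_grid, max_depth_grid))
best = max(grid, key=lambda p: validation_score(*p))
print(len(grid), best)  # -> 24 (100, 30)
```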
The classification results of the proposed detection model are as follows: the numbers of TPs, FNs, FPs, and TNs are 1041, 28, 23, and 577, respectively. Therefore, the precision, recall, and F1 score can be calculated as shown in Table 3, where Class 0 is the abnormal pattern and Class 1 is the normal pattern. The AUC value of the CNN-RF algorithm is also calculated; it is 0.98, far better than that of the baseline model (AUC = 0.5). This means the proposed algorithm is able to classify both classes accurately, as shown in Figure 9.
6.3. Comparative Experiments and the Analysis of Results. To evaluate the accuracy of the CNN-RF model, comparative experiments are conducted based on handcrafted features with nondeep learning methods, including SVM, RF, gradient boosting decision tree (GBDT), and logistic regression (LR). Moreover, we compared the classification results obtained by various supervised classifiers, namely CNN features with the SVM classifier (CNN-SVM) and CNN features with the GBDT classifier (CNN-GBDT), to those obtained in the previous classification work. In the following, the five methods are introduced, and then the results are analysed.
(1) Logistic regression: a basic model for binary classification, equivalent to a one-layer neural network with a sigmoid activation function. The sigmoid function obtains a value ranging from 0 to 1 based on linear regression; any value larger than 0.5 is classified as a normal pattern, and any value less than 0.5 as an abnormal pattern.
(2) Support vector machine: this classifier with kernel functions can transform a nonlinearly separable problem into a linearly separable one by projecting the data into a feature space and then finding the optimal separating hyperplane. It is applied to predict the patterns of consumers based on the aforementioned handcrafted features.
(3) Random forest: essentially an ensemble of multiple decision trees that achieves better performance, with effective control of overfitting, in comparison with a single decision tree. The RF classifier can also handle high-dimensional data while maintaining high computational efficiency.
(4) Gradient boosting decision tree: an iterative decision tree algorithm consisting of multiple decision trees. For the final output, all results or weights are accumulated in the GBDT, whereas random forests use majority voting.
(5) Deep learning methods: a plain CNN with softmax in the last layer, together with CNN-GBDT and CNN-SVM, is compared with the proposed method.
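A minimal sketch of how the four non-deep baselines (1)–(4) could be configured with scikit-learn [34]. The hyperparameter values follow Table 4 as we read it (the decimal points in the extracted table are uncertain); the synthetic two-blob data and everything else are our assumptions, not the paper's setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

rng = np.random.default_rng(0)
# Stand-in for the handcrafted 1D feature vectors: two shifted Gaussian blobs.
X = np.vstack([rng.normal(0.0, 1.0, (200, 8)), rng.normal(2.0, 1.0, (200, 8))])
y = np.array([0] * 200 + [1] * 200)          # 0 = abnormal, 1 = normal

baselines = {
    "LR":   LogisticRegression(penalty="l2", C=1.0),
    "SVM":  SVC(kernel="rbf", C=1.05, gamma=0.0005),
    "GBDT": GradientBoostingClassifier(n_estimators=200),
    "RF":   RandomForestClassifier(n_estimators=100, random_state=0),
}
accs = {}
for name, model in baselines.items():
    accs[name] = model.fit(X, y).score(X, y)  # training accuracy, illustration only
print(accs)
```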
Table 4 summarizes the parameters of the comparative methods. Figure 10 shows the results of the different methods and of the proposed hybrid CNN-RF method: the AUC values of CNN-RF, CNN-GBDT, CNN-SVM, CNN, SVM, RF, LR, and GBDT are 0.99, 0.97, 0.98, 0.93, 0.77, 0.91, 0.63, and 0.77, respectively. Figure 11 presents the results of all comparative experiments in terms of precision, recall, and F1 score. Among the eight electricity theft detection algorithms, the deep learning methods (CNN, CNN-RF, CNN-SVM, and CNN-GBDT) show better performance than the machine learning methods (LR, SVM, GBDT, and RF). The reason is that the CNN can learn features from a large amount of electricity consumption data, so the accuracy of electricity theft detection is greatly improved. The results also indicate that the proposed CNN-RF model performs best among all compared methods.
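The ROC curves and AUC values reported in Figure 10 are standard scikit-learn computations; a sketch on toy labels and scores (not the paper's data):

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Toy ground-truth labels and classifier scores standing in for model outputs.
y_true  = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.35, 0.4, 0.7, 0.65, 0.72, 0.8, 0.9]

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # points of the ROC curve
auc = roc_auc_score(y_true, y_score)                # area under that curve
# One positive (0.65) is ranked below one negative (0.7),
# so 15 of the 16 positive/negative pairs are ordered correctly: AUC = 15/16.
print(f"AUC = {auc:.4f}")
```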
Table 3: Classification score summary for CNN-RF

                 Precision   Recall   F1 score
Class 0          0.97        0.96     0.96
Class 1          0.98        0.98     0.98
Average/total    0.97        0.97     0.97
[Figure 9: ROC curve for CNN-RF (FPR on the x-axis, TPR on the y-axis); AUC (CNN-RF) = 0.99.]
Table 4: Experimental parameters for comparative methods

Method   Input data      Parameters
LR       Features (1D)   Penalty: L2; inverse of regularization strength: 1.0
SVM      Features (1D)   Penalty parameter of the error term: 1.05; RBF kernel parameter: 0.0005
GBDT     Features (1D)   Number of estimators: 200
RF       Features (1D)   Number of trees in the forest: 100; max depth of each tree: 30
CNN      Raw data (2D)   The same as the CNN component of the proposed approach
Journal of Electrical and Computer Engineering 9
As detecting electricity theft consumers is achieved by discovering the anomalous consumption behaviours of suspicious customers, it can essentially be regarded as an anomaly detection problem. Since CNN-RF has shown superior performance on the SEAI dataset, it is interesting to test its generalization ability on other datasets. In particular, we have further tested the performance of CNN-RF, together with the above-mentioned models, on another dataset: the Low Carbon London (LCL) dataset [35]. The LCL dataset is composed of over 5500 honest electricity consumers in total, each of which consists of 525 days of hourly readings. Among the 5500 customers, we randomly chose 1200 as malicious customers and modified their intraday load profiles according to fraudulent sample generation methods similar to those used for the SEAI dataset.
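The tampering functions themselves are those introduced earlier for the SEAI dataset and are not restated here. Purely as an illustration, one commonly assumed falsification multiplies each hourly reading by a random under-reporting factor below 1; the function name and the factor range in this sketch are our assumptions, not the paper's:

```python
import numpy as np

def make_malicious(profile, rng, low=0.1, high=0.8):
    """Tamper an intraday load profile by scaling each hourly reading
    with a random factor in [low, high), mimicking under-reporting."""
    factors = rng.uniform(low, high, size=len(profile))
    return profile * factors

rng = np.random.default_rng(42)
honest = rng.uniform(0.2, 2.0, size=24)   # one day of hourly kWh readings
stolen = make_malicious(honest, rng)
print(stolen.sum(), "<", honest.sum())    # the tampered total is always lower
```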
After preprocessing the LCL dataset (similarly to the SEAI dataset), the training dataset contains 7584 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1700 customers together with corresponding labels indicating anomaly or not. The training of the proposed CNN-RF model and of the comparative models then follows the aforementioned procedure.
[Figure 11: Comparison with other algorithms based on the SEAI dataset: precision, recall, and F1 score for CNN, CNN-RF, CNN-GBDT, CNN-SVM, GBDT, LR, RF, and SVM.]
[Figure 10: ROC curves for eight different algorithms based on the SEAI dataset. AUC: CNN-RF = 0.99, CNN-SVM = 0.98, CNN-GBDT = 0.97, CNN = 0.93, RF = 0.91, GBDT = 0.77, SVM = 0.77, LR = 0.63.]
[Figure 12: ROC curves for eight different algorithms based on the LCL dataset. AUC: CNN-RF = 0.97, CNN-SVM = 0.96, CNN = 0.95, CNN-GBDT = 0.94, RF = 0.77, GBDT = 0.74, SVM = 0.71, LR = 0.68.]
[Figure 13: Comparison with other algorithms based on the LCL dataset: precision, recall, and F1 score for the eight algorithms.]
All models are trained on the training dataset, and their performances on the test dataset are shown in Figures 12 and 13.

Figures 12 and 13 show that the performances of CNN-SVM, CNN-GBDT, and CNN-RF are better than those of the other models on the whole. Since they are deep learning models, distinguishable features can be learned directly from raw smart meter data by the CNN, which equips them with better generalization ability. In addition, the CNN-RF model achieves the best performance, since the RF can handle high-dimensional data while maintaining high computational efficiency.
7. Conclusions
In this paper, a novel CNN-RF model is presented to detect electricity theft. In this model, the CNN acts as an automatic feature extractor for smart meter data, and the RF is the output classifier. Because a large number of parameters must be optimized, which increases the risk of overfitting, a fully connected layer with a dropout rate of 0.4 is designed for the training phase. In addition, the SMOTE algorithm is adopted to overcome the problem of data imbalance. Several machine learning and deep learning methods, such as SVM, RF, GBDT, and LR, are applied to the same problem as benchmarks, and all of these methods have been evaluated on the SEAI and LCL datasets. The results indicate that the proposed CNN-RF model is a promising classification method in the electricity theft detection field because of two properties. The first is that features can be automatically extracted by the hybrid model, while the success of most other traditional classifiers relies largely on the retrieval of good hand-designed features, which is a laborious and time-consuming task. The second is that the hybrid model combines the advantages of the RF and the CNN, as both are among the most popular and successful classifiers in the electricity theft detection field.
Since the detection of electricity theft affects the privacy of consumers, future work will focus on investigating how the granularity and duration of smart meter data might affect this privacy. Extending the proposed hybrid CNN-RF model to other applications (e.g., load forecasting) is also a task worth investigating.
Data Availability
Previously reported SEAI and LCL datasets were used to support this study and are available at http://www.ucd.ie/issda/data and https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7857. These prior studies (and datasets) are cited at relevant places within the text as references [19, 26].
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Acknowledgments
This work was supported by the National Key Research and Development Program of China (2016YFB0901900), the Natural Science Foundation of Hebei Province of China (F2017501107), the Open Research Fund from the State Key Laboratory of Rolling and Automation, Northeastern University (2017RALKFKT003), the Foundation of Northeastern University at Qinhuangdao (XNB201803), and the Fundamental Research Funds for the Central Universities (N182303037).
References
[1] S S S R Depuru L Wang and V Devabhaktuni ldquoElectricitytheft overview issues prevention and a smart meter basedapproach to control theftrdquo Energy Policy vol 39 no 2pp 1007ndash1015 2011
[2] J P Navani N K Sharma and S Sapra ldquoTechnical and non-technical losses in power system and its economic conse-quence in Indian economyrdquo International Journal of Elec-tronics and Computer Science Engineering vol 1 no 2pp 757ndash761 2012
[3] S McLaughlin B Holbert A Fawaz R Berthier andS Zonouz ldquoA multi-sensor energy theft detection frameworkfor advanced metering infrastructuresrdquo IEEE Journal on Se-lected Areas in Communications vol 31 no 7 pp 1319ndash13302013
[4] P McDaniel and S McLaughlin ldquoSecurity and privacychallenges in the smart gridrdquo IEEE Security amp PrivacyMagazine vol 7 no 3 pp 75ndash77 2009
[5] T B Smith ldquoElectricity theft a comparative analysisrdquo EnergyPolicy vol 32 no 1 pp 2067ndash2076 2004
[6] J I Guerrero C Leon I Monedero F Biscarri andJ Biscarri ldquoImproving knowledge-based systems with sta-tistical techniques text mining and neural networks for non-technical loss detectionrdquo Knowledge-Based Systems vol 71no 4 pp 376ndash388 2014
[7] C C O Ramos A N Souza G Chiachia A X Falcatildeo andJ P Papa ldquoA novel algorithm for feature selection usingharmony search and its application for non-technical lossesdetectionrdquo Computers amp Electrical Engineering vol 37 no 6pp 886ndash894 2011
[8] P Glauner J A Meira P Valtchev R State and F Bettingerldquo-e challenge of non-technical loss detection using artificialintelligence a surveyficial intelligence a surveyrdquo In-ternational Journal of Computational Intelligence Systemsvol 10 no 1 pp 760ndash775 2017
[9] S-C Huang Y-L Lo and C-N Lu ldquoNon-technical lossdetection using state estimation and analysis of variancerdquoIEEE Transactions on Power Systems vol 28 no 3pp 2959ndash2966 2013
[10] O Rahmati H R Pourghasemi and A M Melesse ldquoAp-plication of GIS-based data driven random forest and max-imum entropy models for groundwater potential mapping acase study at Mehran region Iranrdquo CATENA vol 137pp 360ndash372 2016
[11] N Edison A C Aranha and J Coelho ldquoProbabilisticmethodology for technical and non-technical losses estima-tion in distribution systemrdquo Electric Power Systems Researchvol 97 no 11 pp 93ndash99 2013
[12] J B Leite and J R S Mantovani ldquoDetecting and locating non-technical losses in modern distribution networksrdquo IEEETransactions on Smart Grid vol 9 no 2 pp 1023ndash1032 2018
[13] S Amin G A Schwartz A A Cardenas and S S SastryldquoGame-theoretic models of electricity theft detection in smartutility networks providing new capabilities with advanced
Journal of Electrical and Computer Engineering 11
metering infrastructurerdquo IEEE Control Systems Magazinevol 35 no 1 pp 66ndash81 2015
[14] A H Nizar Z Y Dong Y Wang and A N Souza ldquoPowerutility nontechnical loss analysis with extreme learning ma-chine methodrdquo IEEE Transactions on Power Systems vol 23no 3 pp 946ndash955 2008
[15] J Nagi K S Yap S K Tiong S K Ahmed andMMohamadldquoNontechnical loss detection for metered customers in powerutility using support vector machinesrdquo IEEE Transactions onPower Delivery vol 25 no 2 pp 1162ndash1171 2010
[16] E W S Angelos O R Saavedra O A C Cortes andA N de Souza ldquoDetection and identification of abnormalitiesin customer consumptions in power distribution systemsrdquoIEEE Transactions on Power Delivery vol 26 no 4pp 2436ndash2442 2011
[17] K A P Costa L A M Pereira R Y M NakamuraC R Pereira J P Papa and A Xavier Falcatildeo ldquoA nature-inspired approach to speed up optimum-path forest clusteringand its application to intrusion detection in computer net-worksrdquo Information Sciences vol 294 pp 95ndash108 2015
[18] R R Bhat R D Trevizan X Li and A Bretas ldquoIdentifyingnontechnical power loss via spatial and temporal deeplearningrdquo in Proceedings of the IEEE International Conferenceon Machine Learning and Applications Anaheim CA USADecember 2016
[19] M Ismail M Shahin M F Shaaban E Serpedin andK Qaraqe ldquoEfficient detection of electricity theft cyber attacksin ami networksrdquo in Proceedings of the IEEE Wireless Com-munications and Networking Conference Barcelona SpainApril 2018
[20] RMehrizi X Peng X Xu S Zhang DMetaxas and K Li ldquoAcomputer vision based method for 3D posture estimation ofsymmetrical liftingrdquo Journal of Biomechanics vol 69 no 1pp 40ndash46 2018
[21] K He X Zhang S Ren and J Sun ldquoDelving deep intorectifiers surpassing human-level performance on imagenetclassificationrdquo in Proceedings of the IEEE Conference onComputer for Sustainable Global Development pp 1026ndash1034New Delhi India March 2015
[22] A Sharif Razavian H Azizpour J Sullivan and S CarlssonldquoCNN features off-the-shelf an astounding baseline for rec-ognitionrdquo Astrophysical Journal Supplement Series vol 221no 1 pp 806ndash813 2014
[23] Z Zheng Y Yang X Niu H-N Dai and Y Zhou ldquoWide anddeep convolutional neural networks for electricity-theft de-tection to secure smart gridsrdquo IEEE Transactions on IndustrialInformatics vol 14 no 4 pp 1606ndash1615 2018
[24] D Svozil V Kvasnicka and J Pospichal ldquoIntroduction tomulti-layer feed-forward neural networksrdquo Chemometricsand Intelligent Laboratory Systems vol 39 no 1 pp 43ndash621997
[25] Irish social science data archive httpwwwucdieissdadatacommissionforenegryregulationcer
[26] P Jokar N Arianpoo and V C M Leung ldquoElectricity theftdetection in AMI using customersrsquo consumption patternsrdquoIEEE Transactions on Smart Grid vol 7 no 1 pp 216ndash2262016
[27] Y Wang Q Chen D Gan J Yang D S Kirschen andC Kang ldquoDeep learning-based socio-demographic in-formation identification from smart meter datafication fromsmart meter datardquo IEEE Transactions on Smart Grid vol 10no 3 pp 2593ndash2602 2019
[28] V Chandola A Banerjee and V Kumar ldquoAnomaly de-tection a surveyrdquo ACM Computing Surveys vol 41 no 32009
[29] H-T Cheng L Koc J Harmsen et al ldquoWide amp deep learningfor recommender systemsrdquo in Proceedings of the 1stWorkshopon Deep Learning for Recommender SystemsmdashDLRS 2016pp 7ndash10 Boston MA USA September 2016
[30] E Feczko N M Balba O Miranda-Dominguez et alldquoSubtyping cognitive profiles in autism spectrum disorderusing a functional random forest algorithmfiles in autismspectrum disorder using a functional random forest algo-rithmrdquo NeuroImage vol 172 pp 674ndash688 2018
[31] Y-L Boureau J Ponce and Y LeCun ldquoA theoretical analysisof feature pooling in visual recognitionrdquo in Proceedings of the27th International Conference on Machine Learningpp 111ndash118 Haifa Israel June 2010
[32] N Srivastava G Hinton A Krizhevsky I Sutskever andR Salakhutdinov ldquoDropout a simply way to prevent neuralnetworks from overfittingrdquo Ce Journal of Machine LearningResearch vol 15 no 1 pp 1929ndash1958 2014
[33] M Abadi A Agarwal P Barham et al ldquoTensorflow Large-scale machine learning on heterogeneous distributed sys-temsrdquo 2016 httpsarxivorgabs16030446
[34] F Pedregosa G Varoquaux A Gramfort et al ldquoScikit-learnmachine learning in pythonrdquo Journal of Machine LearningResearch vol 12 no 4 pp 2825ndash2830 2011
[35] J R Schofield ldquoLow carbon london project data from thedynamic time-of-use electricity pricing trialrdquo 2017 httpsdiscoverukdataserviceacukcataloguesn7857
12 Journal of Electrical and Computer Engineering
stochastic gradient method. The RF classifier consists of a combination of tree classifiers, in which each tree contributes a single vote to the assignment of the most frequent class to the input data [30]. In the following, the design of each part of the CNN-RF structure is presented in detail.
5.1.1. Convolutional Layer. The main purpose of convolutional layers is to learn feature representations of the input data and reduce the influence of noise. A convolutional layer is composed of several feature filters, which are used to compute different feature maps. In particular, to reduce the number of network parameters and lower the complexity of the relevant layers when generating feature maps, the weights of each kernel within a feature map are shared. Moreover, each neuron in the convolutional layer connects to a local region of the previous layer, and a ReLU activation function is then applied to the convolved results. Mathematically, the output of a convolutional layer can be expressed as
y_{\mathrm{conv}}\left(X_l^{f_l}\right) = \delta\left(\sum_{f_l=1}^{F_l} W_l^{f_l} \ast X_l^{f_l} + b_l^{f_l}\right), \qquad (4)

where \delta is the activation function, \ast is the convolution operation, and W_l^{f_l} and b_l^{f_l} are the learnable parameters of the f_l-th feature filter.
5.1.2. Downsampling Layer. Downsampling is an important concept of the CNN, usually placed between two convolutional layers. It reduces the number of parameters and achieves dimensionality reduction. In particular, each feature map of the pooling layer is connected to its corresponding feature map of the previous convolutional layer, and the size of the output feature maps is reduced without reducing their number. In the literature, downsampling operations mainly include max-pooling and average-pooling. The max-pooling operation transforms small windows into single values by taking the maximum, as expressed in equation (5), whereas average-pooling returns the average value of the activations in a small window. In [31], experiments show that max-pooling performs better than average-pooling; therefore, the max-pooling operation is usually implemented in the downsampling layer:
y_{\mathrm{pool}}\left(X_l^{f_l}\right) = \max_{m \in M} \left\{X_{l,m}\right\}, \qquad (5)

where M is the set of activations in the pooling window and m is the index of an activation in the pooling window.
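Equation (5) corresponds to the 2 × 2, stride-2 max-pooling used in the proposed network; a NumPy sketch on a toy 4 × 4 feature map:

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    """Max-pooling of eq. (5): each output is the maximum over a
    size x size window, moved with the given stride."""
    oh = (x.shape[0] - size) // stride + 1
    ow = (x.shape[1] - size) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            win = x[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = win.max()
    return out

x = np.array([[1., 3., 2., 0.],
              [4., 2., 1., 1.],
              [0., 1., 5., 2.],
              [2., 2., 3., 6.]])
pooled = max_pool(x)
print(pooled)   # [[4. 2.] [2. 6.]]
```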
5.1.3. Fully Connected Layer. With the features extracted by sequential convolution and pooling, a fully connected layer is applied to flatten the feature maps into one vector as follows:

y_{\mathrm{fl}}\left(X_l\right) = \delta\left(W_l \cdot X_l + b_l\right), \qquad (6)

where W_l is the weight of the l-th layer and b_l is the bias of the l-th layer.
5.1.4. RF Classifier Layer. Traditionally, the softmax classifier is used as the last output layer of a CNN, but the proposed model uses the RF classifier to predict the class based on the obtained features. Therefore, the RF classifier layer can be defined as

y_{\mathrm{out}}\left(X_L\right) = \mathrm{sigm}\left(W_{\mathrm{rf}} \cdot X_l + b_l\right), \qquad (7)

where sigm(·) is the sigmoid function, which maps abnormal values to 0 and normal values to 1. The parameter set W_rf of the RF layer includes the number of decision trees and the maximum depth of each tree, which are obtained by the grid search algorithm.
5.2. Techniques for Selecting Parameters and Avoiding Overfitting. The detailed parameters of the proposed CNN-RF structure are summarized in Table 1, including the number of filters in each layer, the filter sizes, and the strides. Two factors are considered in determining the CNN structure. The first factor is the characteristics of consumers' electricity consumption behaviours: since the load profiles are highly variable, two convolutional layers with a filter size of 3 × 3 (stride 1) are applied to capture hidden features for detecting electricity theft. Moreover, because the input data size is 525 × 24, which is similar to what is used in image recognition problems, two max-pooling layers with a filter size of 2 × 2 (stride 2) are used. The second factor is the number of training samples: since the number of samples is limited, to reduce the risk of overfitting we design a fully connected layer with a dropout rate of 0.4, whose main idea
[Figure 6: Architecture of the hybrid CNN-RF model: input data (L1: 525 × 24), two convolutional layers and two downsampling layers (L2–L5), a fully connected layer (L6: 132 × 1), and the random forest layer.]
is that randomly dropping units from the neural network during training can prevent units from co-adapting too much [32] and makes a neuron not rely on the presence of other specific neurons. Applying an appropriate training method is also useful for reducing overfitting; it essentially acts as a regularization that adds a penalty to the weight update at each iteration. We also use binary cross entropy as the loss function. Finally, the grid search algorithm is used to optimize the RF classifier parameters, such as the maximum number of decision trees and features.
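The dropout rate of 0.4 can be sketched as the usual "inverted dropout" mask applied at training time, where survivors are rescaled by 1/(1 − p) so the expected activation is unchanged; this is a generic illustration, not code from the paper:

```python
import numpy as np

def dropout(activations, p=0.4, rng=None, training=True):
    """Inverted dropout: zero each unit with probability p during training
    and rescale survivors by 1/(1 - p); at test time it is the identity."""
    if not training:
        return activations
    rng = rng or np.random.default_rng()
    mask = (rng.uniform(size=activations.shape) >= p) / (1.0 - p)
    return activations * mask

rng = np.random.default_rng(0)
h = np.ones(10_000)                 # a layer of unit activations
out = dropout(h, p=0.4, rng=rng)
print(round(out.mean(), 3))         # close to 1.0: the expectation is preserved
```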
5.3. Training Process of CNN-RF. The CNN-RF structure trains the CNN first, before the RF classifier is invoked. After that, the RF classifier is trained based on the features learned by the CNN.
5.3.1. Training Process of CNN. With a batch size (the number of training samples in an iteration) of 100, the CNN is trained using the forward propagation and backpropagation algorithms. As shown in Figure 6, the data in the input layer are first passed through the convolutional layers, pooling layers, and a fully connected layer, and the predicted value is obtained. As shown in Figure 7, the backpropagation phase then updates the parameters if the difference between the output value and the target value exceeds a certain threshold. The difference between the actual output of the neural network and the expected output determines the adjustment direction and step size of the weights among the layers of the network.
Given a sample set D = \{(x_1, y_1), \ldots, (x_m, y_m)\} with m samples in total, \tilde{y} is first obtained by the feedforward process. Over all samples, the mean difference between the actual output y_i and the expected output \tilde{y}_{w,b}(x_i) can be expressed by

J(W, b) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left\| \tilde{y}_{w,b}\left(x_i\right) - y_i \right\|^2, \qquad (8)

where w represents the connection weights between the layers of the network and b is the corresponding bias.
Then each parameter is initialized with a random value drawn from a normal distribution with mean 0 and variance ϑ, and the parameters are updated with the gradient descent method. The corresponding expressions are as follows:

W_{ij}^{(l)} = W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W, b), \qquad (9)

b_{i}^{(l)} = b_{i}^{(l)} - \alpha \frac{\partial}{\partial b_{i}^{(l)}} J(W, b), \qquad (10)

where \alpha represents the learning rate, W_{ij}^{(l)} represents the connection weight between the i-th neuron in the l-th layer and the j-th neuron in the (l + 1)-th layer, and b_i^{(l)} represents the bias of the i-th neuron in the l-th layer.

Finally, all parameters W and b in the network are updated according to formulas (9) and (10), and these steps are repeated to reduce the objective function J(W, b).
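Formulas (8)–(10) amount to plain gradient descent on a squared-error cost. A self-contained NumPy sketch for a one-layer sigmoid network, as a toy stand-in for the full CNN update (the data, learning rate, and iteration count are our choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 3))                    # m = 64 samples, 3 inputs
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)

W = rng.normal(0.0, 0.1, size=3)                # init ~ N(0, theta)
b = 0.0
alpha = 0.5                                     # learning rate

def cost(W, b):                                 # J(W, b) of eq. (8)
    err = sigmoid(X @ W + b) - y
    return np.mean(0.5 * err ** 2)

losses = [cost(W, b)]
for _ in range(200):
    p = sigmoid(X @ W + b)
    grad_z = (p - y) * p * (1 - p) / len(y)     # dJ/dz via the chain rule
    W -= alpha * X.T @ grad_z                   # weight update, eq. (9)
    b -= alpha * grad_z.sum()                   # bias update, eq. (10)
    losses.append(cost(W, b))

print(losses[0], "->", losses[-1])              # the cost decreases
```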
5.3.2. Training Process of RF Classifier. The RF classifier takes advantage of two powerful machine learning techniques: bagging and random feature selection. Bagging randomly generates a particular bootstrap sample from the learnt features for growing each tree. Instead of using all features, the RF randomly selects a subset of features to determine the best splitting points, which ensures complete splitting down to single-class leaf nodes of the decision trees. The whole process of RF classification is written as Algorithm 1.
6. Experiments and Result Analysis

All the experiments are implemented using Python 3.6 on a standard PC with an Intel Core i5-7500MQ CPU running at 3.40 GHz and with 8.0 GB of RAM. The CNN architecture is constructed based on TensorFlow [33], and the interface between the CNN and the RF is programmed using scikit-learn [34].
6.1. Performance Metrics. In this paper, the problem of electricity theft is considered a discrete two-class classification task, which assigns each consumer to one of the predefined classes (abnormal or normal).

In particular, classifier validation results are often presented as confusion matrices. Table 2 shows a confusion matrix applied in electricity theft detection, where TP, FN, FP, and TN respectively represent the numbers of consumers that are classified correctly as normal, falsely as abnormal, falsely as normal, and correctly as abnormal. Although these four indices contain all of the information about the performance of the classifier, more meaningful performance measures can be extracted from them to illustrate certain performance criteria, such as TPR = TP/(TP + FN), FPR = FP/(FP + TN), precision = TP/(TP + FP), recall = TP/(TP + FN), and F1 score = 2 × (precision × recall)/(precision + recall), where the true positive rate (TPR) is the ability of the classifier to correctly identify a consumer as normal, also referred to as sensitivity, and the false positive rate (FPR) is the risk of positively identifying an abnormal consumer. Precision is the ratio of consumers detected correctly to the total number of detected normal consumers. Recall is the ratio of the number of consumers correctly detected to the actual number of normal consumers (Figure 7).

In addition, the ROC curve is obtained by plotting the TPR values on the y-axis against the corresponding FPR values on the x-axis for all available thresholds. A ROC curve closer to the upper left indicates better detection, as lower FPR values are achieved at the same TPR. The AUC indicates the
Table 1: Parameters of the proposed CNN-RF model

Layer          Size of the filter   Number of filters   Stride
Convolution1   3 × 3                32                  1
MaxPooling1    2 × 2                —                   2
Convolution2   3 × 3                64                  1
MaxPooling2    2 × 2                —                   2
quality of the classifier, and a larger AUC represents better performance.
6.2. Experiment Based on CNN-RF Model. The experiment is conducted on the normalized consumption data. With a batch size of 100, the training procedure is stopped after 300 epochs. As shown in Figure 8, the training and testing losses settle down smoothly, which means that the proposed model learned the training dataset with only a small bias on the test dataset as well. The RF classifier then serves as the last layer of the CNN to predict the labels of input consumers: forty values from the fully connected layer of the trained CNN are used as a new feature vector to detect abnormal users and are fed to the RF for learning and testing. To build the RF in the hybrid model, the optimal parameters are determined by applying the grid search algorithm on the training dataset. The grid search range of each parameter is given as follows: numTrees ∈ {50, 60, …, 100} and maxDepth ∈ {10, 20, 30, 40}, so 6 × 4 = 24 searches are performed with
Require:
  (1) Original training dataset S = {(x_i, y_i), i = 1, 2, 3, …, m} = (X, Y) ∈ R^e × R
  (2) Test dataset x_j ∈ R^m
for d = 1 to N_tree do
  (1) Draw a bootstrap sample S_d from the original training data
  (2) Grow an unpruned tree h_d using data S_d:
      (a) Randomly select a new feature set M_try from the original feature set e
      (b) Select the best feature from M_try based on the Gini indicator at each node
      (c) Split until each tree grows to its maximum
end for
Ensure:
  (1) The collection of trees {h_d, d = 1, 2, …, N_tree}
  (2) For x_j, the output of a decision tree is h_d(x_j)
      f(x_j) = majority vote {h_d(x_j)}, d = 1, …, N_tree
return f(x_j)

Algorithm 1: Random forest algorithm.
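Algorithm 1 can be sketched with scikit-learn decision trees, where the random per-node feature subset M_try is delegated to `max_features` and the majority vote is taken explicitly; the synthetic data and tree count here are our own choices for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (150, 6)), rng.normal(2, 1, (150, 6))])
y = np.array([0] * 150 + [1] * 150)

n_tree = 25
forest = []
for _ in range(n_tree):
    # (1) bootstrap sample S_d: drawn with replacement from the training data
    idx = rng.integers(0, len(X), size=len(X))
    # (2) grow an unpruned tree h_d; max_features="sqrt" picks a random
    #     feature subset M_try at each node, split by the Gini indicator
    tree = DecisionTreeClassifier(max_features="sqrt", criterion="gini",
                                  random_state=0)
    forest.append(tree.fit(X[idx], y[idx]))

def predict(forest, X_new):
    votes = np.stack([t.predict(X_new) for t in forest])  # h_d(x_j) per tree
    return (votes.mean(axis=0) > 0.5).astype(int)         # majority vote f(x_j)

train_acc = (predict(forest, X) == y).mean()
print(train_acc)
```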
[Figure 7: Process of forward propagation (a) and backpropagation (b).]
[Figure 8: Loss curves of training and testing over 300 epochs.]
Table 2: Confusion matrix applied in electricity theft detection

Actual \ Detected   Normal               Abnormal
Normal              TP (true positive)   FN (false negative)
Abnormal            FP (false positive)  TN (true negative)
different combinations. The best result is achieved when numTrees = 100 and maxDepth = 30; these parameters are then used to train the hybrid model.
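The grid search over numTrees and maxDepth can be reproduced with scikit-learn's `GridSearchCV`; this sketch uses a reduced grid and synthetic 40-dimensional vectors standing in for the forty CNN features described above (the paper's full grid is 6 × 4 = 24 combinations):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 40)), rng.normal(1, 1, (100, 40))])
y = np.array([0] * 100 + [1] * 100)   # stand-in for CNN feature vectors + labels

param_grid = {"n_estimators": [50, 100],   # numTrees (paper: 50, 60, ..., 100)
              "max_depth": [10, 30]}       # maxDepth (paper: 10, 20, 30, 40)
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3, scoring="f1")
search.fit(X, y)
print(search.best_params_)
```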
0
01
02
03
04
05
06
07
08
09
1
Perfo
rman
ce m
etric
CNN
CNN
-RF
CNN
-GBD
T
CNN
-SV
M
GBD
T
SVM
Different algorithms
LR RF
PrecisionRecallF1 score
Figure 13 Comparison with other algorithms based on the LCLdataset
10 Journal of Electrical and Computer Engineering
based on the train dataset -e performances of all themodels on the test dataset are shown in Figures 12 and 13
Figures 12 and 13 show that the performances of CNN-SVM CNN-GBDT and CNN-RF are better than those ofother models on the whole Since they are deep learningmodels the distinguishable features can be learned directlyfrom raw smart meter data by the CNN which has equippedthem with better generalization ability In addition theCNN-RF model has achieved the best performance since theRF can handle high-dimensional data while maintaininghigh computational efficiency
7 Conclusions
In this paper a novel CNN-RF model is presented to detectelectricity theft In this model the CNN is similar to anautomatic feature extractor in investigating smart meter dataand the RF is the output classifier Because a large number ofparameters must be optimized that increase the risk ofoverfitting a fully connected layer with a dropout rate of 04is designed during the training phase In addition the SMOTalgorithm is adopted to overcome the problem of dataimbalance Some machine learning and deep learningmethods such as SVM RF GBDT and LR are applied to thesame problem as a benchmark and all those methods havebeen conducted on SEAI and LCL datasets -e resultsindicate that the proposed CNN-RF model is quite apromising classification method in the electricity theft de-tection field because of two properties -e first is thatfeatures can be automatically extracted by the hybrid modelwhile the success of most other traditional classifiers relieslargely on the retrieval of good hand-designed featureswhich is a laborious and time-consuming task -e secondlies in that the hybrid model combines the advantages of theRF and CNN as both are the most popular and successfulclassifiers in the electricity theft detection field
Since the detection of electricity theft affects the privacyof consumers the future work will focus on investigatinghow the granularity and duration of smart meter data mightaffect this privacy Extending the proposed hybrid CNN-RFmodel to other applications (eg load forecasting) is a taskworth investigating
Data Availability
Previously reported [SEAI][LCL] datasets were used tosupport this study and are available at [httpwwwucdieissdadata][httpsbetaukdataserviceacukdatacataloguestudiesstudyid=7857] -ese prior studies (and datasets)are cited at relevant places within the text as references[19 26]
Conflicts of Interest
-e authors declare that there are no conflicts of interest
Acknowledgments
-is work was supported by the National Key Research andDevelopment Program of China (2016YFB0901900) the
Natural Science Foundation of Hebei Province of China(F2017501107) Open Research Fund from the State KeyLaboratory of Rolling and Automation Northeastern Uni-versity (2017RALKFKT003) the Foundation of Northeast-ern University at Qinhuangdao (XNB201803) and theFundamental Research Funds for the Central Universities(N182303037)
is that dropping units randomly from the neural network during training can prevent units from coadapting too much [32] and makes a neuron not rely on the presence of other specific neurons. Applying an appropriate training method is also useful for reducing overfitting; it is essentially a regularization that adds a penalty to the weight update at each iteration. We also use binary cross entropy as the loss function. Finally, the grid search algorithm is used to optimize RF classifier parameters such as the maximum number of decision trees and features.
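The idea from [32] can be sketched in a few lines of NumPy (an illustrative inverted-dropout pass, not the paper's TensorFlow layer; the rate of 0.4 matches the model's fully connected layer):

```python
import numpy as np

def dropout_forward(x, rate=0.4, training=True, rng=None):
    """Inverted dropout: zero each unit with probability `rate` during
    training and rescale the survivors by 1/(1 - rate), so no extra
    scaling is needed at inference time (the layer becomes the identity)."""
    if not training or rate == 0.0:
        return x
    rng = np.random.default_rng() if rng is None else rng
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

# A batch of activations from a hypothetical fully connected layer.
dropped = dropout_forward(np.ones((4, 5)), rate=0.4,
                          rng=np.random.default_rng(0))
```

Because the surviving units are rescaled during training, the same weights can be used unchanged at test time.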
5.3. Training Process of CNN-RF. The CNN-RF structure needs to train the CNN first, before the RF classifier is invoked; after that, the RF classifier is trained on the features learned by the CNN.
5.3.1. Training Process of CNN. With a batch size (the number of training samples in an iteration) of 100, the CNN is trained using the forward propagation and backpropagation algorithms. As shown in Figure 6, the data in the input layer are first transferred through the convolutional layers, pooling layers, and a fully connected layer, and the predicted value is obtained. As shown in Figure 7, the backpropagation phase then updates the parameters if the difference between the output value and the target value exceeds a certain threshold. The difference between the actual output of the neural network and the expected output determines the adjustment direction and step size of the weights among the layers of the network.
Given a sample set $D = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ with $m$ samples in total, $\tilde{y}$ is obtained at first by using the feedforward process. Over all samples, the mean of the difference between the actual output $y_i$ and the expected output $\tilde{y}_{W,b}(x_i)$ can be expressed by

\[ J(W, b) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left( \tilde{y}_{W,b}(x_i) - y_i \right)^2, \tag{8} \]

where $W$ represents the connection weights between layers of the network and $b$ is the corresponding bias.
Then each parameter is initialized with a random value generated by a normal distribution with mean 0 and variance $\vartheta$, and the parameters are updated with the gradient descent method. The corresponding expressions are as follows:

\[ W_{ij}^{(l)} = W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W, b), \tag{9} \]

\[ b_{i}^{(l)} = b_{i}^{(l)} - \alpha \frac{\partial}{\partial b_{i}^{(l)}} J(W, b), \tag{10} \]

where $\alpha$ represents the learning rate, $W_{ij}^{(l)}$ represents the connection weight between the $i$-th neuron in the $l$-th layer and the $j$-th neuron in the $(l+1)$-th layer, and $b_{i}^{(l)}$ represents the bias of the $i$-th neuron in the $l$-th layer.

Finally, all parameters $W$ and $b$ in the network are updated according to formulas (9) and (10), and these steps are repeated to reduce the objective function $J(W, b)$.
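For intuition, the updates (9) and (10) reduce, for a single linear layer, to a few lines of NumPy (a simplified sketch of gradient descent on the loss (8), not the paper's CNN training code):

```python
import numpy as np

def gradient_step(W, b, X, Y, alpha=0.1):
    """One gradient-descent update of (W, b) for the loss
    J(W, b) = (1/m) * sum_i 0.5 * (W x_i + b - y_i)^2."""
    m = X.shape[0]
    err = X @ W + b - Y            # feedforward output minus target
    grad_W = X.T @ err / m         # partial J / partial W
    grad_b = err.mean(axis=0)      # partial J / partial b
    return W - alpha * grad_W, b - alpha * grad_b

def loss(W, b, X, Y):
    return 0.5 * np.mean((X @ W + b - Y) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Y = X @ np.array([[1.0], [-2.0], [0.5]]) + 0.3   # synthetic targets
W, b = np.zeros((3, 1)), np.zeros(1)
for _ in range(200):
    W, b = gradient_step(W, b, X, Y)
# Repeated updates drive the objective J(W, b) toward zero.
```

In the actual CNN the gradients of the convolutional and pooling layers are obtained by backpropagation, but the parameter update itself has exactly this form.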
5.3.2. Training Process of RF Classifier. The RF classifier takes advantage of two powerful machine learning techniques: bagging and random feature selection. Bagging generates a bootstrap sample randomly from the learned features for growing each tree. Instead of using all features, the RF randomly selects a subset of features to determine the best splitting points, which ensures complete splitting down to single-class leaf nodes of the decision trees. Specifically, the whole process of RF classification is written as Algorithm 1.
6. Experiments and Result Analysis
All the experiments are implemented using Python 3.6 on a standard PC with an Intel Core i5-7500MQ CPU running at 3.40 GHz and with 8.0 GB of RAM. The CNN architecture is constructed based on TensorFlow [33], and the interface between the CNN and the RF is programmed using scikit-learn [34].
6.1. Performance Metrics. In this paper, the problem of electricity theft is considered a discrete two-class classification task, which should assign each consumer to one of the predefined classes (abnormal or normal).

In particular, the result of classifier validation is often presented as a confusion matrix. Table 2 shows a confusion matrix applied in electricity theft detection, where TP, FN, FP, and TN respectively represent the number of consumers classified correctly as normal, falsely as abnormal, falsely as normal, and correctly as abnormal. Although these four indices carry all of the information about the performance of the classifier, more meaningful performance measures can be extracted from them to illustrate certain performance criteria: TPR = TP/(TP + FN), FPR = FP/(FP + TN), precision = TP/(TP + FP), recall = TP/(TP + FN), and F1 score = 2 × (precision × recall)/(precision + recall). The true positive rate (TPR), also referred to as sensitivity, is the ability of the classifier to correctly identify a consumer as normal, and the false positive rate (FPR) is the risk of positively identifying an abnormal consumer. Precision is the ratio of consumers detected correctly to the total number of detected normal consumers. Recall is the ratio of the number of consumers correctly detected to the actual number of normal consumers.
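The measures above follow directly from the four confusion counts; plugging in the CNN-RF counts reported later in Section 6.2 (TP = 1041, FN = 28, FP = 23, TN = 577) gives scores close to those in Table 3 (a sketch; the helper name is ours):

```python
def detection_metrics(tp, fn, fp, tn):
    """Scalar metrics used in this paper, derived from confusion counts."""
    tpr = tp / (tp + fn)              # true positive rate (sensitivity)
    fpr = fp / (fp + tn)              # false positive rate
    precision = tp / (tp + fp)
    recall = tpr                      # identical to TPR by definition
    f1 = 2 * precision * recall / (precision + recall)
    return {"TPR": tpr, "FPR": fpr, "precision": precision,
            "recall": recall, "F1": f1}

# Confusion counts reported for CNN-RF in Section 6.2.
scores = detection_metrics(tp=1041, fn=28, fp=23, tn=577)
```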
In addition, the ROC curve is introduced by plotting all TPR values on the y-axis against their equivalent FPR values for all available thresholds on the x-axis. A ROC curve closer to the upper left indicates better detection performance, where lower FPR values are obtained at the same TPR. The AUC indicates the quality of the classifier, and a larger AUC represents better performance.

Table 1: Parameters of the proposed CNN-RF model

| Layer        | Size of the filter | Number of filters | Stride |
|--------------|--------------------|-------------------|--------|
| Convolution1 | 3 × 3              | 32                | 1      |
| MaxPooling1  | 2 × 2              | —                 | 2      |
| Convolution2 | 3 × 3              | 64                | 1      |
| MaxPooling2  | 2 × 2              | —                 | 2      |

Journal of Electrical and Computer Engineering 7
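The ROC curve and AUC described above can be computed directly from classifier scores with scikit-learn (synthetic labels and scores here, purely illustrative, not the paper's outputs):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
# Hypothetical labels (1 = normal, 0 = theft) and classifier scores:
# informative but noisy, so the curve is good without being perfect.
y_true = rng.integers(0, 2, size=500)
y_score = 0.6 * y_true + 0.7 * rng.random(500)

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one (FPR, TPR) per threshold
auc = roc_auc_score(y_true, y_score)
# A curve hugging the upper-left corner yields an AUC close to 1.0.
```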
6.2. Experiment Based on CNN-RF Model. The experiment is conducted based on normalized consumption data. With a batch size of 100, the training procedure is stopped after 300 epochs. As shown in Figure 8, the training and testing losses settle down smoothly, which means that the proposed model learned the train dataset and showed only a small bias on the test dataset as well. The RF classifier then acts as the last layer of the CNN to predict the labels of input consumers: forty values from the fully connected layer of the trained CNN are used as a new feature vector to detect abnormal users and are fed to the RF for learning and testing. To build the RF in the hybrid model, the optimal parameters are determined by applying the grid search algorithm on the training dataset. The grid searching range of each parameter is given as follows: numTrees ∈ {50, 60, 70, 80, 90, 100} and maxDepth ∈ {10, 20, 30, 40}; all 6 × 4 = 24 combinations are tried.
Algorithm 1: Random forest algorithm

Require:
(1) Original training dataset S = {(x_i, y_i), i = 1, 2, 3, ..., m} = (X, Y) ⊂ ℝ^e × ℝ
(2) Test sample x_j ∈ ℝ^e

for d = 1 to N_tree do
    (1) Draw a bootstrap sample S_d from the original training data
    (2) Grow an unpruned tree h_d using data S_d:
        (a) Randomly select a new feature set M_try from the original feature set e
        (b) Select the best feature from the feature set M_try based on the Gini indicator on each node
        (c) Split until each tree grows to its maximum
end for

Ensure:
(1) The collection of trees {h_d, d = 1, 2, ..., N_tree}
(2) For x_j, the output of decision tree d is h_d(x_j)
f(x_j) = majority vote {h_d(x_j)}, d = 1, ..., N_tree
return f(x_j)
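Algorithm 1 maps almost line-for-line onto scikit-learn's decision trees; the following is an illustrative re-implementation of the bagging and majority-vote steps for binary labels (max_features="sqrt" plays the role of M_try, and the default split criterion is Gini), not the tuned model from the paper:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_forest(X, y, n_trees=15, seed=0):
    """Grow unpruned trees, each on its own bootstrap sample S_d."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), size=len(X))          # bootstrap S_d
        tree = DecisionTreeClassifier(max_features="sqrt",
                                      random_state=int(rng.integers(1 << 31)))
        forest.append(tree.fit(X[idx], y[idx]))
    return forest

def predict_forest(forest, X):
    """Majority vote over the trees' class predictions (labels 0/1)."""
    votes = np.stack([tree.predict(X) for tree in forest])  # (n_trees, n)
    return (votes.mean(axis=0) >= 0.5).astype(int)

# Demo on an easy synthetic problem.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
forest = fit_forest(X, y)
pred = predict_forest(forest, X)
```

In practice the paper uses scikit-learn's `RandomForestClassifier`, which bundles exactly these steps.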
Figure 7: Process of forward propagation (a) and backpropagation (b). (Both panels trace the input x through the convolution and pooling weights W_conv and W_pool, with biases b_x, b_conv, and b_pool, to the cost J(W, b).)
Figure 8: Loss curves of training and testing over 300 epochs.
Table 2: Confusion matrix applied in electricity theft detection

| Actual \ Detected | Normal              | Abnormal            |
|-------------------|---------------------|---------------------|
| Normal            | TP (true positive)  | FN (false negative) |
| Abnormal          | FP (false positive) | TN (true negative)  |
The best result is achieved when numTrees = 100 and maxDepth = 30. These parameters are then used to train the hybrid model.
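The search described here can be reproduced with scikit-learn's `GridSearchCV` (sketched on synthetic stand-ins for the 40 extracted CNN features; on the real features the best combination reported is numTrees = 100, maxDepth = 30):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))               # stand-in for the 40 CNN features
y = (X[:, :5].sum(axis=1) > 0).astype(int)   # synthetic labels

param_grid = {
    "n_estimators": [50, 60, 70, 80, 90, 100],  # numTrees candidates
    "max_depth": [10, 20, 30, 40],              # maxDepth candidates
}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3, scoring="f1")
search.fit(X, y)                 # evaluates all 6 x 4 = 24 combinations
best = search.best_params_
```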
The classification results of the proposed detection model are as follows: the number of TPs, FNs, FPs, and TNs is 1041, 28, 23, and 577, respectively. Therefore, the precision, recall, and F1 score can be calculated as shown in Table 3, where Class 0 is the abnormal pattern and Class 1 is the normal pattern. The AUC value of the CNN-RF algorithm is also calculated, which is 0.98 and far better than the value of the baseline model (AUC = 0.5). This means the proposed algorithm is able to classify both classes accurately, as shown in Figure 9.
6.3. Comparative Experiments and the Analysis of Results. To evaluate the accuracy of the CNN-RF model, comparative experiments are conducted based on handcrafted features with nondeep learning methods, including SVM, RF, gradient boosting decision tree (GBDT), and logistic regression (LR). Moreover, we compared the classification results obtained by various supervised classifiers, CNN features with the SVM classifier (CNN-SVM) and CNN features with the GBDT classifier (CNN-GBDT), to those obtained by the previous classification work. In the following, the five methods are introduced, and then the results are analysed.

(1) Logistic regression: a basic model for binary classification, equivalent to one layer of a neural network with a sigmoid activation function. The sigmoid obtains a value ranging from 0 to 1 based on linear regression; any value larger than 0.5 is classified as a normal pattern and any value less than 0.5 as an abnormal pattern.

(2) Support vector machine: a classifier with kernel functions that can transform a nonlinearly separable problem into a linearly separable one by projecting data into the feature space and then finding the optimal separating hyperplane. It is applied to predict the patterns of consumers based on the aforementioned handcrafted features.

(3) Random forest: essentially an ensemble of multiple decision trees that achieves better performance than a single decision tree while maintaining effective control of overfitting. The RF classifier can also handle high-dimensional data while maintaining high computational efficiency.

(4) Gradient boosting decision tree: an iterative decision tree algorithm that consists of multiple decision trees. For the final output, all results or weights are accumulated in the GBDT, while random forests use majority voting.

(5) Deep learning methods: softmax is used in the last layer of the CNN, CNN-GBDT, and CNN-SVM in comparison with the proposed method.
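The four nondeep baselines can be instantiated directly from Table 4 (an illustrative sketch; the parameter magnitudes are as printed in the paper, where some decimal points may have been lost in extraction, and the synthetic features below are placeholders for the handcrafted ones):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Parameter values as listed in Table 4.
baselines = {
    "LR": LogisticRegression(penalty="l2", C=10.0),
    "SVM": SVC(C=105.0, kernel="rbf", gamma=0.0005),
    "GBDT": GradientBoostingClassifier(n_estimators=200),
    "RF": RandomForestClassifier(n_estimators=100, max_depth=30),
}

# Smoke test on synthetic features standing in for the handcrafted ones.
X = np.random.default_rng(0).normal(size=(120, 10))
y = (X[:, 0] > 0).astype(int)
accuracy = {name: m.fit(X, y).score(X, y) for name, m in baselines.items()}
```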
Table 4 summarizes the parameters of the comparative methods. Figure 10 shows the results of the different methods and the proposed hybrid CNN-RF method: the AUC values of CNN-RF, CNN-GBDT, CNN-SVM, CNN, SVM, RF, LR, and GBDT are 0.99, 0.97, 0.98, 0.93, 0.77, 0.91, 0.63, and 0.77, respectively. Figure 11 presents the results of all comparative experiments in terms of precision, recall, and F1 score. Among the eight electricity theft detection algorithms, the deep learning methods (CNN, CNN-RF, CNN-SVM, and CNN-GBDT) show better performance than the machine learning methods (LR, SVM, GBDT, and RF). The reason is that the CNN can learn features from a large amount of electricity consumption data, so the accuracy of electricity theft detection is greatly improved. In addition, the proposed CNN-RF model shows the best performance compared with the other methods.
Table 3: Classification score summary for CNN-RF

|               | Precision | Recall | F1 score |
|---------------|-----------|--------|----------|
| Class 0       | 0.97      | 0.96   | 0.96     |
| Class 1       | 0.98      | 0.98   | 0.98     |
| Average/total | 0.97      | 0.97   | 0.97     |
Figure 9: ROC curve for CNN-RF (TPR vs. FPR; AUC (CNN-RF) = 0.99).
Table 4: Experimental parameters for comparative methods

| Methods | Input data    | Parameters                                                                          |
|---------|---------------|-------------------------------------------------------------------------------------|
| LR      | Features (1D) | Penalty: L2; inverse of regularization strength: 10                                  |
| SVM     | Features (1D) | Penalty parameter of the error term: 105; parameter of the kernel function (RBF): 0.0005 |
| GBDT    | Features (1D) | Number of estimators: 200                                                            |
| RF      | Features (1D) | Number of trees in the forest: 100; max depth of each tree: 30                       |
| CNN     | Raw data (2D) | Same as the CNN component of the proposed approach                                   |
As detecting electricity theft consumers is achieved by discovering the anomalous consumption behaviours of suspicious customers, we can essentially also regard it as an anomaly detection problem. Since CNN-RF has shown superior performance on the SEAI dataset, it is interesting to test its generalization ability on other datasets. In particular, we have further tested the performance of CNN-RF, together with the above-mentioned models, on another dataset, the Low Carbon London (LCL) dataset [35]. The LCL dataset is composed of over 5500 honest electricity consumers in total, each with 525 days of readings (the sampling rate is one hour). Among the 5500 customers, we randomly chose 1200 as malicious customers and modified their intraday load profiles according to fraudulent sample generation methods similar to those used for the SEAI dataset.
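The exact fraudulent sample generation follows the SEAI procedure referenced above; purely as an illustration, one common attack model scales down intraday readings by a random factor (the function name, the factor range, and the assumption of 24 hourly readings per day are ours, not the paper's):

```python
import numpy as np

def make_theft_profile(load, rng, low=0.1, high=0.8):
    """Simulate under-reporting: scale each day's readings by a random
    factor drawn from [low, high], reducing the reported consumption."""
    factor = rng.uniform(low, high, size=load.shape[0])[:, None]
    return load * factor

rng = np.random.default_rng(0)
honest = rng.uniform(0.1, 2.0, size=(525, 24))   # 525 days of hourly readings
theft = make_theft_profile(honest, rng)
```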
After preprocessing the LCL dataset (similar to the SEAI dataset), the train dataset contains 7584 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1700 customers together with corresponding labels indicating anomaly or not. The training of the proposed CNN-RF model and the comparative models then follows the aforementioned procedure
Figure 11: Comparison with other algorithms (precision, recall, and F1 score) based on the SEAI dataset.
Figure 10: ROC curves for eight different algorithms based on the SEAI dataset. AUC values: CNN-RF = 0.99, CNN-SVM = 0.98, CNN-GBDT = 0.97, CNN = 0.93, RF = 0.91, GBDT = 0.77, SVM = 0.77, LR = 0.63.
Figure 12: ROC curves for eight different algorithms based on the LCL dataset. AUC values: CNN-RF = 0.97, CNN-SVM = 0.96, CNN = 0.95, CNN-GBDT = 0.94, RF = 0.77, GBDT = 0.74, SVM = 0.71, LR = 0.68.
Figure 13: Comparison with other algorithms (precision, recall, and F1 score) based on the LCL dataset.
based on the train dataset. The performances of all the models on the test dataset are shown in Figures 12 and 13.

Figures 12 and 13 show that the performances of CNN-SVM, CNN-GBDT, and CNN-RF are better than those of the other models on the whole. Since they are deep learning models, distinguishable features can be learned directly from raw smart meter data by the CNN, which equips them with better generalization ability. In addition, the CNN-RF model achieves the best performance, since the RF can handle high-dimensional data while maintaining high computational efficiency.
7. Conclusions

In this paper, a novel CNN-RF model is presented to detect electricity theft. In this model, the CNN acts as an automatic feature extractor for smart meter data, and the RF is the output classifier. Because a large number of parameters must be optimized, which increases the risk of overfitting, a fully connected layer with a dropout rate of 0.4 is designed for the training phase. In addition, the SMOTE algorithm is adopted to overcome the problem of data imbalance. Machine learning and deep learning methods such as SVM, RF, GBDT, and LR are applied to the same problem as benchmarks, and all methods are evaluated on the SEAI and LCL datasets. The results indicate that the proposed CNN-RF model is a promising classification method in the electricity theft detection field because of two properties. The first is that features can be automatically extracted by the hybrid model, whereas the success of most other traditional classifiers relies largely on the retrieval of good hand-designed features, which is a laborious and time-consuming task. The second is that the hybrid model combines the advantages of the RF and the CNN, as both are among the most popular and successful classifiers in the electricity theft detection field.

Since the detection of electricity theft affects the privacy of consumers, future work will focus on investigating how the granularity and duration of smart meter data might affect this privacy. Extending the proposed hybrid CNN-RF model to other applications (e.g., load forecasting) is also a task worth investigating.
Data Availability

The previously reported SEAI and LCL datasets were used to support this study and are available at http://www.ucd.ie/issda/data and https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7857. These prior studies (and datasets) are cited at relevant places within the text as references [19, 26].
Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This work was supported by the National Key Research and Development Program of China (2016YFB0901900), the Natural Science Foundation of Hebei Province of China (F2017501107), the Open Research Fund from the State Key Laboratory of Rolling and Automation, Northeastern University (2017RALKFKT003), the Foundation of Northeastern University at Qinhuangdao (XNB201803), and the Fundamental Research Funds for the Central Universities (N182303037).
References

[1] S. S. S. R. Depuru, L. Wang, and V. Devabhaktuni, "Electricity theft: overview, issues, prevention and a smart meter based approach to control theft," Energy Policy, vol. 39, no. 2, pp. 1007–1015, 2011.

[2] J. P. Navani, N. K. Sharma, and S. Sapra, "Technical and non-technical losses in power system and its economic consequence in Indian economy," International Journal of Electronics and Computer Science Engineering, vol. 1, no. 2, pp. 757–761, 2012.

[3] S. McLaughlin, B. Holbert, A. Fawaz, R. Berthier, and S. Zonouz, "A multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[4] P. McDaniel and S. McLaughlin, "Security and privacy challenges in the smart grid," IEEE Security & Privacy Magazine, vol. 7, no. 3, pp. 75–77, 2009.

[5] T. B. Smith, "Electricity theft: a comparative analysis," Energy Policy, vol. 32, no. 1, pp. 2067–2076, 2004.

[6] J. I. Guerrero, C. Leon, I. Monedero, F. Biscarri, and J. Biscarri, "Improving knowledge-based systems with statistical techniques, text mining, and neural networks for non-technical loss detection," Knowledge-Based Systems, vol. 71, no. 4, pp. 376–388, 2014.

[7] C. C. O. Ramos, A. N. Souza, G. Chiachia, A. X. Falcão, and J. P. Papa, "A novel algorithm for feature selection using harmony search and its application for non-technical losses detection," Computers & Electrical Engineering, vol. 37, no. 6, pp. 886–894, 2011.

[8] P. Glauner, J. A. Meira, P. Valtchev, R. State, and F. Bettinger, "The challenge of non-technical loss detection using artificial intelligence: a survey," International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 760–775, 2017.

[9] S.-C. Huang, Y.-L. Lo, and C.-N. Lu, "Non-technical loss detection using state estimation and analysis of variance," IEEE Transactions on Power Systems, vol. 28, no. 3, pp. 2959–2966, 2013.

[10] O. Rahmati, H. R. Pourghasemi, and A. M. Melesse, "Application of GIS-based data driven random forest and maximum entropy models for groundwater potential mapping: a case study at Mehran region, Iran," CATENA, vol. 137, pp. 360–372, 2016.

[11] N. Edison, A. C. Aranha, and J. Coelho, "Probabilistic methodology for technical and non-technical losses estimation in distribution system," Electric Power Systems Research, vol. 97, no. 11, pp. 93–99, 2013.

[12] J. B. Leite and J. R. S. Mantovani, "Detecting and locating non-technical losses in modern distribution networks," IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 1023–1032, 2018.

[13] S. Amin, G. A. Schwartz, A. A. Cardenas, and S. S. Sastry, "Game-theoretic models of electricity theft detection in smart utility networks: providing new capabilities with advanced metering infrastructure," IEEE Control Systems Magazine, vol. 35, no. 1, pp. 66–81, 2015.

[14] A. H. Nizar, Z. Y. Dong, Y. Wang, and A. N. Souza, "Power utility nontechnical loss analysis with extreme learning machine method," IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 946–955, 2008.
[15] J Nagi K S Yap S K Tiong S K Ahmed andMMohamadldquoNontechnical loss detection for metered customers in powerutility using support vector machinesrdquo IEEE Transactions onPower Delivery vol 25 no 2 pp 1162ndash1171 2010
[16] E W S Angelos O R Saavedra O A C Cortes andA N de Souza ldquoDetection and identification of abnormalitiesin customer consumptions in power distribution systemsrdquoIEEE Transactions on Power Delivery vol 26 no 4pp 2436ndash2442 2011
[17] K A P Costa L A M Pereira R Y M NakamuraC R Pereira J P Papa and A Xavier Falcatildeo ldquoA nature-inspired approach to speed up optimum-path forest clusteringand its application to intrusion detection in computer net-worksrdquo Information Sciences vol 294 pp 95ndash108 2015
[18] R R Bhat R D Trevizan X Li and A Bretas ldquoIdentifyingnontechnical power loss via spatial and temporal deeplearningrdquo in Proceedings of the IEEE International Conferenceon Machine Learning and Applications Anaheim CA USADecember 2016
[19] M Ismail M Shahin M F Shaaban E Serpedin andK Qaraqe ldquoEfficient detection of electricity theft cyber attacksin ami networksrdquo in Proceedings of the IEEE Wireless Com-munications and Networking Conference Barcelona SpainApril 2018
[20] RMehrizi X Peng X Xu S Zhang DMetaxas and K Li ldquoAcomputer vision based method for 3D posture estimation ofsymmetrical liftingrdquo Journal of Biomechanics vol 69 no 1pp 40ndash46 2018
[21] K He X Zhang S Ren and J Sun ldquoDelving deep intorectifiers surpassing human-level performance on imagenetclassificationrdquo in Proceedings of the IEEE Conference onComputer for Sustainable Global Development pp 1026ndash1034New Delhi India March 2015
[22] A Sharif Razavian H Azizpour J Sullivan and S CarlssonldquoCNN features off-the-shelf an astounding baseline for rec-ognitionrdquo Astrophysical Journal Supplement Series vol 221no 1 pp 806ndash813 2014
[23] Z Zheng Y Yang X Niu H-N Dai and Y Zhou ldquoWide anddeep convolutional neural networks for electricity-theft de-tection to secure smart gridsrdquo IEEE Transactions on IndustrialInformatics vol 14 no 4 pp 1606ndash1615 2018
[24] D Svozil V Kvasnicka and J Pospichal ldquoIntroduction tomulti-layer feed-forward neural networksrdquo Chemometricsand Intelligent Laboratory Systems vol 39 no 1 pp 43ndash621997
[25] Irish social science data archive httpwwwucdieissdadatacommissionforenegryregulationcer
[26] P Jokar N Arianpoo and V C M Leung ldquoElectricity theftdetection in AMI using customersrsquo consumption patternsrdquoIEEE Transactions on Smart Grid vol 7 no 1 pp 216ndash2262016
[27] Y Wang Q Chen D Gan J Yang D S Kirschen andC Kang ldquoDeep learning-based socio-demographic in-formation identification from smart meter datafication fromsmart meter datardquo IEEE Transactions on Smart Grid vol 10no 3 pp 2593ndash2602 2019
[28] V Chandola A Banerjee and V Kumar ldquoAnomaly de-tection a surveyrdquo ACM Computing Surveys vol 41 no 32009
[29] H-T Cheng L Koc J Harmsen et al ldquoWide amp deep learningfor recommender systemsrdquo in Proceedings of the 1stWorkshopon Deep Learning for Recommender SystemsmdashDLRS 2016pp 7ndash10 Boston MA USA September 2016
[30] E Feczko N M Balba O Miranda-Dominguez et alldquoSubtyping cognitive profiles in autism spectrum disorderusing a functional random forest algorithmfiles in autismspectrum disorder using a functional random forest algo-rithmrdquo NeuroImage vol 172 pp 674ndash688 2018
[31] Y-L Boureau J Ponce and Y LeCun ldquoA theoretical analysisof feature pooling in visual recognitionrdquo in Proceedings of the27th International Conference on Machine Learningpp 111ndash118 Haifa Israel June 2010
[32] N Srivastava G Hinton A Krizhevsky I Sutskever andR Salakhutdinov ldquoDropout a simply way to prevent neuralnetworks from overfittingrdquo Ce Journal of Machine LearningResearch vol 15 no 1 pp 1929ndash1958 2014
[33] M Abadi A Agarwal P Barham et al ldquoTensorflow Large-scale machine learning on heterogeneous distributed sys-temsrdquo 2016 httpsarxivorgabs16030446
[34] F Pedregosa G Varoquaux A Gramfort et al ldquoScikit-learnmachine learning in pythonrdquo Journal of Machine LearningResearch vol 12 no 4 pp 2825ndash2830 2011
[35] J R Schofield ldquoLow carbon london project data from thedynamic time-of-use electricity pricing trialrdquo 2017 httpsdiscoverukdataserviceacukcataloguesn7857
12 Journal of Electrical and Computer Engineering
quality of the classifier, and a larger AUC represents better performance.
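As a concrete illustration of this metric, the AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, so it can be computed directly from scores without plotting the ROC curve. The labels and scores below are made up for illustration and are not taken from the paper's experiments:

```python
def auc_score(labels, scores):
    """AUC via the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative example: a classifier that mostly ranks class-1 samples higher
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auc_score(labels, scores))  # 8/9 ~ 0.889: one pair is misordered
```

A perfect ranking gives AUC = 1.0, and a random one gives 0.5, which is why 0.5 is used as the baseline later in this section.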
6.2. Experiment Based on the CNN-RF Model. The experiment is conducted on the normalized consumption data. With a batch size of 100, the training procedure is stopped after 300 epochs. As shown in Figure 8, the training and testing losses settle down smoothly, which means that the proposed model has learned the training dataset and shows little bias on the test dataset as well. The RF classifier then serves as the last layer of the CNN to predict the labels of input consumers: forty values from the fully connected layer of the trained CNN are used as a new feature vector to detect abnormal users and are fed to the RF for learning and testing. To build the RF in the hybrid model, the optimal parameters are determined by applying the grid search algorithm on the training dataset. The search range of each parameter is as follows: numTrees ∈ {50, 60, ..., 100} and maxDepth ∈ {10, 20, ..., 40}, so 6 × 4 = 24 are tried with
Require:
(1) Original training dataset S = {(x_i, y_i), i = 1, 2, 3, ..., m} = (X, Y) ∈ R^e × R
(2) Test dataset x_j ∈ R^m
for d = 1 to Ntree do
    (1) Draw a bootstrap sample S_d from the original training data
    (2) Grow an unpruned tree h_d using data S_d:
        (a) Randomly select a new feature set Mtry from the original feature set e
        (b) Select the best feature from Mtry based on the Gini indicator on each node
        (c) Split until each tree grows to its maximum
end for
Ensure:
(1) The collection of trees {h_d, d = 1, 2, ..., Ntree}
(2) For x_j, the output of a decision tree is h_d(x_j)
f(x_j) = majority vote {h_d(x_j)}, d = 1, ..., Ntree
return f(x_j)

Algorithm 1: Random forest algorithm.
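The RF and its grid search can be sketched with scikit-learn, which the experiments rely on [34]. The random feature matrix below is only a stand-in for the 40-dimensional CNN feature vectors, and the grid mirrors the 6 × 4 = 24 combinations described in the text:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.RandomState(0)
X = rng.rand(120, 40)             # stand-in for 40-D CNN feature vectors
y = rng.randint(0, 2, size=120)   # stand-in for normal/abnormal labels

# 6 numTrees values x 4 maxDepth values = 24 combinations, as in the text
param_grid = {
    "n_estimators": [50, 60, 70, 80, 90, 100],  # numTrees
    "max_depth": [10, 20, 30, 40],              # maxDepth
}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3)
search.fit(X, y)
print("best:", search.best_params_)  # on the real features the text reports
                                     # numTrees = 100, maxDepth = 30
```

With 3-fold cross-validation, every one of the 24 settings is scored and the winner is refit on the full training data, which matches the procedure described above.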
Figure 7: Process of forward propagation (a) and backpropagation (b), involving the input x, the weights Wconv and Wpool, the biases bx, bconv, and bpool, and the cost J(W, b).
Figure 8: Loss curves of training and testing over 300 epochs.
Table 2: Confusion matrix applied in electricity theft detection.

Actual \ Detected    Normal                 Abnormal
Normal               TP (true positive)     FN (false negative)
Abnormal             FP (false positive)    TN (true negative)
8 Journal of Electrical and Computer Engineering
different combinations. The best result is achieved when numTrees = 100 and maxDepth = 30. These parameters are then used to train the hybrid model.

The classification results of the proposed detection model are as follows: the numbers of TPs, FNs, FPs, and TNs are 1041, 28, 23, and 577, respectively. Therefore, the precision, recall, and F1 score can be calculated as shown in Table 3, where Class 0 is the abnormal pattern and Class 1 is the normal pattern. In addition, the AUC value of the CNN-RF algorithm is 0.98, far better than that of the baseline model (AUC = 0.5). This means the proposed algorithm is able to classify both classes accurately, as shown in Figure 9.
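Using the confusion counts reported above (TP = 1041, FN = 28, FP = 23, TN = 577, with normal consumption as the positive class per Table 2), the summary metrics can be recomputed directly; small discrepancies with Table 3 come from rounding:

```python
TP, FN, FP, TN = 1041, 28, 23, 577  # counts reported in the text

accuracy  = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)   # of samples flagged normal, how many truly are
recall    = TP / (TP + FN)   # of truly normal samples, how many are flagged
f1        = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
# accuracy~0.969, precision~0.978, recall~0.974, f1~0.976
```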
6.3. Comparative Experiments and the Analysis of Results. To evaluate the accuracy of the CNN-RF model, comparative experiments are conducted on handcrafted features with nondeep learning methods, including SVM, RF, gradient boosting decision tree (GBDT), and logistic regression (LR). Moreover, we compared the classification results obtained by other supervised classifiers built on CNN features, namely CNN features with the SVM classifier (CNN-SVM) and CNN features with the GBDT classifier (CNN-GBDT), to those obtained by the previous classification work. In the following, the five methods are introduced, and then the results are analysed.

(1) Logistic regression: a basic model for binary classification, equivalent to a one-layer neural network with a sigmoid activation function. The sigmoid maps the output of a linear regression to a value between 0 and 1; any value larger than 0.5 is classified as a normal pattern and any value smaller than 0.5 as an abnormal pattern.

(2) Support vector machine: a classifier that, with kernel functions, can transform a nonlinearly separable problem into a linearly separable one by projecting the data into a feature space and then finding the optimal separating hyperplane. It is applied to predict the patterns of consumers based on the aforementioned handcrafted features.

(3) Random forest: essentially an ensemble of multiple decision trees, which achieves better performance than a single decision tree while effectively controlling overfitting. The RF classifier can also handle high-dimensional data while maintaining high computational efficiency.

(4) Gradient boosting decision tree: an iterative decision tree algorithm that consists of multiple decision trees. For the final output, the GBDT accumulates the weighted results of all trees, whereas random forests use majority voting.

(5) Deep learning methods: softmax is used in the last layer of the plain CNN, while CNN-GBDT and CNN-SVM replace it with the corresponding classifiers, for comparison with the proposed method.
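A minimal sketch of how the four nondeep baselines can be assembled with scikit-learn. The synthetic features and labels below are placeholders for the handcrafted features, and the hyperparameters are illustrative rather than those of Table 4:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(1)
X = rng.rand(400, 10)                                    # placeholder features
y = (X[:, 0] + 0.1 * rng.randn(400) > 0.5).astype(int)   # toy labels
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

models = {
    "LR":   LogisticRegression(),
    "SVM":  SVC(kernel="rbf", probability=True),  # probability=True enables AUC
    "RF":   RandomForestClassifier(n_estimators=100, random_state=0),
    "GBDT": GradientBoostingClassifier(n_estimators=200, random_state=0),
}
for name, model in models.items():
    model.fit(Xtr, ytr)
    auc = roc_auc_score(yte, model.predict_proba(Xte)[:, 1])
    print(f"{name}: AUC = {auc:.2f}")
```

Evaluating every model on the same held-out split with the same metric is what makes the AUC comparison in Figure 10 meaningful.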
Table 4 summarizes the parameters of the comparative methods. Figure 10 shows the ROC curves of the different methods and the proposed hybrid CNN-RF method: the AUC values of CNN-RF, CNN-GBDT, CNN-SVM, CNN, SVM, RF, LR, and GBDT are 0.99, 0.97, 0.98, 0.93, 0.77, 0.91, 0.63, and 0.77, respectively. Figure 11 presents the results of all comparative experiments in terms of precision, recall, and F1 score. Among the eight electricity theft detection algorithms, the deep learning methods (CNN, CNN-RF, CNN-SVM, and CNN-GBDT) perform better than the nondeep machine learning methods (LR, SVM, GBDT, and RF). The reason is that the CNN can learn features from a large amount of electricity consumption data, which greatly improves the accuracy of electricity theft detection. Moreover, the proposed CNN-RF model shows the best performance of all the compared methods.
Table 3: Classification score summary for CNN-RF.

                 Precision    Recall    F1 score
Class 0            0.97        0.96       0.96
Class 1            0.98        0.98       0.98
Average/total      0.97        0.97       0.97
Figure 9: ROC curve for CNN-RF (AUC = 0.99).
Table 4: Experimental parameters for comparative methods.

Methods    Input data       Parameters
LR         Features (1D)    Penalty: L2; inverse of regularization strength: 10
SVM        Features (1D)    Penalty parameter of the error term: 105; RBF kernel parameter: 0.0005
GBDT       Features (1D)    Number of estimators: 200
RF         Features (1D)    Number of trees in the forest: 100; max depth of each tree: 30
CNN        Raw data (2D)    The same as the CNN component of the proposed approach
As detecting electricity theft is achieved by discovering the anomalous consumption behaviours of suspicious customers, it can essentially be regarded as an anomaly detection problem. Since the CNN-RF has shown superior performance on the SEAI dataset, it is interesting to test its generalization ability on other datasets. In particular, we have further tested the performance of the CNN-RF, together with the above-mentioned models, on another dataset: the Low Carbon London (LCL) dataset [35]. The LCL dataset is composed of over 5500 honest electricity consumers in total, each with 525 days of hourly consumption readings. Among the 5500 customers, we randomly chose 1200 as malicious customers and modified their intraday load profiles according to fraudulent sample generation methods similar to those used for the SEAI dataset.
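The exact fraudulent sample generation rules are not restated here; one common attack model from this literature (e.g., [26]) under-reports consumption by scaling each intraday reading with a random factor. The sketch below assumes that style and is not the authors' exact procedure:

```python
import random

def make_malicious(profile, seed=42):
    """Turn an honest intraday load profile (kWh per hour) into a
    fraudulent one by scaling each reading by a random factor in
    [0.1, 0.8] -- one typical under-reporting attack pattern assumed
    from the literature, not the paper's exact rule."""
    rng = random.Random(seed)
    return [round(x * rng.uniform(0.1, 0.8), 4) for x in profile]

honest = [0.5, 0.4, 0.3, 0.6, 1.2, 1.5, 1.1, 0.8]  # hypothetical hourly readings
fraud = make_malicious(honest)
assert all(f < h for f, h in zip(fraud, honest))   # theft under-reports usage
```

Labelling such modified profiles as abnormal yields the ground truth needed to train and evaluate the supervised detectors on the LCL data.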
After preprocessing the LCL dataset (similar to the SEAI dataset), the training dataset contains 7584 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1700 customers together with corresponding labels indicating whether each is anomalous. The training of the proposed CNN-RF model and the comparative models is then all done following the aforementioned procedure
Figure 11: Comparison with other algorithms (precision, recall, and F1 score) based on the SEAI dataset.
Figure 10: ROC curves for eight different algorithms based on the SEAI dataset (AUC: CNN-RF = 0.99, CNN-SVM = 0.98, CNN-GBDT = 0.97, CNN = 0.93, RF = 0.91, GBDT = 0.77, SVM = 0.77, LR = 0.63).
Figure 12: ROC curves for eight different algorithms based on the LCL dataset (AUC: CNN-RF = 0.97, CNN-SVM = 0.96, CNN = 0.95, CNN-GBDT = 0.94, RF = 0.77, GBDT = 0.74, SVM = 0.71, LR = 0.68).
Figure 13: Comparison with other algorithms (precision, recall, and F1 score) based on the LCL dataset.
based on the training dataset. The performances of all the models on the test dataset are shown in Figures 12 and 13.

Figures 12 and 13 show that the performances of CNN-SVM, CNN-GBDT, and CNN-RF are better than those of the other models on the whole. Since they are deep learning models, distinguishable features can be learned directly from raw smart meter data by the CNN, which equips them with better generalization ability. In addition, the CNN-RF model achieves the best performance, since the RF can handle high-dimensional data while maintaining high computational efficiency.
7. Conclusions

In this paper, a novel CNN-RF model is presented to detect electricity theft. In this model, the CNN acts as an automatic feature extractor for smart meter data and the RF is the output classifier. Because a large number of parameters must be optimized, which increases the risk of overfitting, a fully connected layer with a dropout rate of 0.4 is designed for the training phase. In addition, the SMOTE algorithm is adopted to overcome the problem of data imbalance. Several machine learning and deep learning methods, such as SVM, RF, GBDT, and LR, are applied to the same problem as benchmarks, and all methods are evaluated on the SEAI and LCL datasets. The results indicate that the proposed CNN-RF model is a quite promising classification method for electricity theft detection because of two properties. The first is that features can be extracted automatically by the hybrid model, while the success of most traditional classifiers relies largely on the retrieval of good hand-designed features, which is a laborious and time-consuming task. The second is that the hybrid model combines the advantages of the RF and the CNN, which are among the most popular and successful classifiers in the electricity theft detection field.

Since the detection of electricity theft affects the privacy of consumers, future work will focus on investigating how the granularity and duration of smart meter data might affect this privacy. Extending the proposed hybrid CNN-RF model to other applications (e.g., load forecasting) is also a task worth investigating.
Data Availability
Previously reported SEAI and LCL datasets were used to support this study and are available at http://www.ucd.ie/issda/data and https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7857, respectively. These prior studies (and datasets) are cited at relevant places within the text as references [19, 26].
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Acknowledgments
This work was supported by the National Key Research and Development Program of China (2016YFB0901900), the Natural Science Foundation of Hebei Province of China (F2017501107), the Open Research Fund from the State Key Laboratory of Rolling and Automation, Northeastern University (2017RALKFKT003), the Foundation of Northeastern University at Qinhuangdao (XNB201803), and the Fundamental Research Funds for the Central Universities (N182303037).
References
[1] S. S. S. R. Depuru, L. Wang, and V. Devabhaktuni, "Electricity theft: overview, issues, prevention and a smart meter based approach to control theft," Energy Policy, vol. 39, no. 2, pp. 1007–1015, 2011.
[2] J. P. Navani, N. K. Sharma, and S. Sapra, "Technical and non-technical losses in power system and its economic consequence in Indian economy," International Journal of Electronics and Computer Science Engineering, vol. 1, no. 2, pp. 757–761, 2012.
[3] S. McLaughlin, B. Holbert, A. Fawaz, R. Berthier, and S. Zonouz, "A multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[4] P. McDaniel and S. McLaughlin, "Security and privacy challenges in the smart grid," IEEE Security & Privacy Magazine, vol. 7, no. 3, pp. 75–77, 2009.
[5] T. B. Smith, "Electricity theft: a comparative analysis," Energy Policy, vol. 32, no. 1, pp. 2067–2076, 2004.
[6] J. I. Guerrero, C. Leon, I. Monedero, F. Biscarri, and J. Biscarri, "Improving knowledge-based systems with statistical techniques, text mining, and neural networks for non-technical loss detection," Knowledge-Based Systems, vol. 71, no. 4, pp. 376–388, 2014.
[7] C. C. O. Ramos, A. N. Souza, G. Chiachia, A. X. Falcão, and J. P. Papa, "A novel algorithm for feature selection using harmony search and its application for non-technical losses detection," Computers & Electrical Engineering, vol. 37, no. 6, pp. 886–894, 2011.
[8] P. Glauner, J. A. Meira, P. Valtchev, R. State, and F. Bettinger, "The challenge of non-technical loss detection using artificial intelligence: a survey," International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 760–775, 2017.
[9] S.-C. Huang, Y.-L. Lo, and C.-N. Lu, "Non-technical loss detection using state estimation and analysis of variance," IEEE Transactions on Power Systems, vol. 28, no. 3, pp. 2959–2966, 2013.
[10] O. Rahmati, H. R. Pourghasemi, and A. M. Melesse, "Application of GIS-based data driven random forest and maximum entropy models for groundwater potential mapping: a case study at Mehran region, Iran," CATENA, vol. 137, pp. 360–372, 2016.
[11] N. Edison, A. C. Aranha, and J. Coelho, "Probabilistic methodology for technical and non-technical losses estimation in distribution system," Electric Power Systems Research, vol. 97, no. 11, pp. 93–99, 2013.
[12] J. B. Leite and J. R. S. Mantovani, "Detecting and locating non-technical losses in modern distribution networks," IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 1023–1032, 2018.
[13] S. Amin, G. A. Schwartz, A. A. Cardenas, and S. S. Sastry, "Game-theoretic models of electricity theft detection in smart utility networks: providing new capabilities with advanced metering infrastructure," IEEE Control Systems Magazine, vol. 35, no. 1, pp. 66–81, 2015.
[14] A. H. Nizar, Z. Y. Dong, Y. Wang, and A. N. Souza, "Power utility nontechnical loss analysis with extreme learning machine method," IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 946–955, 2008.
[15] J. Nagi, K. S. Yap, S. K. Tiong, S. K. Ahmed, and M. Mohamad, "Nontechnical loss detection for metered customers in power utility using support vector machines," IEEE Transactions on Power Delivery, vol. 25, no. 2, pp. 1162–1171, 2010.
[16] E. W. S. Angelos, O. R. Saavedra, O. A. C. Cortes, and A. N. de Souza, "Detection and identification of abnormalities in customer consumptions in power distribution systems," IEEE Transactions on Power Delivery, vol. 26, no. 4, pp. 2436–2442, 2011.
[17] K. A. P. Costa, L. A. M. Pereira, R. Y. M. Nakamura, C. R. Pereira, J. P. Papa, and A. Xavier Falcão, "A nature-inspired approach to speed up optimum-path forest clustering and its application to intrusion detection in computer networks," Information Sciences, vol. 294, pp. 95–108, 2015.
[18] R. R. Bhat, R. D. Trevizan, X. Li, and A. Bretas, "Identifying nontechnical power loss via spatial and temporal deep learning," in Proceedings of the IEEE International Conference on Machine Learning and Applications, Anaheim, CA, USA, December 2016.
[19] M. Ismail, M. Shahin, M. F. Shaaban, E. Serpedin, and K. Qaraqe, "Efficient detection of electricity theft cyber attacks in AMI networks," in Proceedings of the IEEE Wireless Communications and Networking Conference, Barcelona, Spain, April 2018.
[20] R. Mehrizi, X. Peng, X. Xu, S. Zhang, D. Metaxas, and K. Li, "A computer vision based method for 3D posture estimation of symmetrical lifting," Journal of Biomechanics, vol. 69, no. 1, pp. 40–46, 2018.
[21] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: surpassing human-level performance on ImageNet classification," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034, Santiago, Chile, December 2015.
[22] A. Sharif Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, "CNN features off-the-shelf: an astounding baseline for recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 806–813, Columbus, OH, USA, June 2014.
[23] Z. Zheng, Y. Yang, X. Niu, H.-N. Dai, and Y. Zhou, "Wide and deep convolutional neural networks for electricity-theft detection to secure smart grids," IEEE Transactions on Industrial Informatics, vol. 14, no. 4, pp. 1606–1615, 2018.
[24] D. Svozil, V. Kvasnicka, and J. Pospichal, "Introduction to multi-layer feed-forward neural networks," Chemometrics and Intelligent Laboratory Systems, vol. 39, no. 1, pp. 43–62, 1997.
[25] Irish Social Science Data Archive, http://www.ucd.ie/issda/data/commissionforenergyregulationcer/.
[26] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[27] Y. Wang, Q. Chen, D. Gan, J. Yang, D. S. Kirschen, and C. Kang, "Deep learning-based socio-demographic information identification from smart meter data," IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2593–2602, 2019.
[28] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: a survey," ACM Computing Surveys, vol. 41, no. 3, 2009.
[29] H.-T. Cheng, L. Koc, J. Harmsen et al., "Wide & deep learning for recommender systems," in Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (DLRS 2016), pp. 7–10, Boston, MA, USA, September 2016.
[30] E. Feczko, N. M. Balba, O. Miranda-Dominguez et al., "Subtyping cognitive profiles in autism spectrum disorder using a functional random forest algorithm," NeuroImage, vol. 172, pp. 674–688, 2018.
[31] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A theoretical analysis of feature pooling in visual recognition," in Proceedings of the 27th International Conference on Machine Learning, pp. 111–118, Haifa, Israel, June 2010.
[32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
[33] M. Abadi, A. Agarwal, P. Barham et al., "TensorFlow: large-scale machine learning on heterogeneous distributed systems," 2016, https://arxiv.org/abs/1603.04467.
[34] F. Pedregosa, G. Varoquaux, A. Gramfort et al., "Scikit-learn: machine learning in Python," Journal of Machine Learning Research, vol. 12, no. 4, pp. 2825–2830, 2011.
[35] J. R. Schofield, "Low Carbon London project: data from the dynamic time-of-use electricity pricing trial," 2017, https://discover.ukdataservice.ac.uk/catalogue/?sn=7857.
different combinations -e best result is achieved whennumTrees 100 and maxDepth 30 -ese parametersare then used to train the hybrid model
-e classification results of the detection model pro-posed in this paper are as follows the number of TPs FNsFPs and TNs is 1041 28 23 and 577 respectively-erefore the precision recall and F1 score can be calcu-lated as shown in Table 3 where Class 0 is an abnormalpattern and Class 1 is a normal pattern Also the AUC valueof the CNN-RF algorithm is calculated which is 098 and farbetter than the value of the baseline model (AUC 05) -ismeans the proposed algorithm is able to classify both classesaccurately as shown in Figure 9
63 Comparative Experiments and the Analysis of ResultsTo evaluate the accuracy of the CNN-RF model comparativeexperiments are conducted based on handcrafted featureswith nondeep learning methods including SVM RF gradientboosting decision tree (GBDT) and logistic regression (LR)Moreover we compared the obtained classification results byvarious supervised classifiers CNN features with the SVMclassifier (CNN-SVM) and CNN features with the GBDTclassifier (CNN-GBDT) to those obtained by the previousclassification work In the following five methods are in-troduced respectively and then the results are analysed
(1) Logistic regression it is a basic model in binaryclassification that is equivalent to one layer of neuralnetwork with sigmoid activation function A sigmoidfunction obtains a value ranging from 0 to 1 based onlinear regression -en any value larger than 05 willbe classified as a normal pattern and less than 05 willbe classified as an abnormal pattern
(2) Support vector machine this classifier with kernelfunctions can transform a nonlinear separableproblem into a linear separable problem by pro-jecting data into the feature space and then findingthe optimal separate hyperplane It is applied topredict the patterns of consumers based on theaforementioned handcrafted features
(3) Random forest it is essentially an integration ofmultiple decision trees that can achieve better per-formance when maintaining the effective control ofoverfitting in comparison with a single decision tree-e RF classifier also can handle high-dimensionaldata while maintaining high computationalefficiency
(4) Gradient boosting decision tree it is an iterativedecision tree algorithm which consists of multipledecision trees For the final output all results orweights are accumulated in the GBDT while randomforests use majority voting
(5) Deep learning methods softmax is used in the lastlayer of the CNN CNN-GBDT and CNN-SVM incomparison with the proposed method
Table 4 summarizes the parameters of comparativemethods All experiments have been done Figure 10 shows
the results of different methods and proposed hybrid CNN-RF method and it can be seen that the AUC value of CNN-RF CNN-GBDT CNN-SVM CNN SVM RF LR andGBDT is 099 097 098 093 077 091 063 and 077 AndFigure 11 presents the results of all comparative experimentsin terms of precision recall and F1 score Among theperformances of eight different electricity theft detectionalgorithms the deep learning methods (such as CNN CNN-RF CNN-SVM and CNN-GBDT) show better perfor-mances than machine learning methods (eg LR SVMGBDT and RF)-e reason of this result is that the CNN canlearn features through a large number of electricity con-sumption data -us the accuracy of electricity theft de-tection is greatly improved In addition it is indicated thatthe proposed CNN-RF model shows the best performancecompared with other methods
Table 3 Classification score summary for CNN-RF
Precision Recall F1 scoreClass 0 097 096 096Class 1 098 098 098Averagetotal 097 097 097
[Figure 9: ROC curve for CNN-RF (TPR vs. FPR); AUC (CNN-RF) = 0.99.]
Table 4: Experimental parameters for comparative methods.

Method   Input data      Parameters
LR       Features (1D)   Penalty: L2; inverse of regularization strength: 1.0
SVM      Features (1D)   Penalty parameter of the error term: 1.05; RBF kernel parameter: 0.0005
GBDT     Features (1D)   Number of estimators: 200
RF       Features (1D)   Number of trees in the forest: 100; max depth of each tree: 30
CNN      Raw data (2D)   Same as the CNN component of the proposed approach
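Reading Table 4 with the decimal points restored (the numeric values below are this reading's interpretation of the garbled originals, not verified against the authors' code), the scikit-learn baselines could be configured roughly as:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

# Parameter values mirror Table 4 as read here; treat each number as an assumption.
baselines = {
    "LR": LogisticRegression(penalty="l2", C=1.0),
    "SVM": SVC(kernel="rbf", C=1.05, gamma=0.0005),
    "GBDT": GradientBoostingClassifier(n_estimators=200),
    "RF": RandomForestClassifier(n_estimators=100, max_depth=30),
}
```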
Journal of Electrical and Computer Engineering 9
As detecting electricity theft is achieved by discovering the anomalous consumption behaviours of suspicious customers, it can essentially be regarded as an anomaly detection problem. Since CNN-RF has shown superior performance on the SEAI dataset, it is interesting to test its generalization ability on other datasets. In particular, we have further tested the performance of CNN-RF, together with the above-mentioned models, on another dataset, the Low Carbon London (LCL) dataset [35]. The LCL dataset comprises over 5500 honest electricity consumers in total, each with 525 days of records at an hourly sampling rate. Among the 5500 customers, we randomly chose 1200 as malicious customers and modified their intraday load profiles according to fraudulent sample generation methods similar to those used for the SEAI dataset.
After preprocessing the LCL dataset (in the same way as the SEAI dataset), the training dataset contains 7584 customers, in which the numbers of normal and abnormal samples are equal, and the test dataset consists of 1700 customers together with corresponding labels indicating whether each is anomalous. The training of the proposed CNN-RF model and the comparative models is then all done following the aforementioned procedure
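The exact fraudulent-sample generation functions are not reproduced in this section. A common tampering model in this literature, and one plausible reading of "modified their intraday load profiles", is to scale down hourly readings by random factors to mimic under-reporting; the sketch below is that assumption, not the authors' exact attack model:

```python
import numpy as np

rng = np.random.default_rng(42)

def make_theft_profile(load_day, rng):
    """Illustrative tampering: multiply each hourly reading by a random
    factor in [0.1, 0.8], mimicking under-reported consumption. This is a
    plausible stand-in, not the paper's exact generation method."""
    factors = rng.uniform(0.1, 0.8, size=load_day.shape)
    return load_day * factors

honest_day = rng.uniform(0.2, 2.0, size=24)   # 24 hourly readings (kWh)
theft_day = make_theft_profile(honest_day, rng)
```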
[Figure 11: Comparison of precision, recall, and F1 score for the eight algorithms (CNN, CNN-RF, CNN-GBDT, CNN-SVM, GBDT, LR, RF, and SVM) on the SEAI dataset.]
[Figure 10: ROC curves (TPR vs. FPR) for the eight algorithms on the SEAI dataset; AUC: CNN-RF = 0.99, CNN-SVM = 0.98, CNN-GBDT = 0.97, CNN = 0.93, RF = 0.91, GBDT = 0.77, SVM = 0.77, LR = 0.63.]
[Figure 12: ROC curves (TPR vs. FPR) for the eight algorithms on the LCL dataset; AUC: CNN-RF = 0.97, CNN-SVM = 0.96, CNN = 0.95, CNN-GBDT = 0.94, RF = 0.77, GBDT = 0.74, SVM = 0.71, LR = 0.68.]
[Figure 13: Comparison of precision, recall, and F1 score for the eight algorithms on the LCL dataset.]
based on the training dataset. The performances of all the models on the test dataset are shown in Figures 12 and 13.

Figures 12 and 13 show that CNN-SVM, CNN-GBDT, and CNN-RF perform better than the other models on the whole. Because they are deep learning models, distinguishable features can be learned directly from the raw smart meter data by the CNN, which equips them with better generalization ability. In addition, the CNN-RF model achieves the best performance, since the RF can handle high-dimensional data while maintaining high computational efficiency.
7 Conclusions
In this paper, a novel CNN-RF model is presented to detect electricity theft. In this model, the CNN acts as an automatic feature extractor for smart meter data, and the RF serves as the output classifier. Because a large number of parameters must be optimized, which increases the risk of overfitting, a fully connected layer with a dropout rate of 0.4 is used during the training phase. In addition, the SMOTE algorithm is adopted to overcome the problem of data imbalance. Several machine learning and deep learning methods, such as SVM, RF, GBDT, and LR, are applied to the same problem as benchmarks, and all of these methods have been evaluated on the SEAI and LCL datasets. The results indicate that the proposed CNN-RF model is a promising classification method for electricity theft detection because of two properties. The first is that features can be extracted automatically by the hybrid model, whereas the success of most traditional classifiers relies largely on the retrieval of good hand-designed features, which is a laborious and time-consuming task. The second is that the hybrid model combines the advantages of the RF and the CNN, two of the most popular and successful classifiers in the electricity theft detection field.
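For reference, SMOTE synthesizes new minority-class (theft) samples by interpolating between a minority sample and one of its minority-class neighbours. Below is a minimal sketch of that interpolation step only; the full algorithm (as implemented, e.g., in imbalanced-learn) adds k-nearest-neighbour selection and a sampling ratio:

```python
import numpy as np

rng = np.random.default_rng(7)

def smote_interpolate(x, neighbor, rng):
    """Core SMOTE step: place a synthetic sample on the line segment
    between a minority sample and one of its minority-class neighbours."""
    gap = rng.uniform(0.0, 1.0)
    return x + gap * (neighbor - x)

# Two nearby minority-class (theft) feature vectors, as an illustration.
x = np.array([1.0, 2.0, 3.0])
neighbor = np.array([2.0, 2.5, 3.5])
synthetic = smote_interpolate(x, neighbor, rng)
```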
Since the detection of electricity theft affects the privacy of consumers, future work will focus on investigating how the granularity and duration of smart meter data might affect this privacy. Extending the proposed hybrid CNN-RF model to other applications (e.g., load forecasting) is also worth investigating.
Data Availability
Previously reported SEAI and LCL datasets were used to support this study; they are available at http://www.ucd.ie/issda/data and https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7857. These prior studies (and datasets) are cited at relevant places within the text as references [19, 26].
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Acknowledgments
This work was supported by the National Key Research and Development Program of China (2016YFB0901900), the Natural Science Foundation of Hebei Province of China (F2017501107), the Open Research Fund from the State Key Laboratory of Rolling and Automation, Northeastern University (2017RALKFKT003), the Foundation of Northeastern University at Qinhuangdao (XNB201803), and the Fundamental Research Funds for the Central Universities (N182303037).
References
[1] S. S. S. R. Depuru, L. Wang, and V. Devabhaktuni, "Electricity theft: overview, issues, prevention and a smart meter based approach to control theft," Energy Policy, vol. 39, no. 2, pp. 1007–1015, 2011.
[2] J. P. Navani, N. K. Sharma, and S. Sapra, "Technical and non-technical losses in power system and its economic consequence in Indian economy," International Journal of Electronics and Computer Science Engineering, vol. 1, no. 2, pp. 757–761, 2012.
[3] S. McLaughlin, B. Holbert, A. Fawaz, R. Berthier, and S. Zonouz, "A multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[4] P. McDaniel and S. McLaughlin, "Security and privacy challenges in the smart grid," IEEE Security & Privacy Magazine, vol. 7, no. 3, pp. 75–77, 2009.
[5] T. B. Smith, "Electricity theft: a comparative analysis," Energy Policy, vol. 32, no. 1, pp. 2067–2076, 2004.
[6] J. I. Guerrero, C. Leon, I. Monedero, F. Biscarri, and J. Biscarri, "Improving knowledge-based systems with statistical techniques, text mining, and neural networks for non-technical loss detection," Knowledge-Based Systems, vol. 71, no. 4, pp. 376–388, 2014.
[7] C. C. O. Ramos, A. N. Souza, G. Chiachia, A. X. Falcão, and J. P. Papa, "A novel algorithm for feature selection using harmony search and its application for non-technical losses detection," Computers & Electrical Engineering, vol. 37, no. 6, pp. 886–894, 2011.
[8] P. Glauner, J. A. Meira, P. Valtchev, R. State, and F. Bettinger, "The challenge of non-technical loss detection using artificial intelligence: a survey," International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 760–775, 2017.
[9] S.-C. Huang, Y.-L. Lo, and C.-N. Lu, "Non-technical loss detection using state estimation and analysis of variance," IEEE Transactions on Power Systems, vol. 28, no. 3, pp. 2959–2966, 2013.
[10] O. Rahmati, H. R. Pourghasemi, and A. M. Melesse, "Application of GIS-based data driven random forest and maximum entropy models for groundwater potential mapping: a case study at Mehran region, Iran," CATENA, vol. 137, pp. 360–372, 2016.
[11] N. Edison, A. C. Aranha, and J. Coelho, "Probabilistic methodology for technical and non-technical losses estimation in distribution system," Electric Power Systems Research, vol. 97, no. 11, pp. 93–99, 2013.
[12] J. B. Leite and J. R. S. Mantovani, "Detecting and locating non-technical losses in modern distribution networks," IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 1023–1032, 2018.
[13] S. Amin, G. A. Schwartz, A. A. Cardenas, and S. S. Sastry, "Game-theoretic models of electricity theft detection in smart utility networks: providing new capabilities with advanced metering infrastructure," IEEE Control Systems Magazine, vol. 35, no. 1, pp. 66–81, 2015.
[14] A. H. Nizar, Z. Y. Dong, Y. Wang, and A. N. Souza, "Power utility nontechnical loss analysis with extreme learning machine method," IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 946–955, 2008.
[15] J. Nagi, K. S. Yap, S. K. Tiong, S. K. Ahmed, and M. Mohamad, "Nontechnical loss detection for metered customers in power utility using support vector machines," IEEE Transactions on Power Delivery, vol. 25, no. 2, pp. 1162–1171, 2010.
[16] E. W. S. Angelos, O. R. Saavedra, O. A. C. Cortes, and A. N. de Souza, "Detection and identification of abnormalities in customer consumptions in power distribution systems," IEEE Transactions on Power Delivery, vol. 26, no. 4, pp. 2436–2442, 2011.
[17] K. A. P. Costa, L. A. M. Pereira, R. Y. M. Nakamura, C. R. Pereira, J. P. Papa, and A. Xavier Falcão, "A nature-inspired approach to speed up optimum-path forest clustering and its application to intrusion detection in computer networks," Information Sciences, vol. 294, pp. 95–108, 2015.
[18] R. R. Bhat, R. D. Trevizan, X. Li, and A. Bretas, "Identifying nontechnical power loss via spatial and temporal deep learning," in Proceedings of the IEEE International Conference on Machine Learning and Applications, Anaheim, CA, USA, December 2016.
[19] M. Ismail, M. Shahin, M. F. Shaaban, E. Serpedin, and K. Qaraqe, "Efficient detection of electricity theft cyber attacks in AMI networks," in Proceedings of the IEEE Wireless Communications and Networking Conference, Barcelona, Spain, April 2018.
[20] R. Mehrizi, X. Peng, X. Xu, S. Zhang, D. Metaxas, and K. Li, "A computer vision based method for 3D posture estimation of symmetrical lifting," Journal of Biomechanics, vol. 69, no. 1, pp. 40–46, 2018.
[21] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: surpassing human-level performance on ImageNet classification," in Proceedings of the IEEE Conference on Computer for Sustainable Global Development, pp. 1026–1034, New Delhi, India, March 2015.
[22] A. Sharif Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, "CNN features off-the-shelf: an astounding baseline for recognition," Astrophysical Journal Supplement Series, vol. 221, no. 1, pp. 806–813, 2014.
[23] Z. Zheng, Y. Yang, X. Niu, H.-N. Dai, and Y. Zhou, "Wide and deep convolutional neural networks for electricity-theft detection to secure smart grids," IEEE Transactions on Industrial Informatics, vol. 14, no. 4, pp. 1606–1615, 2018.
[24] D. Svozil, V. Kvasnicka, and J. Pospichal, "Introduction to multi-layer feed-forward neural networks," Chemometrics and Intelligent Laboratory Systems, vol. 39, no. 1, pp. 43–62, 1997.
[25] Irish Social Science Data Archive, http://www.ucd.ie/issda/data/commissionforenergyregulationcer.
[26] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[27] Y. Wang, Q. Chen, D. Gan, J. Yang, D. S. Kirschen, and C. Kang, "Deep learning-based socio-demographic information identification from smart meter data," IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2593–2602, 2019.
[28] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: a survey," ACM Computing Surveys, vol. 41, no. 3, 2009.
[29] H.-T. Cheng, L. Koc, J. Harmsen et al., "Wide & deep learning for recommender systems," in Proceedings of the 1st Workshop on Deep Learning for Recommender Systems—DLRS 2016, pp. 7–10, Boston, MA, USA, September 2016.
[30] E. Feczko, N. M. Balba, O. Miranda-Dominguez et al., "Subtyping cognitive profiles in autism spectrum disorder using a functional random forest algorithm," NeuroImage, vol. 172, pp. 674–688, 2018.
[31] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A theoretical analysis of feature pooling in visual recognition," in Proceedings of the 27th International Conference on Machine Learning, pp. 111–118, Haifa, Israel, June 2010.
[32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
[33] M. Abadi, A. Agarwal, P. Barham et al., "TensorFlow: large-scale machine learning on heterogeneous distributed systems," 2016, https://arxiv.org/abs/1603.04467.
[34] F. Pedregosa, G. Varoquaux, A. Gramfort et al., "Scikit-learn: machine learning in Python," Journal of Machine Learning Research, vol. 12, no. 4, pp. 2825–2830, 2011.
[35] J. R. Schofield, "Low Carbon London project: data from the dynamic time-of-use electricity pricing trial," 2017, https://discover.ukdataservice.ac.uk/catalogue/?sn=7857.
As detecting electricity theft consumers is achieved bydiscovering the anomalous consumption behaviours ofsuspicious customers we can also regard it as an anomalydetection problem essentially Since CNN-RF has shownsuperior performance on the SEAI dataset it would beinteresting to test its generalization ability on other datasetsIn particular we have further tested the performance ofCNN-RF together with the above-mentioned models onanother dataset Low Carbon London (LCL) dataset [35]-e LCL dataset is composed of over 5500 honest electricityconsumers in total each of which consists of 525 days (the
sampling rate is hour) Among the 5500 customers werandomly chose 1200 as malicious customers and modifiedtheir intraday load profiles according to the fraudulentsample generation methods similar to the SEAI dataset
After the preprocess of the LCL dataset (similar to theSEAI dataset) the train dataset contains 7584 customers inwhich the number of normal and abnormal samples is equalAnd the test dataset consists of 1700 customers together withcorresponding labels indicating anomaly or not -en thetraining of the proposed CNN-RF model and comparativemodels is all done following the aforementioned procedure
CNN
CNN
-RF
CNN
-GBD
T
CNN
-SV
M
GBD
T LR RF
SVM
Different algorithms
0
01
02
03
04
05
06
07
08
09
1
Perfo
rman
ce m
etric
PrecisionRecallF1 score
Figure 11 Comparison with other algorithms based on the SEAIdataset
0 01 02 03 04 05 06 07 08 09 1FPR
0
01
02
03
04
05
06
07
08
09
1
TPR
AUC (CNN) = 093AUC (CNN-RF) = 099AUC (CNN-GBDT) = 097AUC (CNN-SVM) = 098
AUC (GBDT) = 077AUC (LR) = 063AUC (RF) = 091AUC (SVM) = 077
Figure 10 ROC curves for eight different algorithms based on theSEAI dataset
TPR
FPR
AUC (CNN) = 095AUC (CNN-RF) = 097AUC (CNN-GBDT) = 094AUC (CNN-SVM) = 096
AUC (GBDT) = 074AUC (LR) = 068AUC (RF) = 077AUC (SVM) = 071
0
01
02
03
04
05
06
07
08
09
1
0 01 02 03 04 05 06 07 08 09 1
Figure 12 ROC curves for eight different algorithms based on theLCL dataset
0
01
02
03
04
05
06
07
08
09
1
Perfo
rman
ce m
etric
CNN
CNN
-RF
CNN
-GBD
T
CNN
-SV
M
GBD
T
SVM
Different algorithms
LR RF
PrecisionRecallF1 score
Figure 13 Comparison with other algorithms based on the LCLdataset
10 Journal of Electrical and Computer Engineering
based on the train dataset -e performances of all themodels on the test dataset are shown in Figures 12 and 13
Figures 12 and 13 show that the performances of CNN-SVM CNN-GBDT and CNN-RF are better than those ofother models on the whole Since they are deep learningmodels the distinguishable features can be learned directlyfrom raw smart meter data by the CNN which has equippedthem with better generalization ability In addition theCNN-RF model has achieved the best performance since theRF can handle high-dimensional data while maintaininghigh computational efficiency
7 Conclusions
In this paper a novel CNN-RF model is presented to detectelectricity theft In this model the CNN is similar to anautomatic feature extractor in investigating smart meter dataand the RF is the output classifier Because a large number ofparameters must be optimized that increase the risk ofoverfitting a fully connected layer with a dropout rate of 04is designed during the training phase In addition the SMOTalgorithm is adopted to overcome the problem of dataimbalance Some machine learning and deep learningmethods such as SVM RF GBDT and LR are applied to thesame problem as a benchmark and all those methods havebeen conducted on SEAI and LCL datasets -e resultsindicate that the proposed CNN-RF model is quite apromising classification method in the electricity theft de-tection field because of two properties -e first is thatfeatures can be automatically extracted by the hybrid modelwhile the success of most other traditional classifiers relieslargely on the retrieval of good hand-designed featureswhich is a laborious and time-consuming task -e secondlies in that the hybrid model combines the advantages of theRF and CNN as both are the most popular and successfulclassifiers in the electricity theft detection field
Since the detection of electricity theft affects the privacyof consumers the future work will focus on investigatinghow the granularity and duration of smart meter data mightaffect this privacy Extending the proposed hybrid CNN-RFmodel to other applications (eg load forecasting) is a taskworth investigating
Data Availability
Previously reported [SEAI][LCL] datasets were used tosupport this study and are available at [httpwwwucdieissdadata][httpsbetaukdataserviceacukdatacataloguestudiesstudyid=7857] -ese prior studies (and datasets)are cited at relevant places within the text as references[19 26]
Conflicts of Interest
-e authors declare that there are no conflicts of interest
Acknowledgments
-is work was supported by the National Key Research andDevelopment Program of China (2016YFB0901900) the
Natural Science Foundation of Hebei Province of China(F2017501107) Open Research Fund from the State KeyLaboratory of Rolling and Automation Northeastern Uni-versity (2017RALKFKT003) the Foundation of Northeast-ern University at Qinhuangdao (XNB201803) and theFundamental Research Funds for the Central Universities(N182303037)
References
[1] S S S R Depuru L Wang and V Devabhaktuni ldquoElectricitytheft overview issues prevention and a smart meter basedapproach to control theftrdquo Energy Policy vol 39 no 2pp 1007ndash1015 2011
[2] J P Navani N K Sharma and S Sapra ldquoTechnical and non-technical losses in power system and its economic conse-quence in Indian economyrdquo International Journal of Elec-tronics and Computer Science Engineering vol 1 no 2pp 757ndash761 2012
[3] S McLaughlin B Holbert A Fawaz R Berthier andS Zonouz ldquoA multi-sensor energy theft detection frameworkfor advanced metering infrastructuresrdquo IEEE Journal on Se-lected Areas in Communications vol 31 no 7 pp 1319ndash13302013
[4] P McDaniel and S McLaughlin ldquoSecurity and privacychallenges in the smart gridrdquo IEEE Security amp PrivacyMagazine vol 7 no 3 pp 75ndash77 2009
[5] T B Smith ldquoElectricity theft a comparative analysisrdquo EnergyPolicy vol 32 no 1 pp 2067ndash2076 2004
[6] J I Guerrero C Leon I Monedero F Biscarri andJ Biscarri ldquoImproving knowledge-based systems with sta-tistical techniques text mining and neural networks for non-technical loss detectionrdquo Knowledge-Based Systems vol 71no 4 pp 376ndash388 2014
[7] C C O Ramos A N Souza G Chiachia A X Falcatildeo andJ P Papa ldquoA novel algorithm for feature selection usingharmony search and its application for non-technical lossesdetectionrdquo Computers amp Electrical Engineering vol 37 no 6pp 886ndash894 2011
[8] P Glauner J A Meira P Valtchev R State and F Bettingerldquo-e challenge of non-technical loss detection using artificialintelligence a surveyficial intelligence a surveyrdquo In-ternational Journal of Computational Intelligence Systemsvol 10 no 1 pp 760ndash775 2017
[9] S-C Huang Y-L Lo and C-N Lu ldquoNon-technical lossdetection using state estimation and analysis of variancerdquoIEEE Transactions on Power Systems vol 28 no 3pp 2959ndash2966 2013
[10] O Rahmati H R Pourghasemi and A M Melesse ldquoAp-plication of GIS-based data driven random forest and max-imum entropy models for groundwater potential mapping acase study at Mehran region Iranrdquo CATENA vol 137pp 360ndash372 2016
[11] N Edison A C Aranha and J Coelho ldquoProbabilisticmethodology for technical and non-technical losses estima-tion in distribution systemrdquo Electric Power Systems Researchvol 97 no 11 pp 93ndash99 2013
[12] J B Leite and J R S Mantovani ldquoDetecting and locating non-technical losses in modern distribution networksrdquo IEEETransactions on Smart Grid vol 9 no 2 pp 1023ndash1032 2018
[13] S Amin G A Schwartz A A Cardenas and S S SastryldquoGame-theoretic models of electricity theft detection in smartutility networks providing new capabilities with advanced
Journal of Electrical and Computer Engineering 11
metering infrastructurerdquo IEEE Control Systems Magazinevol 35 no 1 pp 66ndash81 2015
[14] A H Nizar Z Y Dong Y Wang and A N Souza ldquoPowerutility nontechnical loss analysis with extreme learning ma-chine methodrdquo IEEE Transactions on Power Systems vol 23no 3 pp 946ndash955 2008
[15] J Nagi K S Yap S K Tiong S K Ahmed andMMohamadldquoNontechnical loss detection for metered customers in powerutility using support vector machinesrdquo IEEE Transactions onPower Delivery vol 25 no 2 pp 1162ndash1171 2010
[16] E W S Angelos O R Saavedra O A C Cortes andA N de Souza ldquoDetection and identification of abnormalitiesin customer consumptions in power distribution systemsrdquoIEEE Transactions on Power Delivery vol 26 no 4pp 2436ndash2442 2011
[17] K A P Costa L A M Pereira R Y M NakamuraC R Pereira J P Papa and A Xavier Falcatildeo ldquoA nature-inspired approach to speed up optimum-path forest clusteringand its application to intrusion detection in computer net-worksrdquo Information Sciences vol 294 pp 95ndash108 2015
[18] R R Bhat R D Trevizan X Li and A Bretas ldquoIdentifyingnontechnical power loss via spatial and temporal deeplearningrdquo in Proceedings of the IEEE International Conferenceon Machine Learning and Applications Anaheim CA USADecember 2016
[19] M Ismail M Shahin M F Shaaban E Serpedin andK Qaraqe ldquoEfficient detection of electricity theft cyber attacksin ami networksrdquo in Proceedings of the IEEE Wireless Com-munications and Networking Conference Barcelona SpainApril 2018
[20] RMehrizi X Peng X Xu S Zhang DMetaxas and K Li ldquoAcomputer vision based method for 3D posture estimation ofsymmetrical liftingrdquo Journal of Biomechanics vol 69 no 1pp 40ndash46 2018
[21] K He X Zhang S Ren and J Sun ldquoDelving deep intorectifiers surpassing human-level performance on imagenetclassificationrdquo in Proceedings of the IEEE Conference onComputer for Sustainable Global Development pp 1026ndash1034New Delhi India March 2015
[22] A Sharif Razavian H Azizpour J Sullivan and S CarlssonldquoCNN features off-the-shelf an astounding baseline for rec-ognitionrdquo Astrophysical Journal Supplement Series vol 221no 1 pp 806ndash813 2014
[23] Z Zheng Y Yang X Niu H-N Dai and Y Zhou ldquoWide anddeep convolutional neural networks for electricity-theft de-tection to secure smart gridsrdquo IEEE Transactions on IndustrialInformatics vol 14 no 4 pp 1606ndash1615 2018
[24] D Svozil V Kvasnicka and J Pospichal ldquoIntroduction tomulti-layer feed-forward neural networksrdquo Chemometricsand Intelligent Laboratory Systems vol 39 no 1 pp 43ndash621997
[25] Irish social science data archive httpwwwucdieissdadatacommissionforenegryregulationcer
[26] P Jokar N Arianpoo and V C M Leung ldquoElectricity theftdetection in AMI using customersrsquo consumption patternsrdquoIEEE Transactions on Smart Grid vol 7 no 1 pp 216ndash2262016
[27] Y Wang Q Chen D Gan J Yang D S Kirschen andC Kang ldquoDeep learning-based socio-demographic in-formation identification from smart meter datafication fromsmart meter datardquo IEEE Transactions on Smart Grid vol 10no 3 pp 2593ndash2602 2019
[28] V Chandola A Banerjee and V Kumar ldquoAnomaly de-tection a surveyrdquo ACM Computing Surveys vol 41 no 32009
[29] H-T Cheng L Koc J Harmsen et al ldquoWide amp deep learningfor recommender systemsrdquo in Proceedings of the 1stWorkshopon Deep Learning for Recommender SystemsmdashDLRS 2016pp 7ndash10 Boston MA USA September 2016
[30] E Feczko N M Balba O Miranda-Dominguez et alldquoSubtyping cognitive profiles in autism spectrum disorderusing a functional random forest algorithmfiles in autismspectrum disorder using a functional random forest algo-rithmrdquo NeuroImage vol 172 pp 674ndash688 2018
[31] Y-L Boureau J Ponce and Y LeCun ldquoA theoretical analysisof feature pooling in visual recognitionrdquo in Proceedings of the27th International Conference on Machine Learningpp 111ndash118 Haifa Israel June 2010
[32] N Srivastava G Hinton A Krizhevsky I Sutskever andR Salakhutdinov ldquoDropout a simply way to prevent neuralnetworks from overfittingrdquo Ce Journal of Machine LearningResearch vol 15 no 1 pp 1929ndash1958 2014
[33] M Abadi A Agarwal P Barham et al ldquoTensorflow Large-scale machine learning on heterogeneous distributed sys-temsrdquo 2016 httpsarxivorgabs16030446
[34] F Pedregosa G Varoquaux A Gramfort et al ldquoScikit-learnmachine learning in pythonrdquo Journal of Machine LearningResearch vol 12 no 4 pp 2825ndash2830 2011
[35] J R Schofield ldquoLow carbon london project data from thedynamic time-of-use electricity pricing trialrdquo 2017 httpsdiscoverukdataserviceacukcataloguesn7857
12 Journal of Electrical and Computer Engineering
International Journal of
AerospaceEngineeringHindawiwwwhindawicom Volume 2018
RoboticsJournal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Active and Passive Electronic Components
VLSI Design
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Shock and Vibration
Hindawiwwwhindawicom Volume 2018
Civil EngineeringAdvances in
Acoustics and VibrationAdvances in
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Electrical and Computer Engineering
Journal of
Advances inOptoElectronics
Hindawiwwwhindawicom
Volume 2018
Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom
The Scientific World Journal
Volume 2018
Control Scienceand Engineering
Journal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom
Journal ofEngineeringVolume 2018
SensorsJournal of
Hindawiwwwhindawicom Volume 2018
International Journal of
RotatingMachinery
Hindawiwwwhindawicom Volume 2018
Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Chemical EngineeringInternational Journal of Antennas and
Propagation
International Journal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Navigation and Observation
International Journal of
Hindawi
wwwhindawicom Volume 2018
Advances in
Multimedia
Submit your manuscripts atwwwhindawicom
based on the train dataset -e performances of all themodels on the test dataset are shown in Figures 12 and 13
Figures 12 and 13 show that the performances of CNN-SVM CNN-GBDT and CNN-RF are better than those ofother models on the whole Since they are deep learningmodels the distinguishable features can be learned directlyfrom raw smart meter data by the CNN which has equippedthem with better generalization ability In addition theCNN-RF model has achieved the best performance since theRF can handle high-dimensional data while maintaininghigh computational efficiency
7 Conclusions
In this paper a novel CNN-RF model is presented to detectelectricity theft In this model the CNN is similar to anautomatic feature extractor in investigating smart meter dataand the RF is the output classifier Because a large number ofparameters must be optimized that increase the risk ofoverfitting a fully connected layer with a dropout rate of 04is designed during the training phase In addition the SMOTalgorithm is adopted to overcome the problem of dataimbalance Some machine learning and deep learningmethods such as SVM RF GBDT and LR are applied to thesame problem as a benchmark and all those methods havebeen conducted on SEAI and LCL datasets -e resultsindicate that the proposed CNN-RF model is quite apromising classification method in the electricity theft de-tection field because of two properties -e first is thatfeatures can be automatically extracted by the hybrid modelwhile the success of most other traditional classifiers relieslargely on the retrieval of good hand-designed featureswhich is a laborious and time-consuming task -e secondlies in that the hybrid model combines the advantages of theRF and CNN as both are the most popular and successfulclassifiers in the electricity theft detection field
Since the detection of electricity theft affects the privacyof consumers the future work will focus on investigatinghow the granularity and duration of smart meter data mightaffect this privacy Extending the proposed hybrid CNN-RFmodel to other applications (eg load forecasting) is a taskworth investigating
Data Availability
Previously reported [SEAI][LCL] datasets were used tosupport this study and are available at [httpwwwucdieissdadata][httpsbetaukdataserviceacukdatacataloguestudiesstudyid=7857] -ese prior studies (and datasets)are cited at relevant places within the text as references[19 26]
Conflicts of Interest
-e authors declare that there are no conflicts of interest
Acknowledgments
-is work was supported by the National Key Research andDevelopment Program of China (2016YFB0901900) the
Natural Science Foundation of Hebei Province of China(F2017501107) Open Research Fund from the State KeyLaboratory of Rolling and Automation Northeastern Uni-versity (2017RALKFKT003) the Foundation of Northeast-ern University at Qinhuangdao (XNB201803) and theFundamental Research Funds for the Central Universities(N182303037)
References
[1] S. S. S. R. Depuru, L. Wang, and V. Devabhaktuni, "Electricity theft: overview, issues, prevention and a smart meter based approach to control theft," Energy Policy, vol. 39, no. 2, pp. 1007–1015, 2011.

[2] J. P. Navani, N. K. Sharma, and S. Sapra, "Technical and non-technical losses in power system and its economic consequence in Indian economy," International Journal of Electronics and Computer Science Engineering, vol. 1, no. 2, pp. 757–761, 2012.

[3] S. McLaughlin, B. Holbert, A. Fawaz, R. Berthier, and S. Zonouz, "A multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[4] P. McDaniel and S. McLaughlin, "Security and privacy challenges in the smart grid," IEEE Security & Privacy Magazine, vol. 7, no. 3, pp. 75–77, 2009.

[5] T. B. Smith, "Electricity theft: a comparative analysis," Energy Policy, vol. 32, no. 1, pp. 2067–2076, 2004.

[6] J. I. Guerrero, C. Leon, I. Monedero, F. Biscarri, and J. Biscarri, "Improving knowledge-based systems with statistical techniques, text mining, and neural networks for non-technical loss detection," Knowledge-Based Systems, vol. 71, no. 4, pp. 376–388, 2014.

[7] C. C. O. Ramos, A. N. Souza, G. Chiachia, A. X. Falcão, and J. P. Papa, "A novel algorithm for feature selection using harmony search and its application for non-technical losses detection," Computers & Electrical Engineering, vol. 37, no. 6, pp. 886–894, 2011.

[8] P. Glauner, J. A. Meira, P. Valtchev, R. State, and F. Bettinger, "The challenge of non-technical loss detection using artificial intelligence: a survey," International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 760–775, 2017.

[9] S.-C. Huang, Y.-L. Lo, and C.-N. Lu, "Non-technical loss detection using state estimation and analysis of variance," IEEE Transactions on Power Systems, vol. 28, no. 3, pp. 2959–2966, 2013.

[10] O. Rahmati, H. R. Pourghasemi, and A. M. Melesse, "Application of GIS-based data driven random forest and maximum entropy models for groundwater potential mapping: a case study at Mehran region, Iran," CATENA, vol. 137, pp. 360–372, 2016.

[11] N. Edison, A. C. Aranha, and J. Coelho, "Probabilistic methodology for technical and non-technical losses estimation in distribution system," Electric Power Systems Research, vol. 97, no. 11, pp. 93–99, 2013.

[12] J. B. Leite and J. R. S. Mantovani, "Detecting and locating non-technical losses in modern distribution networks," IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 1023–1032, 2018.
[13] S. Amin, G. A. Schwartz, A. A. Cardenas, and S. S. Sastry, "Game-theoretic models of electricity theft detection in smart utility networks: providing new capabilities with advanced metering infrastructure," IEEE Control Systems Magazine, vol. 35, no. 1, pp. 66–81, 2015.
[14] A. H. Nizar, Z. Y. Dong, Y. Wang, and A. N. Souza, "Power utility nontechnical loss analysis with extreme learning machine method," IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 946–955, 2008.

[15] J. Nagi, K. S. Yap, S. K. Tiong, S. K. Ahmed, and M. Mohamad, "Nontechnical loss detection for metered customers in power utility using support vector machines," IEEE Transactions on Power Delivery, vol. 25, no. 2, pp. 1162–1171, 2010.

[16] E. W. S. Angelos, O. R. Saavedra, O. A. C. Cortes, and A. N. de Souza, "Detection and identification of abnormalities in customer consumptions in power distribution systems," IEEE Transactions on Power Delivery, vol. 26, no. 4, pp. 2436–2442, 2011.

[17] K. A. P. Costa, L. A. M. Pereira, R. Y. M. Nakamura, C. R. Pereira, J. P. Papa, and A. Xavier Falcão, "A nature-inspired approach to speed up optimum-path forest clustering and its application to intrusion detection in computer networks," Information Sciences, vol. 294, pp. 95–108, 2015.

[18] R. R. Bhat, R. D. Trevizan, X. Li, and A. Bretas, "Identifying nontechnical power loss via spatial and temporal deep learning," in Proceedings of the IEEE International Conference on Machine Learning and Applications, Anaheim, CA, USA, December 2016.

[19] M. Ismail, M. Shahin, M. F. Shaaban, E. Serpedin, and K. Qaraqe, "Efficient detection of electricity theft cyber attacks in AMI networks," in Proceedings of the IEEE Wireless Communications and Networking Conference, Barcelona, Spain, April 2018.

[20] R. Mehrizi, X. Peng, X. Xu, S. Zhang, D. Metaxas, and K. Li, "A computer vision based method for 3D posture estimation of symmetrical lifting," Journal of Biomechanics, vol. 69, no. 1, pp. 40–46, 2018.

[21] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: surpassing human-level performance on ImageNet classification," in Proceedings of the IEEE Conference on Computer for Sustainable Global Development, pp. 1026–1034, New Delhi, India, March 2015.

[22] A. Sharif Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, "CNN features off-the-shelf: an astounding baseline for recognition," Astrophysical Journal Supplement Series, vol. 221, no. 1, pp. 806–813, 2014.

[23] Z. Zheng, Y. Yang, X. Niu, H.-N. Dai, and Y. Zhou, "Wide and deep convolutional neural networks for electricity-theft detection to secure smart grids," IEEE Transactions on Industrial Informatics, vol. 14, no. 4, pp. 1606–1615, 2018.

[24] D. Svozil, V. Kvasnicka, and J. Pospichal, "Introduction to multi-layer feed-forward neural networks," Chemometrics and Intelligent Laboratory Systems, vol. 39, no. 1, pp. 43–62, 1997.

[25] Irish Social Science Data Archive, http://www.ucd.ie/issda/data/commissionforenergyregulationcer.

[26] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[27] Y. Wang, Q. Chen, D. Gan, J. Yang, D. S. Kirschen, and C. Kang, "Deep learning-based socio-demographic information identification from smart meter data," IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2593–2602, 2019.

[28] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: a survey," ACM Computing Surveys, vol. 41, no. 3, 2009.

[29] H.-T. Cheng, L. Koc, J. Harmsen et al., "Wide & deep learning for recommender systems," in Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (DLRS 2016), pp. 7–10, Boston, MA, USA, September 2016.

[30] E. Feczko, N. M. Balba, O. Miranda-Dominguez et al., "Subtyping cognitive profiles in autism spectrum disorder using a functional random forest algorithm," NeuroImage, vol. 172, pp. 674–688, 2018.

[31] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A theoretical analysis of feature pooling in visual recognition," in Proceedings of the 27th International Conference on Machine Learning, pp. 111–118, Haifa, Israel, June 2010.

[32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.

[33] M. Abadi, A. Agarwal, P. Barham et al., "TensorFlow: large-scale machine learning on heterogeneous distributed systems," 2016, https://arxiv.org/abs/1603.04467.

[34] F. Pedregosa, G. Varoquaux, A. Gramfort et al., "Scikit-learn: machine learning in Python," Journal of Machine Learning Research, vol. 12, no. 4, pp. 2825–2830, 2011.

[35] J. R. Schofield, "Low Carbon London project: data from the dynamic time-of-use electricity pricing trial," 2017, https://discover.ukdataservice.ac.uk/catalogue?sn=7857.
12 Journal of Electrical and Computer Engineering