Data Mining techniques application in Power Distribution Utilities
Authors: Sérgio Ramos, Zita A. Vale
GECAD – Knowledge Engineering and Decision Support Research Group
Engineering Institute of Porto – Polytechnic Institute of Porto
IEEE PES TD 2008, Chicago, IL, USA, 22nd April 2008
INTRODUCTION
Content of Presentation:
• Electricity Market Liberalization Environment
• MV Customers Characterization – An Overview
• Data Mining Techniques Application
• Clustering and Consumers Characterization
  - Case Study
• Future Work
ELECTRICITY MARKET LIBERALIZATION
• Total freedom in choosing the electricity supplier
• Consumers and suppliers are exposed to price risk
• Distribution and retail companies are looking for better tariff rates
• Competitive environment among retail companies selling electricity
ELECTRICITY MARKET LIBERALIZATION
• Increased demand elasticity due to electricity price volatility
• Electricity customers are more concerned about their consumption behaviour
• Knowledge about customers' daily load profiles is essential for leadership in this new context
• Deeper relationship between customer and electricity supplier
[Diagram: Electricity Supplier ↔ Consumer]
ELECTRICITY MARKET LIBERALIZATION
CONSUMERS' CHARACTERIZATION
• Advantages:
  - Design of new tariffs, contracts, products and services
  - Creation of incentives for energy efficiency
MV CUSTOMERS CHARACTERIZATION
DETERMINATION AND CHARACTERIZATION OF MV CONSUMERS' LOAD PROFILES USING DATA MINING TECHNIQUES
[Flowchart: Data Base → Data Pre-processing → Formatted Data → Data Mining Techniques → New Tariff Schedules and Relationship Consumer/Electricity Supplier]
Data Mining Techniques:
- Clustering Algorithms (Two-Step, K-Means, SOM) → Clusters → Typical Load Profile
- Classification Model (C5.0 classification algorithm, shape indices) → Rule Sets, Decision Tree, Overall Accuracy
LOAD STUDY
Data description:
• Sample of 229 MV consumers
• Data collection period: 3 months in Summer / Winter (data from the Portuguese Distribution Company)
• Consumed power recorded every 15 minutes
• 96 values obtained per day
$l^{(m)} = \{ l_1^{(m)}, \ldots, l_{96}^{(m)} \}$, where $m$ indexes the customers
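As a minimal sketch of the data layout described above, the following Python splits a flat series of 15-minute readings into daily vectors of 96 values. The function name `split_into_days` and the choice to reject incomplete days are assumptions made for illustration; the slides do not say how partial days were handled.

```python
SAMPLES_PER_DAY = 24 * 60 // 15  # 96 readings per day at a 15-minute cadence


def split_into_days(readings):
    """Split a flat list of 15-minute readings into daily vectors of 96 values.

    Assumption: the series starts at the first reading of a day and holds
    only whole days; anything else raises ValueError.
    """
    if len(readings) % SAMPLES_PER_DAY != 0:
        raise ValueError("series does not contain a whole number of days")
    return [
        readings[start:start + SAMPLES_PER_DAY]
        for start in range(0, len(readings), SAMPLES_PER_DAY)
    ]
```

Each returned vector corresponds to one daily load diagram for a single customer.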
DATA PREPARATION
Data cleaning:
• 21 customers' files were discarded:
  - Some damaged files were detected
  - Customers without registered values
• 208 customers remained to be analyzed
• A multilayer perceptron (MLP) artificial neural network was used to estimate missing measurement values
• Errors in the metered load curves are attenuated without significantly altering the real measurements
DATA PREPARATION
Data pre-processing:
• Power consumption was normalized to the [0, 1] range
• A representative load diagram was built for each customer by averaging that customer's load diagrams
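The two pre-processing steps can be sketched in a few lines of Python. The slides give the target range but not the scaling formula, so dividing by the customer's peak value is an assumption here (min-max scaling would be another option), and both function names are hypothetical.

```python
def normalize_by_peak(days):
    """Scale all readings of one customer into [0, 1] by the customer's peak.

    Assumption: readings are non-negative and the reference value is the
    maximum reading over the whole period.
    """
    peak = max(max(day) for day in days)
    if peak <= 0:
        raise ValueError("peak power must be positive")
    return [[value / peak for value in day] for day in days]


def representative_diagram(days):
    """Average the daily load diagrams point by point into one diagram."""
    n_days = len(days)
    return [sum(samples) / n_days for samples in zip(*days)]
```

For example, `representative_diagram(normalize_by_peak(days))` would give one per-unit diagram per customer; running it separately on working days and weekend days yields the two diagram types used later.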
REPRESENTATIVE DIAGRAMS:
[Figure: the treated data yield two representative diagrams, WEEK DAY - YEAR and WEEKEND DAY - YEAR; x-axis: Time (h), 1 to 24; y-axis: Power (p.u.), 0 to 1]
CLUSTERING PROCESS
[Flowchart: REPRESENTATIVE LOAD DIAGRAMS and NUMBER OF CLUSTERS → TWO-STEP, K-MEANS, SOM → CLUSTERS → Clustering Performance Comparison → Choice of the best Algorithm]
CLUSTERING PROCESS
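Choice of the Clustering Algorithm:

Mean Index Adequacy (MIA):

$$MIA = \sqrt{\frac{1}{K}\sum_{k=1}^{K} d^2\left(r^{(k)}, L^{(k)}\right)}$$

Cluster Dispersion Index (CDI):

$$CDI = \frac{\sqrt{\dfrac{1}{K}\sum_{k=1}^{K}\left[\dfrac{1}{2\,n^{(k)}}\sum_{n=1}^{n^{(k)}} d^2\left(l^{(m)}, L^{(k)}\right)\right]}}{\sqrt{\dfrac{1}{2K}\sum_{k=1}^{K} d^2\left(r^{(k)}, R\right)}}$$

The inner sum runs over the $n^{(k)}$ load diagrams $l^{(m)}$ assigned to cluster $k$.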
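A minimal Python sketch of the two indices follows. The slides do not define the distance functions, so the sketch assumes the definitions commonly used in load-profiling work: the distance between two diagrams is the root mean squared difference of their points, and the squared distance between a diagram and a set is the mean of the squared distances to its members. It also assumes that $r^{(k)}$ is the centroid of cluster $k$, $L^{(k)}$ the set of diagrams in that cluster, and $R$ the set of centroids. All function names are hypothetical.

```python
import math


def dist2(a, b):
    """Squared distance between two diagrams (mean squared difference)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dist2_to_set(vector, group):
    """Squared distance between a diagram and a set of diagrams."""
    return sum(dist2(vector, member) for member in group) / len(group)


def centroid(group):
    """Point-by-point mean of the diagrams in one cluster."""
    return [sum(samples) / len(group) for samples in zip(*group)]


def mia(clusters):
    """Mean Index Adequacy: RMS distance between centroids and their clusters."""
    centroids = [centroid(c) for c in clusters]
    total = sum(dist2_to_set(r, c) for r, c in zip(centroids, clusters))
    return math.sqrt(total / len(clusters))


def cdi(clusters):
    """Cluster Dispersion Index: intra-cluster spread over centroid spread.

    Needs at least two distinct centroids, otherwise the denominator is zero.
    """
    centroids = [centroid(c) for c in clusters]
    k = len(clusters)
    intra = sum(
        sum(dist2_to_set(l, c) for l in c) / (2 * len(c)) for c in clusters
    ) / k
    inter = sum(dist2_to_set(r, centroids) for r in centroids) / (2 * k)
    return math.sqrt(intra) / math.sqrt(inter)
```

Here `clusters` is a list of clusters, each a list of equal-length diagrams. Lower values of both indices indicate more compact, better separated clusters, which is how they serve to compare algorithms.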
CLUSTERING PROCESS
Comparison of the clustering performance:
- Two-Step Cluster Analysis
- K-means
- Kohonen Net – Self-Organizing Feature Maps (SOM)
[Figure: MIA and CDI values for the Two-Step, K-means and SOM algorithms]
[Figure: MIA and CDI as a function of the number of clusters (3, 6, 9, 12, 15)]
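Of the three algorithms compared, K-means is the simplest to sketch. The Python below is a plain textbook version and not the implementation used in the study. Seeding the centroids with the first `k` diagrams is an assumption made to keep the example deterministic, and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two diagrams."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(diagram, centroids):
    """Index of the centroid closest to the diagram."""
    distances = [squared_distance(diagram, c) for c in centroids]
    return distances.index(min(distances))


def kmeans(diagrams, k, max_iterations=100):
    """Plain K-means; returns (labels, centroids).

    Assumption: the first k diagrams seed the centroids. Real tools usually
    rely on random or more careful seeding.
    """
    centroids = [list(d) for d in diagrams[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each diagram to its nearest centroid.
        new_labels = [nearest(d, centroids) for d in diagrams]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [d for d, lab in zip(diagrams, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(s) / len(members) for s in zip(*members)]
    return labels, centroids
```

Running such a routine for several values of `k` and scoring each partition with MIA and CDI mirrors the comparison over 3 to 15 clusters shown in the figure.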
TWO-STEP CLUSTERING APPLICATION
• The clusters were obtained by applying the Two-Step cluster algorithm to the representative load diagrams
[Figure: nine panels showing the load diagrams grouped in each cluster, working days; x-axis: Time (h), 1 to 24; y-axis: Power (p.u.), 0 to 1]
REPRESENTATIVE LOAD DIAGRAMS
• Representative diagram for each cluster
[Figure: two panels, Work days and Weekend days, each showing the representative load diagram of every cluster; x-axis: Time (h), 1 to 24; y-axis: Power (p.u.), 0 to 1]
CLASSIFICATION MODEL
Objective:
• Build a classification model that, applied to new unclassified records, predicts the class to which each record belongs
• In the future, it will allow each new consumer to be assigned the consumption profile that best represents it
CLASSIFICATION MODEL
C5.0 Algorithm
• Decision Tree:
  - Application simplicity
  - Result in tree form
  - Generation of rules
Load shape indices
• Derived from the daily load diagrams
• Give information about:
  - The daily load curve shape
  - The consumption pattern of each consumer
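$$f_1 = \frac{P_{av,day}}{P_{max,day}}, \qquad f_2 = \frac{P_{min,day}}{P_{max,day}}, \qquad f_3 = \frac{P_{min,day}}{P_{av,day}}$$

$$f_4 = \frac{1}{3}\,\frac{P_{av,night}}{P_{av,day}}, \qquad f_5 = \frac{1}{8}\,\frac{P_{av,lunch}}{P_{av,day}}$$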
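The five indices can be computed directly from a representative diagram, as in the hedged Python sketch below. The slides do not state which hours count as night or lunch. The windows used here (eight hours from 23:00 and three hours from 12:00) are assumptions, chosen only because their lengths match the factors 1/3 = 8/24 and 1/8 = 3/24. The function name and the 24-value hourly input are also assumptions.

```python
NIGHT_HOURS = [23, 0, 1, 2, 3, 4, 5, 6]  # assumed 8-hour night window
LUNCH_HOURS = [12, 13, 14]               # assumed 3-hour lunch window


def mean(values):
    return sum(values) / len(values)


def load_shape_indices(hourly):
    """Return f1..f5 for a diagram of 24 hourly values (index 0 = 00:00).

    Assumption: the diagram has a positive peak and a positive average.
    """
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly values")
    p_av = mean(hourly)
    p_max = max(hourly)
    p_min = min(hourly)
    p_av_night = mean([hourly[h] for h in NIGHT_HOURS])
    p_av_lunch = mean([hourly[h] for h in LUNCH_HOURS])
    return {
        "f1": p_av / p_max,           # average over peak (load factor)
        "f2": p_min / p_max,          # minimum over peak
        "f3": p_min / p_av,           # minimum over average
        "f4": p_av_night / p_av / 3,  # night share of the daily average
        "f5": p_av_lunch / p_av / 8,  # lunch share of the daily average
    }
```

A perfectly flat diagram gives f1 = f2 = f3 = 1, f4 = 1/3 and f5 = 1/8, which is a quick sanity check on the scaling factors.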
CLASSIFICATION MODEL
Classification module framework:
[Flowchart: REPRESENTATIVE LOAD DIAGRAMS → LOAD SHAPE INDEXES → CLASSIFICATION MODEL → DECISION TREE and GENERATION OF RULES]
LOAD SHAPE INDEXES: each representative load curve is represented by a set of load shape indexes, $f = [f_1, f_2, f_3, f_4, f_5, f_6]$
CLASSIFICATION MODEL:
- C5.0 Classification Algorithm
- INPUT ATTRIBUTES: VECTOR {f}
- TRAINING SET
- TEST SET
- TEN-FOLD CROSS VALIDATION
- EVALUATION ACCURACY
- ANALYSIS OF THE CONFUSION MATRIX
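The evaluation steps named in the framework can be illustrated without any particular classifier. The Python sketch below shows one way to generate ten-fold splits, build a confusion matrix and derive the overall accuracy. Using contiguous, unshuffled folds is an assumption made for reproducibility, and all function names are hypothetical. The slides do not describe how the C5.0 tool partitions the data.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross validation.

    Assumption: contiguous folds without shuffling; the first
    n_samples % k folds receive one extra sample.
    """
    start = 0
    for fold in range(k):
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total
```

With the 208 customers retained after data cleaning, `k_fold_indices(208)` produces eight test folds of 21 customers and two of 20.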
CLASSIFICATION
• Rule set for the working days classification model:
if f3 ≤ 0.48 and f2 ≤ 0.13 and f5 ≤ 0.55 and f1 ≤ 0.35 and f4 ≤ 0.31 then cluster 8
if f3 ≤ 0.48 and f2 ≤ 0.13 and f5 ≤ 0.55 and f1 ≤ 0.35 and f4 > 0.31 then cluster 9
if f3 ≤ 0.48 and f2 ≤ 0.13 and f5 ≤ 0.55 and f1 > 0.35 then cluster 5
if f3 ≤ 0.48 and f2 ≤ 0.13 and f5 > 0.55 and f2 ≤ 0.06 then cluster 7
if f3 ≤ 0.48 and f2 ≤ 0.13 and f5 > 0.55 and f2 > 0.06 then cluster 6
if f3 ≤ 0.48 and f2 > 0.13 and f4 ≤ 0.24 then cluster 4
if f3 ≤ 0.48 and f2 ≤ 0.13 and f4 > 0.24 then cluster 5
if f3 > 0.48 and f3 ≤ 0.78 and f2 ≤ 0.44 then cluster 3
if f3 > 0.48 and f3 ≤ 0.78 and f2 > 0.44 then cluster 2
if f3 > 0.48 and f3 > 0.78 then cluster 1
• Overall accuracy: 94.83%
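The rule set translates directly into nested conditions, as in the Python sketch below. One point is an assumption: the seventh rule is printed with f2 ≤ 0.13, which overlaps the first five rules. The sketch treats it as f2 > 0.13, the complement of the cluster 4 rule, as a decision tree would produce. The function name is hypothetical.

```python
def classify_working_day(f1, f2, f3, f4, f5):
    """Return the cluster number (1 to 9) for a working-day index vector."""
    if f3 > 0.78:
        return 1
    if f3 > 0.48:
        return 2 if f2 > 0.44 else 3
    # From here on f3 <= 0.48.
    if f2 > 0.13:
        # Assumption: the printed cluster 5 rule (f2 <= 0.13 and f4 > 0.24)
        # is read as f2 > 0.13, mirroring the cluster 4 rule.
        return 4 if f4 <= 0.24 else 5
    # From here on f2 <= 0.13.
    if f5 > 0.55:
        return 7 if f2 <= 0.06 else 6
    if f1 > 0.35:
        return 5
    return 8 if f4 <= 0.31 else 9
```

A new consumer's representative working-day diagram would first be reduced to its shape indices and then passed to such a function to obtain its cluster.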
CONSUMER-SUPPLIER RELATIONSHIP
• Retail companies can use this knowledge to:
  - Identify peaks in the load diagrams
  - Develop consumer-specific contracts
  - Optimize their electric power purchase offers
CONSUMER-SUPPLIER RELATIONSHIP
• Electric power consumers can use this knowledge to:
  - Choose the electricity supplier with the best tariff schedule proposal
  - Adjust their electricity consumption habits
  - Support the execution of electric energy interruption contracts
FURTHER WORK
• Compare the efficiency of the C5.0 algorithm with that of other classification algorithms
• Design new price categories that adapt the tariff schedules to each cluster's consumption pattern
• Formulate new tariff schedules in coordination with electricity markets