© 2014 SAP SE or an SAP affiliate company. All rights reserved. Public
Improved Asset Data Quality
Agenda
Why is data quality key?
What is data quality?
How to improve?
How to execute data quality projects?
How to guarantee quality?
What is the business case for Alliander?
Why is data quality key?
We want data at our fingertips
External view / customers:
When is an outage over?
Where can I find out what they are doing in my street?
When are they coming to my home for the repairs?
Internal view / grid operator:
What does a field worker have to take along to repair an outage?
How to optimize workforce planning?
What are the main risk factors in the grid?
What is data quality?
How is data quality defined?
Data quality: completeness, correctness, consistency
How to improve it?
Which data?
Divide the subject into smaller pieces
[Diagram: asset data divided into Legal, Finance, Asset health, Safety & environment, Network & network management, Customer, Work planning, and Remainder]
What is important?
The 5-second rule:
Define the important objects in 5 seconds
Define the important attributes in 5 seconds
[Diagram: an Asset consists of objects (Object X, Object Y), each with Attribute 1, Attribute 2, ..., Attribute n]
What is the norm for completeness?
Define minimum levels for data completeness (norms)
Set the norm for each object
[Diagram: an Asset broken down into Objects A, B, C, X, and Y]
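A per-object completeness norm could be checked along the following lines. This is a minimal sketch, not the presenters' method: the slides give no formula, so completeness is assumed here to be the share of filled-in attribute values, and the object names, attribute names, and norm values are hypothetical.

```python
from typing import Optional

# Hypothetical norms: minimum share of filled-in values per object type.
COMPLETENESS_NORMS = {"Object A": 0.95, "Object B": 0.80}


def completeness(records: list[dict[str, Optional[str]]],
                 attributes: list[str]) -> float:
    """Share of attribute values that are filled in (neither None nor empty)."""
    total = len(records) * len(attributes)
    if total == 0:
        return 1.0  # nothing to fill in, so nothing is missing
    filled = sum(
        1
        for record in records
        for attribute in attributes
        if record.get(attribute) not in (None, "")
    )
    return filled / total


def meets_norm(object_type: str,
               records: list[dict[str, Optional[str]]],
               attributes: list[str]) -> bool:
    """Compare measured completeness with the norm set for this object type."""
    return completeness(records, attributes) >= COMPLETENESS_NORMS[object_type]


# Hypothetical records: one of four attribute values is missing.
sample = [
    {"amperage": "25 A", "phases": None},
    {"amperage": "40 A", "phases": "3"},
]
print(completeness(sample, ["amperage", "phases"]))            # 0.75
print(meets_norm("Object B", sample, ["amperage", "phases"]))  # False
```

Setting the norm per object type, rather than one global threshold, mirrors the slide's "Set the norm for each object".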
What is the norm for correctness?
[Diagram: ways to verify who "ME" is: introduction, passport, ID check, background check]
What is the norm for correctness?
Define the level of correctness for the data (norms)
Set the norm for each object
Correctness is validated in three ways: data mining, expert knowledge, and visual inspections.
Correctness levels:
Data is validated by 3 of 3
Data is validated by 2 of 3
Data is validated by 1 of 3
Regular process of new data
Data with unknown origin
Wrong data
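The correctness levels listed above could be assigned to a data record roughly as follows. This is an illustrative sketch only: the function name, the flags, and the order in which the rules are checked are assumptions, since the slides list the levels but not how a record is mapped onto them.

```python
# The three validation methods named on the slide.
VALIDATION_SOURCES = frozenset(
    {"data_mining", "expert_knowledge", "visual_inspection"}
)


def correctness_level(confirmed_by: set[str],
                      known_wrong: bool = False,
                      from_regular_process: bool = False) -> str:
    """Map validation evidence for one record to a correctness level."""
    if known_wrong:
        return "Wrong data"
    # Count only confirmations from the three recognized methods.
    confirmations = len(confirmed_by & VALIDATION_SOURCES)
    if confirmations > 0:
        return f"Data is validated by {confirmations} of 3"
    if from_regular_process:
        return "Regular process of new data"
    return "Data with unknown origin"


print(correctness_level({"data_mining", "expert_knowledge"}))
print(correctness_level(set(), from_regular_process=True))
print(correctness_level(set()))
```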
How to execute data quality projects?
How to solve data quality issues?
The most challenging task is to correct data quality issues
Extract data, identify issues: standard solution available
Correct issues: ???
Approve and change data: standard solution available
How to correct data quality issues?
A data mining-based approach to correcting data quality issues is powerful
Manual cleansing: Record by record, inspections. Effort: #data records × #attributes
Expert/rule-based: Use expert knowledge to derive rules. Effort: #attributes × #rules
Data mining-based: Automatically derive rules using statistics; review rules and results with experts. Effort: smaller than #attributes
[Diagram labels: Computing power, Effort, Intellectual property]
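To make the three effort formulas concrete, the sketch below evaluates them for a made-up data set. The record, attribute, and rule counts are hypothetical, and the unit of "effort" is left abstract because the slides do not define one. For the data mining-based approach the slides give only an upper bound, so the function returns that bound rather than an estimate.

```python
def manual_effort(n_records: int, n_attributes: int) -> int:
    """Manual cleansing: every attribute of every record is inspected."""
    return n_records * n_attributes


def rule_based_effort(n_attributes: int, n_rules: int) -> int:
    """Expert/rule-based cleansing: #attributes x #rules, as on the slide."""
    return n_attributes * n_rules


def data_mining_effort_bound(n_attributes: int) -> int:
    """Data mining-based cleansing: effort stays below this bound."""
    return n_attributes


# Hypothetical data set: 100,000 records, 6 attributes, 25 rules.
print(manual_effort(100_000, 6))    # 600000
print(rule_based_effort(6, 25))     # 150
print(data_mining_effort_bound(6))  # 6
```

Only the manual formula depends on the number of records, which is why it scales worst as the data set grows.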
How to correct data quality issues?
The most efficient approach is to combine manual and expert/rule-based cleansing with data mining
[Diagram: Data mining, Expert/Rules, Manual cleansing]
Business problem
The number of phases of connection points is partially missing
Red: One phase
Blue: Three phases
Gray rectangles: Phase information missing
[Map labels: 3 phases, 1 phase, ???]
Collect data about connection points in a fingerprint
EAN | Amperage | Auto fuse | Connection type | Decade constructed | Fraction of 1-phase neighbors | Number of phases
871685900009726493 | 25 A | No | Apartment | 10 | 100% | 1
871685900009730377 | 25 A | No | Apartment | 10 | 60% | 3
871685900100096846 | ??? | No | Apartment | 10 | 60% | ???
Analytic table
Data about many connections
EAN | Amperage | Auto fuse | Connection type | Decade constructed | Fraction of 1-phase neighbors | Number of phases
871685900009726493 | 25 A | No | Apartment | 10 | 100% | 1
871685900009730377 | 25 A | No | Apartment | 10 | 60% | 3
871685920000769466 | 40 A | No | Apartment | 10 | 28% | 1
871685900010940246 | 25 A | No | Apartment | 3 | 12% | 3
871685900100198441 | 35 A | No | Business & Ap. | 7 | 9% | 3
871685920000547606 | 40 A | Yes | Apartment | 10 | 0% | ???
871685900002665362 | ??? | ??? | Business | 8 | 55% | ???
871685900010837782 | 40 A | No | Other | 9 | 50% | ???
871685900100096846 | ??? | No | Apartment | 10 | 60% | ???
Analytic table
Split fingerprint into explaining fields and target field
Explaining fields: Amperage, Auto fuse, Connection type, Decade constructed, Fraction of 1-phase neighbors
Target field: Number of phases
EAN | Amperage | Auto fuse | Connection type | Decade constructed | Fraction of 1-phase neighbors | Number of phases
871685900009726493 | 25 A | No | Apartment | 10 | 100% | 1
871685900009730377 | 25 A | No | Apartment | 10 | 60% | 3
871685920000769466 | 40 A | No | Apartment | 10 | 28% | 1
871685900010940246 | 25 A | No | Apartment | 3 | 12% | 3
871685900100198441 | 35 A | No | Business & Ap. | 7 | 9% | 3
871685920000547606 | 40 A | Yes | Apartment | 10 | 0% | ???
871685900002665362 | ??? | ??? | Business | 8 | 55% | ???
871685900010837782 | 40 A | No | Other | 9 | 50% | ???
871685900100096846 | ??? | No | Apartment | 10 | 60% | ???
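The split into explaining fields and target field can be sketched in Python as follows. The column names, the encoding of "???" as `None`, and the `split` helper are choices made for this illustration and do not come from the slides; the row values are the ones shown in the analytic table.

```python
COLUMNS = ("ean", "amperage", "auto_fuse", "connection_type",
           "decade", "fraction_1phase_neighbors", "phases")
EXPLAINING_FIELDS = COLUMNS[1:-1]  # everything except EAN and the target
TARGET_FIELD = "phases"

# Rows of the analytic table; None stands for "???" (missing).
ROWS = [
    ("871685900009726493", 25, False, "Apartment", 10, 1.00, 1),
    ("871685900009730377", 25, False, "Apartment", 10, 0.60, 3),
    ("871685920000769466", 40, False, "Apartment", 10, 0.28, 1),
    ("871685900010940246", 25, False, "Apartment", 3, 0.12, 3),
    ("871685900100198441", 35, False, "Business & Ap.", 7, 0.09, 3),
    ("871685920000547606", 40, True, "Apartment", 10, 0.00, None),
    ("871685900002665362", None, None, "Business", 8, 0.55, None),
    ("871685900010837782", 40, False, "Other", 9, 0.50, None),
    ("871685900100096846", None, False, "Apartment", 10, 0.60, None),
]


def split(rows):
    """Separate rows with a known target (training) from rows to predict."""
    training, to_predict = [], []
    for row in rows:
        record = dict(zip(COLUMNS, row))
        features = {name: record[name] for name in EXPLAINING_FIELDS}
        if record[TARGET_FIELD] is None:
            to_predict.append((record["ean"], features))
        else:
            training.append((features, record[TARGET_FIELD]))
    return training, to_predict


training, to_predict = split(ROWS)
print(len(training), len(to_predict))  # 5 4
```

The EAN is kept out of the explaining fields because it is an identifier; it is carried along only so that predictions can later be matched back to connection points.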
Analytic table
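Statistical training of a function f from explaining fields to target field
Rows used for training (target field known):
Amperage | Auto fuse | Connection type | Decade constructed | Fraction of 1-phase neighbors | Number of phases
25 A | No | Apartment | 10 | 100% | 1
25 A | No | Apartment | 10 | 60% | 3
40 A | No | Apartment | 10 | 28% | 1
25 A | No | Apartment | 3 | 12% | 3
35 A | No | Business & Ap. | 7 | 9% | 3
Rows to predict (target field missing):
40 A | Yes | Apartment | 10 | 0% | ???
??? | ??? | Business | 8 | 55% | ???
40 A | No | Other | 9 | 50% | ???
??? | No | Apartment | 10 | 60% | ???
Explaining fields: Amperage through Fraction of 1-phase neighbors. Target field: Number of phases.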
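The slides do not say which statistical method is used to train f. As a stand-in, the sketch below uses a small k-nearest-neighbors classifier written in plain Python. The distance function, its scaling, the treatment of missing values as mismatches, and all names are assumptions of this illustration. With only five training rows, this toy model should not be expected to reproduce the predictions shown in the presentation.

```python
from collections import Counter

# Training rows from the analytic table: (explaining fields, number of phases).
TRAINING = [
    ({"amperage": 25, "auto_fuse": False, "connection_type": "Apartment",
      "decade": 10, "fraction": 1.00}, 1),
    ({"amperage": 25, "auto_fuse": False, "connection_type": "Apartment",
      "decade": 10, "fraction": 0.60}, 3),
    ({"amperage": 40, "auto_fuse": False, "connection_type": "Apartment",
      "decade": 10, "fraction": 0.28}, 1),
    ({"amperage": 25, "auto_fuse": False, "connection_type": "Apartment",
      "decade": 3, "fraction": 0.12}, 3),
    ({"amperage": 35, "auto_fuse": False, "connection_type": "Business & Ap.",
      "decade": 7, "fraction": 0.09}, 3),
]


def distance(a: dict, b: dict) -> float:
    """Ad hoc mixed distance (an assumption of this sketch)."""
    d = 0.0
    for key in ("amperage", "auto_fuse", "connection_type"):
        # Compared as categories; a missing value (None) counts as a mismatch.
        d += 0.0 if a[key] is not None and a[key] == b[key] else 1.0
    d += abs(a["decade"] - b["decade"]) / 10   # decades roughly span 0..10
    d += abs(a["fraction"] - b["fraction"])    # already between 0 and 1
    return d


def train(rows, k: int = 3):
    """Return f: explaining fields -> (predicted phases, share of votes)."""
    def f(record: dict):
        nearest = sorted(rows, key=lambda row: distance(record, row[0]))[:k]
        votes = Counter(label for _, label in nearest)
        label, count = votes.most_common(1)[0]
        return label, count / len(nearest)
    return f


f = train(TRAINING, k=3)
unknown = {"amperage": None, "auto_fuse": False,
           "connection_type": "Apartment", "decade": 10, "fraction": 0.60}
print(f(unknown))
```

The share of neighbor votes returned alongside the label is a crude confidence score; a production model would use a calibrated probability instead.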
Analytic table
Use trained function f to predict missing data
Rows used for training (target field known):
Amperage | Auto fuse | Connection type | Decade constructed | Fraction of 1-phase neighbors | Number of phases
25 A | No | Apartment | 10 | 100% | 1
25 A | No | Apartment | 10 | 60% | 3
40 A | No | Apartment | 10 | 28% | 1
25 A | No | Apartment | 3 | 12% | 3
35 A | No | Business & Ap. | 7 | 9% | 3
Rows with predicted target field:
40 A | Yes | Apartment | 10 | 0% | 3
??? | ??? | Business | 8 | 55% | 3
40 A | No | Other | 9 | 50% | 1
??? | No | Apartment | 10 | 60% | 3
The probability that a prediction is correct is also included. It can, for example, be used to prioritize review of data.
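Prioritizing review by prediction probability could look like the sketch below. The EANs and predicted phases are taken from the table, but the probability values are invented for illustration because the slides do not show them. The `Prediction` type and `review_queue` function are hypothetical names.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    ean: str
    predicted_phases: int
    probability: float  # estimated chance that the prediction is correct


def review_queue(predictions: list[Prediction]) -> list[Prediction]:
    """Order predictions so that the least certain ones are reviewed first."""
    return sorted(predictions, key=lambda p: p.probability)


# Probabilities below are made up for this example.
predictions = [
    Prediction("871685920000547606", 3, 0.97),
    Prediction("871685900002665362", 3, 0.62),
    Prediction("871685900010837782", 1, 0.81),
    Prediction("871685900100096846", 3, 0.90),
]

for p in review_queue(predictions):
    print(p.ean, p.predicted_phases, p.probability)
```

Experts then start at the top of the queue, where a wrong prediction is most likely, and can stop once the remaining predictions are certain enough for the business.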
Prediction of missing data
EAN | Amperage | Auto fuse | Connection type | Decade constructed | Fraction of 1-phase neighbors | Number of phases
871685900009726493 | 25 A | No | Apartment | 10 | 100% | 1
871685900009730377 | 25 A | No | Apartment | 10 | 60% | 3
871685900100096846 | ??? | No | Apartment | 10 | 60% | 3 (predicted)
What is the business case?
Lessons learned to improve the data quality
Prerequisite: Data ownership in the business
Advantages of the triangle approach (visual inspections, expert knowledge, data mining):
Fast learning curve due to efficient knowledge transfer between experts and data miners
Higher data quality because validity of data is examined in up to three different ways
Cheaper, more insight compared to the traditional way
Business cases for better data quality
[Chart axes: Data maturity, Business value]
Better data quality
More accurate asset condition
More accurate asset valuation
Better integration of IT systems
Risk reduction
Better transparency for customers about outages and repairs
Lower annual spend for asset investments by approx. 1% to 2%
Reduce investments in asset infrastructure and maintenance
Better planning of future workloads
Thank you