Why invest in Data Quality?
• Data IQ reports that 92% of companies suspect that their customer data is inaccurate
• Data is the lifeblood of an organisation, and quality data is crucial to the business
“By 2017, 33% of Fortune 100 organizations will experience an information crisis, due to their inability to effectively value, govern and trust their enterprise information.” Source: Gartner
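The 1-10-100 rule is a concept used to quantify the costs of quality issues. It is commonly stated as: $1 to verify a record at the point of entry, $10 to cleanse it later, and $100 if nothing is done.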
DATA QUALITY is not a project, it’s a Lifestyle.
Costs of Poor Address Data Quality
The diagram divides these into Visible Costs and Hidden Costs:
• Administrative Costs
• Labor Cost
• Deliveries Delayed
• Data Quality Inspection Costs
• Customer Services: Customer Dissatisfaction, Complaint Handling Costs, Customer Retention Costs
• Management: Poorly Informed Decision Making, Company Reputational Damage
• Data Analytics: Inaccurate Analysis
• Sales: Communication Costs, Inaccurate Sales Pipeline, Loss of Potential Revenue
• Finance: Overdue Receivables, Pricing or Billing Error
• Inventory Carrying Costs
• Transportation & Warehousing Costs
• Operations: Longer Order Processing Cycle, Lower Competitive Advantage
Data Quality Management Cycle
1. Assess data quality
2. Clean data
3. Improve business process
4. Monitor data quality
Common Data Challenges
1. Duplicated data
2. Incomplete or missing data
3. Outdated contact information
4. Dirty data
5. Inconsistent format from multiple data sources

File 1
Customer Name | Address | Postal Code
Fullmark Pte Ltd | 10 Soon Lee Rd, Fullmark Industrial Bldg, #03-02 | 628074
United Overseas Bank Limited | 80 Raffles Place UOB Plaza | 738205
AFPD Pte Ltd | 10 Tampines Industrial Ave 3 Singapore | 528758

File 2
Company Name | Addr1 | Addr2 | Addr3 | Zip Code
UOB | Attn: Mr. Lee Kok Keong | 451 Clementi Avenue 3 | Tel: 91208372/63491077 | S'pore 120451
Keppel Land Limited | Bugis Junction Towers #15-05 | 230 Victoria St. | Singapore | 188024
FMC Technologies P/L | 40 Scotts Rd, S'pore 228231 | Attn: Karen Chia (IT Dept) | Tel: +65 8739 1366 |
Data Standardisation
Company Level
• Company Name Format*
• Industry and Sub-Industry
• Employee Size
• Annual Revenue
• Phone Numbers: Mainline, Fax Number
• Address: Individual Segment, Full Address
Contact Level
• Job Title: Job Level, Department
• Email Domain: Valid Email, Corporate Domain
• Phone Numbers: Direct Line, Mobile
*standardised during de-duplication
Company Name Standardisation
Formatting Company Names (English and Non-English Company Type)
E.g. Indonesian companies have a ‘PT’ appended at the back of the name

Country | Company Name | Corrected Company Name
Indonesia | PT Excelcom (Axiata Group Berhad) TBK | Excelcom (Axiata Group Berhad) TBK, PT
Indonesia | PT TBK Excelcom (Axiata Group Berhad) | Excelcom (Axiata Group Berhad) TBK, PT
Indonesia | PT Nusa Network Prakarsa | Nusa Network Prakarsa, PT
Indonesia | Nusa Network Prakarsa PT. | Nusa Network Prakarsa, PT
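The Indonesian examples above can be reproduced with a small rule. This is a minimal sketch, not the deck's actual implementation: the function name `standardise_company_name` is hypothetical, and the rule (move any `PT` and `TBK` tokens to the end of the name, in the order `TBK, PT`) is inferred only from the four sample rows.

```python
def standardise_company_name(name: str) -> str:
    """Move Indonesian company-type tokens (PT, TBK) to the end of the name.

    Assumption: the target format is "<core name> TBK, PT", inferred from
    the sample rows; other company types are not handled.
    """
    has_pt = False
    has_tbk = False
    core = []
    for token in name.split():
        key = token.strip(".,").upper()  # "PT." and "pt" both count as PT
        if key == "PT":
            has_pt = True
        elif key == "TBK":
            has_tbk = True
        else:
            core.append(token)
    result = " ".join(core)
    if has_tbk:
        result += " TBK"
    if has_pt:
        result += ", PT"
    return result


print(standardise_company_name("PT TBK Excelcom (Axiata Group Berhad)"))
print(standardise_company_name("Nusa Network Prakarsa PT."))
```

Keeping the company type as a suffix means variants of the same company collapse to one string, which is what the later de-duplication step relies on.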
Address Field Standardisation
• Standardise and parse unstructured addresses into individual address segments.
• Combine the segmented addresses “House No. + Street + City/Country” to get precise GPS coordinates from a geocoding API.

Unstructured input
Address 1 | Address 2 | Address 3
The Plaza Office Tower 42nd Floor | Jl. MH. Thamrin Kav. 28-30 | Jakarta 10350
Jl. P. Bunaken A3, | Lantai 2, Kawasan Industri Medan 3 | Kota Medan, 20242, Indonesia

Parsed segments
House No. | Street | Floor | Building | City | Ward | Postal Code | Country
Kav. 28-30 | Jl. MH. Thamrin | 42nd Floor | The Plaza Office Tower | Jakarta | | 10350 |
| Jl. P. Bunaken A3 | Lantai 2 | | Medan | Kawasan Industri Medan 3 | 20242 | Indonesia

Geocoding result
Combined address | Latitude | Longitude
Jl. MH. Thamrin Kav. 28-30 Jakarta | -6.1969939 | 106.8204433
Jl. P. Bunaken A3 Medan Indonesia | 3.6760964 | 98.7025541
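The combination step above can be sketched as follows. The deck does not name the geocoding API, so this example only builds the query string that would be sent to one. The function name `build_geocode_query`, the dictionary keys, and the segment order (street, house number, city, country, matching the sample query strings) are assumptions for illustration.

```python
def build_geocode_query(segments: dict) -> str:
    """Join the parsed address segments into one geocoding query string.

    Assumption: only street, house number, city and country are used,
    and empty or missing segments are skipped.
    """
    order = ["street", "house_no", "city", "country"]
    return " ".join(segments[key] for key in order if segments.get(key))


jakarta = {"house_no": "Kav. 28-30", "street": "Jl. MH. Thamrin", "city": "Jakarta"}
medan = {"street": "Jl. P. Bunaken A3", "city": "Medan", "country": "Indonesia"}

print(build_geocode_query(jakarta))  # Jl. MH. Thamrin Kav. 28-30 Jakarta
print(build_geocode_query(medan))    # Jl. P. Bunaken A3 Medan Indonesia
```

Leaving out floor, building and ward keeps the query close to what a street-level geocoder can resolve.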
Data Cleansing Steps
Standardize data fields
• Company Details: Mainline and Fax, Address, Industry, Employee Size, Annual Revenue
Company Matching
• Standardized fields
• Matching priority fields
• Matching Confidence
Company Deduplication
• Merge company information: Fields Merging Based on Data Priorities
• Map company records: Map parent/child relationships, Map Contact IDs
Master Database
Data Matching Methodology: Natural Language Processing & Rule-Based Algorithm
Matching rules, ordered from 100% probability down to 50% probability:
• Exact Match
• Remove Extra Space & Insert Missing Space – 2 Levels
• Remove Special Characters (. , ’) – 3 Levels
• Standardize Keywords (Pte. Co.,) – 7 Levels
• Single & Double Letter Typo Errors
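The rule cascade above can be illustrated with a short sketch that tries progressively looser comparisons and reports which rule first produces a match. It is an assumption-laden illustration, not the deck's algorithm: the function names, the case-insensitive comparison, and the `KEYWORDS` map are hypothetical. The sub-levels within each rule and the typo rule are omitted, and no probabilities are assigned because the deck gives none for the intermediate rules.

```python
import re

# Hypothetical keyword map; the deck's actual keyword list is not given.
KEYWORDS = {"private": "pte", "limited": "ltd", "company": "co"}


def collapse_spaces(s: str) -> str:
    """Replace runs of whitespace with a single space and trim the ends."""
    return re.sub(r"\s+", " ", s).strip()


def remove_special(s: str) -> str:
    """Drop full stops, commas and apostrophes, then tidy the spacing."""
    return collapse_spaces(re.sub(r"[.,'’]", "", s))


def standardise_keywords(s: str) -> str:
    """Map company-type words to one spelling using KEYWORDS."""
    return " ".join(KEYWORDS.get(word, word) for word in s.split())


def matching_rule(a: str, b: str):
    """Return the name of the first rule under which a and b match, else None."""
    a, b = a.lower(), b.lower()  # assumption: matching ignores case
    if a == b:
        return "exact match"
    a, b = collapse_spaces(a), collapse_spaces(b)
    # Comparing with all spaces removed covers both extra and missing spaces.
    if a == b or a.replace(" ", "") == b.replace(" ", ""):
        return "spacing"
    a, b = remove_special(a), remove_special(b)
    if a == b:
        return "special characters"
    a, b = standardise_keywords(a), standardise_keywords(b)
    if a == b:
        return "keywords"
    return None


print(matching_rule("Fullmark Pte. Ltd.", "Fullmark Pte Ltd"))
print(matching_rule("Fullmark Private Limited", "Fullmark Pte Ltd"))
```

Ordering the rules from strictest to loosest lets each match carry a confidence tied to how much the names had to be altered before they agreed.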
What’s in an Account Name?
Entered Value (input by sales rep):
• Gillette Singapore - HQ
• Gillettes Singapore - Operations Office
• Unilever Singapore Pte Ltd
• Unilever (SG) Pte Ltd
• Unilever P/L
• UOB (Ang Mo Kio Branch)
• UOB - Serangoon Outlet
Common trend: sales reps append extra account site information to the Account Name
Account Name Deduplication: System Assigns Matched Probability
15 levels of name matching rules with varying matching confidence

Entered Value | Normalised Name | Proposed Match Value | Match Prob (%) | Match Remark
Gillette Singapore - HQ | Gillette Singapore | Gillette Singapore | | Proposed master name
Gillettes Singapore - Operations Office | Gillettes Singapore | Gillette Singapore | 50% | Single letter typo
Unilever Singapore Pte Ltd | Unilever Singapore Pte Ltd | Unilever Singapore Pte Ltd | | Proposed master name
Unilever (SG) Pte Ltd | Unilever (Singapore) Pte Ltd | Unilever Singapore Pte Ltd | 100% | Exact Match
Unilever P/L | Unilever Pte Ltd | Unilever Singapore Pte Ltd | 90% | Remove country
UOB (Ang Mo Kio Branch) | UOB | UOB | | Proposed master name
UOB - Serangoon Outlet | UOB | UOB | 100% | Exact Match
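Two of the steps visible in the table, stripping appended site information and spotting a single-letter typo, can be sketched as below. This is an illustration under assumptions, not the system described in the deck. `normalise_account_name` and `edit_distance` are hypothetical names. Site information is assumed to appear either after " - " or as a trailing parenthetical. A typo is detected with plain Levenshtein distance, which the deck does not name as its method.

```python
import re


def normalise_account_name(name: str) -> str:
    """Strip site information that was appended to an account name."""
    name = re.sub(r"\s+-\s+.*$", "", name)        # drop " - HQ", " - Serangoon Outlet"
    name = re.sub(r"\s*\([^)]*\)\s*$", "", name)  # drop a trailing "(Ang Mo Kio Branch)"
    return name.strip()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


master = normalise_account_name("Gillette Singapore - HQ")
candidate = normalise_account_name("Gillettes Singapore - Operations Office")
print(master, "|", candidate, "|", edit_distance(master, candidate))
```

A distance of 1 between the two normalised names corresponds to the "Single letter typo" remark in the table. Expanding abbreviations such as "(SG)" to "(Singapore)" would need a separate lookup and is not covered here.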
Data Merging
• Different merging treatment across all data fields
• Merge specific data fields in accordance with business methodology and predefined merging rules

Matched records
Account Name | Address | Phone Number | Last Modified Date
Unilever Singapore Pte Ltd | 20 Pasir Panjang Road | +65 6643 3000 | 12-Dec-2015
Unilever (SG) Pte Ltd | 20 Pasir Panjang Rd, #06-22 Mapletree Business City | +65 6643 3001 | 25-Mar-2016

Sample Merging Rule
• Address: Level of completeness*
• Phone Number: Merge all unique values
*Address Standardisation is a pre-requisite

Final Cleansed Record
Account Name | Address | Phone Number
Unilever Singapore Pte Ltd | 20 Pasir Panjang Rd, #06-22 Mapletree Business City | +65 6643 3000, +65 6643 3001
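The sample merging rule can be sketched in a few lines. This is a simplified illustration, not the deck's implementation. `merge_records` is a hypothetical name. "Level of completeness" is approximated by string length, which is an assumption. The master account name is passed in rather than derived.

```python
def merge_records(master_name: str, records: list) -> dict:
    """Merge matched account records field by field.

    Address: keep the most complete value (approximated by length).
    Phone: keep every unique value, in first-seen order.
    """
    address = max((r["address"] for r in records), key=len)
    phones = []
    for r in records:
        if r["phone"] not in phones:
            phones.append(r["phone"])
    return {
        "account_name": master_name,
        "address": address,
        "phone": ", ".join(phones),
    }


records = [
    {"address": "20 Pasir Panjang Road", "phone": "+65 6643 3000"},
    {"address": "20 Pasir Panjang Rd, #06-22 Mapletree Business City",
     "phone": "+65 6643 3001"},
]
print(merge_records("Unilever Singapore Pte Ltd", records))
```

Length is only a rough proxy for completeness. Once addresses are standardised into segments, counting the populated segments would be a more faithful measure, which is presumably why address standardisation is listed as a pre-requisite.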
Process Improvement – Scenario 1: Master Data Linking for Single View
Source applications:
• Customer Relationship Management (CRM)
• Enterprise Resource Planning (ERP)
• Supply Chain Management (SCM)
• Marketing Automation (MA)
Ent-Vision: Data Standardisation, Data Deduplication, Data Merging
Single View: Accurate Data Analysis & Reporting
Cleansed data channeled back to respective applications
Process Improvement – Scenario 2: Dynamic Route Planning
Zones: North East (Team 1), Central (Team 2)
• Standard Route Planning (By Zones): handled by Team 1 & Team 2
• Dynamic Route Planning (By Proximity): handled by Team 1 based on the nearest delivery locations, which is more cost-effective
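Proximity-based assignment, as opposed to assignment by zone, can be sketched with a great-circle distance calculation on the geocoded coordinates. The team positions and the delivery point below are made-up values for illustration, and `haversine_km` and `nearest_team` are hypothetical names. A real route planner would also consider road distance and capacity.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def nearest_team(delivery: tuple, team_positions: dict) -> str:
    """Return the team whose current position is closest to the delivery point."""
    return min(team_positions,
               key=lambda team: haversine_km(*delivery, *team_positions[team]))


# Hypothetical coordinates, not taken from the deck.
teams = {"Team 1": (1.40, 103.90), "Team 2": (1.30, 103.85)}
print(nearest_team((1.39, 103.89), teams))
```

This is why the address geocoding step matters: proximity planning is only as good as the latitude and longitude attached to each delivery address.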
Process Improvement – Scenario 3: Optimise Sales Route Planning With GeoAnalytics
[Figure panels: Monday, Tuesday, Wednesday, Thursday, Friday]
Process Improvement – Scenario 4: Sharpen Strategic Location Selection for New Service Centers
[Figure: service centers, each with a 5 km radius, showing Covered Customer Area and Uncovered Customer Area. Values shown on the figure: 580, 4,800, 769, 1,125, 400]
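The coverage idea in this scenario can be sketched as a count of customers that fall within 5 km of at least one service center. The coordinates below are invented for illustration and the function names are hypothetical. The deck does not describe how its figures were computed, so straight-line distance is an assumption.

```python
from math import asin, cos, radians, sin, sqrt


def distance_km(p: tuple, q: tuple) -> float:
    """Great-circle (haversine) distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def coverage(customers: list, centers: list, radius_km: float = 5.0) -> tuple:
    """Return (covered, uncovered) customer counts for the given service centers."""
    covered = sum(
        1 for customer in customers
        if any(distance_km(customer, center) <= radius_km for center in centers)
    )
    return covered, len(customers) - covered


# Hypothetical locations, not taken from the deck.
centers = [(1.30, 103.85)]
customers = [(1.30, 103.85), (1.31, 103.86), (1.45, 103.95)]
print(coverage(customers, centers))  # (2, 1)
```

Re-running the count with a candidate new center added to `centers` shows how many currently uncovered customers that location would pick up.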