

De-duplication of Master Data during large SAP Implementation Projects

Abstract
Introduction
Overview of de-duplication process
Advantages of de-duplication process
Execution of de-duplication process
Initial cleansing of source data
Source master data comparison
Rules used in de-duplication logic
Master data de-duplication table
Nomination of leading master data
New master data creation in the target system
Mapping of non-leading master data to leading master data
Impact on the transactional data migration
Project team structure for the de-duplication projects
Pitfalls and the mitigation plan in de-duplication projects
Conclusion

Abstract

During a large SAP implementation project that consolidates one or more SAP and non-SAP systems, there is a high possibility that the same material master and vendor master records reappear in the target system with different names and details. Incorrect master data leads to issues such as incorrect reporting. This whitepaper addresses the issue of duplicate records and provides a solution for eliminating them. It covers the advantages of de-duplication, explains the process steps for executing de-duplication and the fields that should be used to identify duplicate records, suggests a suitable team structure for managing de-duplication projects, and describes the common pitfalls and their mitigation plans.

Introduction

Master data cleansing, performed at the source systems (SAP and non-SAP) and at the intermediary stage before the final data upload into the target system, is a major activity in SAP implementation and rollout projects. It is a key activity when consolidating one or more ERP systems. The primary activity in master data cleansing is de-duplication of master data: the process of finding identical master data within or across the source system(s) and eliminating the duplicates before migrating to the target system. Master data here typically refers to material master data and vendor master data.

Several tools are used in de-duplication projects. The capabilities of such tools are not discussed in this whitepaper.

The aim of this whitepaper is to provide key information on the de-duplication process that can be followed in SAP rollout and implementation projects.

Overview of de-duplication process

The following figure depicts the de-duplication process.

Figure 1: Overview of de-duplication process

The de-duplication process comprises the following steps:
1. Initial cleansing of source data
2. Source data comparison based on de-duplication logic specific to the master data, and preparation of de-duplication reports
3. Nomination of leading master data
4. Mapping of non-leading master data to leading master data
5. New master data creation in the target system

Advantages of de-duplication process

The motivation for, and the advantages of, executing the de-duplication process during rollout/implementation projects are as follows:
1. Duplicate material and vendor master records lead to incorrect and inconsistent reporting
2. Incorrect consumption and availability information in the material master leads to inaccurate material planning
3. A de-duplication process followed from the start of the rollout/implementation project reduces the time and cost spent on manually identifying duplicates later
4. The de-duplication process improves the overall reliability of material and vendor reports and analysis
5. Removing duplicate vendor master records helps to maintain effective and consistent communication with vendors
6. Consolidated, consistent, harmonized and cleansed master data is a prerequisite for innovation and growth

Execution of de-duplication process

The de-duplication process is executed as described in the following steps.

Initial cleansing of source data

The scope for de-duplication of master data comprises all the master data in the source systems except data that falls under one or more of the criteria below:
a. Master data which is deleted or blocked at the highest level in the organisational structure. However, master data which is blocked at one of the lower levels of the organisational structure might still be active and relevant at another organisational level, so such data should still be considered for the de-duplication exercise.
b. Vendor master data which is not created at company code level but only at purchasing organisation level for different purposes
c. Master data which cannot be migrated to the target system because mapping values are not available for important fields, such as unit of measure in the material master

The initial cleansing of source master data is important because it dramatically reduces the number of groups of duplicated master records. Initial cleansing includes both enrichment of key master data and correction of the master data. Predominantly, initial cleansing is performed on the data within the source system only.

A few examples of initial cleansing of source master data are as follows:
a. Correct material masters that contain dummy text
b. Correct material masters of material type HERS that duplicate the material of material type HIBE to which they are assigned
c. Update key details used in the de-duplication logic which are missing in the material and vendor master data
d. Check and correct redundant partner functions created in the vendor master

The important fields to focus on during initial cleansing or enrichment within the source system are as follows:

Material Master:
a. Material description
b. Unit of measure
c. Manufacturer details
d. UNSPSC code
e. Vendor part number

Vendor Master:
a. Vendor name
b. Address
c. VAT registration number
d. DUNS
e. Bank account number

Source master data comparison

This step is the core of the de-duplication process. After the data is cleansed and enriched, the records are compared against each other. The result of this comparison is the grouping of similar master data. The source of the master data could be a single ERP system or multiple ERP systems.
There are tools available for comparing the master data and creating groups of similar master data. The de-duplication tool applies the de-duplication logic to identify similar master data and builds the master data de-duplication table.

Rules used in de-duplication logic

The criteria used to determine a group of similar master data depend on many factors, such as the availability of data, the level of initial cleansing done and the scope of enrichment performed within the source data.

Some of the rules used in material master de-duplication logic to identify groups of similar material masters are as follows:
a. Same manufacturer and same base unit of measure
b. Same UNSPSC code
c. Same manufacturer and same vendor part number
d. Similar description

Predominantly, the details are concatenated and text comparisons are performed to arrive at the groups of similar master data.

Likewise, some of the rules used in vendor master de-duplication logic are as follows:
a. Same DUNS number
b. Same bank details
c. Same tax code
d. Same address details such as name, street number, PO box and postal code

Master data de-duplication table

The master data de-duplication table is the result of the initial data cleansing activity and the application of the de-duplication logic to the pre-cleansed source data. The de-duplication tool can identify groups of master records within a single ERP system or across multiple ERP systems based on the de-duplication logic.

A simple example of a master data de-duplication table is as follows:

Table 1: Master Data De-Duplication Table

Group number | Source System Name | Material | Material Description | Manufacturer | Vendor part number
1 | SAP System 1 | Material A | Bolt Hydraulic | Mnfr 1 | 9N-4524
1 | SAP System 1 | Material B | Bolt | Mnfr 1 | 9N4524
1 | Non-SAP system 2 | Material C | Bolt, long Oil hydr | Mnfr 1 | 9N-45-24
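
The rule "same manufacturer and same vendor part number" only places the three rows of Table 1 in one group if the part numbers are compared after special characters are stripped. The following is a minimal Python sketch of that idea, not the logic of any particular de-duplication tool. The function names `normalize` and `group_by_manufacturer_and_part` are illustrative assumptions, and the choice to upper-case the text and keep only letters and digits is one possible normalization among several.

```python
import re
from collections import defaultdict

# Rows of Table 1:
# (source system, material, description, manufacturer, vendor part number)
RECORDS = [
    ("SAP System 1", "Material A", "Bolt Hydraulic", "Mnfr 1", "9N-4524"),
    ("SAP System 1", "Material B", "Bolt", "Mnfr 1", "9N4524"),
    ("Non-SAP system 2", "Material C", "Bolt, long Oil hydr", "Mnfr 1", "9N-45-24"),
]


def normalize(text):
    """Upper-case the text and drop everything except letters and digits.

    Assumption: special characters and spacing carry no meaning in
    manufacturer names and part numbers.
    """
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def group_by_manufacturer_and_part(records):
    """Group materials whose normalized manufacturer and part number match."""
    groups = defaultdict(list)
    for system, material, _description, manufacturer, part_number in records:
        key = (normalize(manufacturer), normalize(part_number))
        groups[key].append((system, material))
    return dict(groups)


groups = group_by_manufacturer_and_part(RECORDS)
for key, members in groups.items():
    print(key, members)
```

With this normalization, "9N-4524", "9N4524" and "9N-45-24" all reduce to the same key, so Materials A, B and C fall into a single group. A real project would combine several such rules and still route each group through manual review.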

In the above example, the de-duplication logic has worked on the source system data and grouped these three materials, which are of a similar nature.

Nomination of leading master data

Once similar master data has been grouped, the material that should be migrated into the target system needs to be selected.

Table 2: Master Data De-Duplication Table Appended with Nomination Columns

Group number | Source System Name | Material | Material Description | Manufacturer | Vendor part number | Leading Material | Non-leading Material
1 | SAP System 1 | Material A | Bolt Hydraulic | Mnfr 1 | 9N-4524 | Yes |
1 | SAP System 1 | Material B | Bolt | Mnfr 1 | 9N4524 | | Yes
1 | Non-SAP system 2 | Material C | Bolt, long Oil hydr | Mnfr 1 | 9N-45-24 | | Yes

In the above example, Material A is identified as the leading material, which should be migrated into the target system, while Material B and Material C are identified as non-leading materials, that is, duplicates. The non-leading materials will not be migrated to the target system. The leading material is also referred to as the parent material, and a non-leading material as a child material.

In the above example, refer to the column Vendor part number. The same part number provided by the same manufacturer was created in two different systems in three different ways. Hence text search logic such as normalization of the text (removing the special characters to determine the actual text) should be implemented to determine the duplicates.

The selection of leading and non-leading material is a manual activity, which should be guided by a few principles.

If a group contains master data from two different systems, there is a conflict over which system's master data gets first preference for selection as the leading material. The usual rule of thumb is to give first preference to the master data from the oldest system, provided it holds up-to-date information. Another approach is to select the master data which has the most transactional data. The selection of the leading material becomes more complex when the different systems are owned by different internal organizations. Normally the de-duplication process should be carried out centrally, with a central co-ordinator, to mitigate conflicts arising from the selection of the leading material in the groups.

New master data creation in the target system

By this process step, all the manual nominations identifying the leading and non-leading master data should be complete. This makes it possible to segregate the leading material which will be migrated to the target system.
The leading material which is migrated to the target system receives a new material number in the target system, following the target system's material number nomenclature.

Mapping of non-leading master data to leading master data

As the final step of the de-duplication process, once the new material number has been received in the target system, the old source system material numbers of both the leading and the non-leading master data are mapped to the new material number of the leading master data in the target system. For the example above, the mapping appears as follows:

Table 3: Master Data Mapping Table

Old source system material number | New material number in the target system
Material A | New Material A
Material B | New Material A
Material C | New Material A
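
Table 3 behaves like a lookup from every old material number to one target number, which is also how transactional data of non-leading materials ends up created against the leading material. The following Python sketch illustrates this under stated assumptions: the function name `remap_transactions`, the dictionary layout of a transaction and the sample purchase order items are all hypothetical, and actual migration tooling will differ.

```python
# Table 3 as a lookup: old source system material number -> new target number.
MATERIAL_MAPPING = {
    "Material A": "New Material A",  # leading (parent)
    "Material B": "New Material A",  # non-leading (child)
    "Material C": "New Material A",  # non-leading (child)
}


def remap_transactions(transactions, mapping):
    """Replace old material numbers with their mapped target numbers.

    Returns two lists: remapped copies of the transactions, and the
    transactions whose material has no mapping entry and therefore
    needs follow-up before migration.
    """
    migrated, unmapped = [], []
    for txn in transactions:
        target = mapping.get(txn["material"])
        if target is None:
            unmapped.append(txn)
        else:
            # Copy the record so the source extract stays unchanged.
            migrated.append({**txn, "material": target})
    return migrated, unmapped


# Hypothetical open purchase order items from the source systems.
open_items = [
    {"document": "PO-1", "material": "Material B", "quantity": 10},
    {"document": "PO-2", "material": "Material C", "quantity": 5},
    {"document": "PO-3", "material": "Material X", "quantity": 1},
]

migrated, unmapped = remap_transactions(open_items, MATERIAL_MAPPING)
print(migrated)
print(unmapped)
```

Keeping unmapped items in a separate list, rather than dropping them, makes gaps in the mapping table visible before the transactional data load.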

The column Old source system material number contains both the leading (parent) and the non-leading (child) master data numbers. The column New material number in the target system contains the material number which is created in the target system.

The non-leading (child) material or vendor inherits the master data of the leading (parent) material or vendor. Certain data, such as the bank details of a child vendor, will be consolidated into the parent vendor. The child vendor will inherit the parent vendor's general data. The company code, purchasing organisation and plant data of all non-leading (child) vendors will be extended to the parent vendor in the target system.

Impact on the transactional data migration

The non-leading master data will not be migrated to the target system. The transactional data of the non-leading master data will be created using the equivalent mapped leading master data.

Project team structure for the de-duplication projects

A de-duplication project involves a lot of coordination between the different owners of the source systems. In projects spread across geographies, there is usually a separate team responsible for the master data of each company code. In all such scenarios, there should be a de-duplication co-ordinator in each location who liaises with the other de-duplication co-ordinators and with the central de-duplication co-ordinator. The master data organisation should provide high-level governance. It is beneficial to position the central de-duplication co-ordinator centrally across geographies.

Table 4: Project Team Structure

Roles | Major responsibilities
Central de-duplication co-ordinator | 1. Provide technical guidance for doing the leading/non-leading master data nominations; 2. Co-ordinate leading/non-leading master data nominations; 3. Issue and scope management; 4. Leadership activities; 5. Arrange recurring meetings to track progress; 6. Solve conflicts
De-duplication co-ordinator in every company code/system/geography | 1. Perform the leading/non-leading master data nominations; 2. Participate in recurring meetings; 3. Ensure data quality in the source system
Master data organisation | 1. Governance on master data; 2. Provide clarification on the master data design; 3. Review data quality of source system and implement required structural changes

Pitfalls and the mitigation plan in de-duplication projects

The common pitfalls and the mitigation plans in de-duplication projects are as follows:

Table 5: Pitfalls and Mitigation Plans

Pitfalls | Mitigation plan
During the initial review stages, the resources needed to review the items and perform the nomination of leading/non-leading master data are likely to be underestimated | As a general guideline, the resource estimate should be based on 100 master data records a week per person. This is based on the author's experience in a de-duplication project and assumes complete analysis, including investigation of purchasing history
If there are multiple system owners, reaching consensus on the nomination of the leading master data takes a very long time | The central co-ordinator should be given a stronger role and more authority so that conflicts can be settled easily
Incorrect nominations lead to complexities when the group contains master data across systems | Resources involved in the de-duplication project should have detailed knowledge of master data and of the de-duplication process
Incomplete nominations lead to complexities when the group contains master data across systems | A tracking mechanism to determine which master data nomination is pending with which team
High risk and/or high value items | High risk and/or high value items should be approached with caution
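
The guideline of 100 master data records a week per person can be turned into a rough effort estimate. The Python sketch below shows the arithmetic only. The function name `nomination_effort` is an illustrative assumption, and the record count and team size in the usage lines are hypothetical figures, not data from any project.

```python
import math


def nomination_effort(record_count, reviewers, records_per_person_week=100):
    """Estimate review effort for leading/non-leading nominations.

    Returns (person_weeks, calendar_weeks). The default throughput of
    100 records per person per week is the guideline from Table 5.
    """
    if reviewers <= 0 or records_per_person_week <= 0:
        raise ValueError("reviewers and throughput must be positive")
    person_weeks = record_count / records_per_person_week
    # Round calendar time up: a partly used week still has to be planned.
    calendar_weeks = math.ceil(person_weeks / reviewers)
    return person_weeks, calendar_weeks


# Hypothetical scope: 12,000 grouped records reviewed by a team of 4.
person_weeks, calendar_weeks = nomination_effort(12_000, reviewers=4)
print(person_weeks, calendar_weeks)
```

For the hypothetical scope above, this gives 120 person-weeks, or 30 calendar weeks for four reviewers working in parallel.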

Conclusion

This whitepaper discusses the de-duplication process during the initial stages of rollout/implementation projects. However, once the parent master data has been identified and the new master data has been created after eliminating the child duplicates, it is imperative to have a defined approach for avoiding further duplicates in the target system. One option is a single source for master data creation and changes, together with effective rules and processes to prevent duplicates at source.