Orchestrate data with agility and responsiveness.

Learn how to manage a common data integration project

by SKENDER KOLLCAKU

Milan, 07/2017

keywords:

iPaaS, data integration, Talend, Salesforce, data-driven, use case, migration, cloud computing, SaaS, CRM, database, real-time, open-source, Java, professional services, on-premise, mainframe, data quality, hybrid, repository, metadata, reusable job, data validation, bi-directional sync, design pattern, agile, business, ETL, project management, customer

The scenario: Manage a typical data integration project

Consider the following business requirements:

Manage the project successfully and keep it on track with respect to budget, cost, time and stakeholders’ concerns.

(1) Migrate customer data from a mainframe to a cloud SaaS CRM (Salesforce: https://www.salesforce.com), keeping address formats and values in line with the business requirements

(2) Set up a bi-directional integration between the two systems

(3) Identify the added value that integration and a data-driven culture make available

Agenda

Data availability, iPaaS and why a data-driven culture is the new norm for organizations

Data as an asset requires Governance, but also Agility and Responsiveness

Define a roadmap to manage and successfully close the project (business case)

How to identify business-related data and valuable Customer records

Talend (https://www.talend.com/) as the leading unified platform for the solution

Data validation and initial load (migration as the design pattern)

Bi-directional synchronization to automate jobs in real time

Added value and future implementations

The importance of data availability

Data is one of the most important assets an organization has because it defines each organization’s uniqueness.

Being a data-driven organization is not the final objective, but it represents a crucial process in the innovation challenge.

Data requires Governance, but also Agility and Responsiveness

Collaborate in an open manner

BE AGILE AND ADAPT TO CHANGES

Agility

Start with business-related data

FAST TIME TO MARKET

Share process to engage

Inspire through Talend

SHARE, DEMOCRATIZE AND INSPIRE FOR THE FUTURE

Short and fast deliveries

A 3-step project plan starts by communicating with the stakeholders

(1) Communicate with the decision-making players

(2) Identify candidate data with business-related value

(3) Model and implement a design pattern for the specific process

Talend is the leading open-source integration software provider to data-driven enterprises

Open-source leader

Eclipse-like IDE, unified platform

Java-based code generator

Visual job design

Graphical business process modeller (100% graphical)

Smart product subscription

Big Data native in real time

Reusable metadata elements

1000+ built-in drag & drop connectors and components

Determine which customers carry business-related value

Prospects/Leads (potential Customers)

Filter by fastest closed deals

Particular industry (life science, manufacturing, finance...)

Recently closed deals (filter by time range)

Largest revenue-generating streams

Geographical area of interest
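
The selection criteria above can be sketched as a simple filter. This is only an illustration in Python, not part of the Talend job: the `Customer` fields, the `select_candidates` function and the sample records are all hypothetical, and prospects without a closed deal are simply skipped here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Customer:
    # Hypothetical record layout, chosen only for this example.
    name: str
    industry: str
    region: str
    revenue: float
    closed_on: Optional[date]  # None for prospects/leads with no closed deal


def select_candidates(customers, industries, regions, closed_since, top_n):
    """Keep recently closed deals in the target industries and regions,
    then rank them by generated revenue (largest first)."""
    matches = [
        c for c in customers
        if c.industry in industries
        and c.region in regions
        and c.closed_on is not None
        and c.closed_on >= closed_since
    ]
    return sorted(matches, key=lambda c: c.revenue, reverse=True)[:top_n]


customers = [
    Customer("Acme Pharma", "life science", "EMEA", 120000.0, date(2017, 5, 10)),
    Customer("Steelworks", "manufacturing", "EMEA", 300000.0, date(2016, 1, 15)),
    Customer("FinBank", "finance", "APAC", 500000.0, date(2017, 6, 1)),
    Customer("BioLab", "life science", "EMEA", 80000.0, date(2017, 4, 2)),
    Customer("NewLead", "finance", "EMEA", 0.0, None),
]

selected = select_candidates(
    customers,
    industries={"life science", "finance"},
    regions={"EMEA"},
    closed_since=date(2017, 1, 1),
    top_n=2,
)
selected_names = [c.name for c in selected]
```

Each criterion is a separate condition, so a business user can switch one on or off without touching the others.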

(1) Project phase: initial load, preceded by ETL operations

Once the input flat files from the mainframe are available, the ETL (Extract, Transform and Load) operations to execute could be the following:

Cleanse

Validate

Format

Unify

Standardize

pull data from the mainframe (MF) → cleanse, validate, format → unify or standardize → provision DB schema compatibility → upload into the SaaS CRM
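
As a rough illustration of the flow above, the following Python sketch chains extract, cleanse, standardize and validate steps over a semicolon-separated flat file. The column names, the `COUNTRY_CODES` table and the sample rows are assumptions made for the example. The real job is designed graphically in Talend, and the final upload to the CRM is left out.

```python
import csv
import io

# Hypothetical lookup used to unify country values.
COUNTRY_CODES = {"ITALIA": "IT", "ITALY": "IT"}


def extract(text):
    """Read a semicolon-separated flat file into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def cleanse(row):
    """Trim whitespace and turn empty strings into None."""
    return {
        key: (value.strip() or None) if isinstance(value, str) else value
        for key, value in row.items()
    }


def standardize(row):
    """Map free-text country names onto a single code."""
    country = row.get("country")
    if country:
        row = {**row, "country": COUNTRY_CODES.get(country.upper(), country)}
    return row


def is_valid(row):
    """Minimal validation rule: name and city must be present."""
    return bool(row.get("name")) and bool(row.get("city"))


def run_pipeline(text):
    """Extract, cleanse, standardize, then keep only valid rows."""
    rows = [standardize(cleanse(row)) for row in extract(text)]
    return [row for row in rows if is_valid(row)]


FLAT_FILE = "name;city;country\n Rossi SpA ;Milano;italia\nBianchi Srl;;Italy\n"
ready_to_load = run_pipeline(FLAT_FILE)
```

Keeping each operation in its own small function mirrors the idea of reusable steps: the same `cleanse` or `standardize` can be applied to other input files.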

Data quality includes data validation

DATA VALIDATION

NULL HANDLING

STRING HANDLING

DATE HANDLING

THIRD-PARTY VALIDATION LIBRARIES

Talend Data Preparation: a free self-service tool
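
The null, string and date handling categories listed above could look like the following minimal Python sketch. The helper names and the accepted date formats are assumptions for illustration. Third-party validation libraries are not covered here.

```python
from datetime import datetime


def handle_null(value, default=""):
    """Null handling: replace a missing value with an explicit default."""
    return default if value is None else value


def normalize_string(value):
    """String handling: collapse repeated whitespace and use title case."""
    return " ".join(value.split()).title()


def parse_date(value, formats=("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")):
    """Date handling: try each accepted format, return None if none matches."""
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    return None


street = normalize_string(handle_null("  via   roma 10 "))
missing = handle_null(None, default="N/A")
created = parse_date("31/12/2016")
```

Returning `None` for an unparseable date, instead of raising, lets the calling job decide whether to reject the record or load it with an empty field.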

Define the business process model before we start implementing the job

The Talend DI canvas is used to model the business process. The data flow satisfies the following business requirement: only matched/validated Customer address records will be loaded into the SaaS CRM.

Use Talend to set up the data migration between the On-Premise input files and the target SaaS CRM object (Account in Salesforce)

prior to Address validation

Simplified job which uses the “magical” tMap component to validate the Address

Simplified job which uses the tMap component to validate the Customer address. The outputs are (1) valid records loaded into the Salesforce Account object and (2) rejected Customers with invalid addresses, written to an Excel spreadsheet for future analysis.
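
The two-output behaviour described above (a main flow and a reject flow) can be approximated outside Talend with a small routing function. This Python sketch is not the tMap component itself: the field names, the five-digit postal code rule and the `reject_reason` column are assumptions made for the example.

```python
import re

# Assumption for this example: a valid postal code has exactly five digits.
POSTAL_CODE = re.compile(r"\d{5}")


def address_errors(record):
    """Return the list of address problems found in one customer record."""
    errors = []
    if not record.get("street"):
        errors.append("missing street")
    if not record.get("city"):
        errors.append("missing city")
    if not POSTAL_CODE.fullmatch(record.get("postal_code") or ""):
        errors.append("invalid postal code")
    return errors


def route(records):
    """Split records into an accepted flow and a reject flow with a reason."""
    accepted, rejected = [], []
    for record in records:
        errors = address_errors(record)
        if errors:
            rejected.append({**record, "reject_reason": "; ".join(errors)})
        else:
            accepted.append(record)
    return accepted, rejected


records = [
    {"name": "Rossi SpA", "street": "Via Roma 10", "city": "Milano", "postal_code": "20121"},
    {"name": "Bianchi Srl", "street": "", "city": "Torino", "postal_code": "1012"},
]
accepted, rejected = route(records)
```

The accepted list stands in for the records sent to the CRM, and the rejected list for the rows written to the spreadsheet, with the reason kept next to each row for later analysis.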

(2) Project phase: bi-directional synchronization between mainframe and SaaS

The Talend built-in component tSalesforceGetUpdated_1 is used to track changes (update, insert, upsert) in the Salesforce Account object and propagate them in real time to a table in the mainframe’s DB2 database. This component can work in the background, given a Start and End time range in the past. Another mechanism is CDC (Change Data Capture).
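
The idea of fetching changes within a start and end time range can be illustrated with an in-memory Python sketch. It does not call Salesforce or DB2: `get_updated`, `sync_window`, the record layout and the sample data are hypothetical stand-ins for the source object and the target table.

```python
from datetime import datetime, timedelta


def get_updated(records, start, end):
    """Return the records whose last_modified falls in [start, end)."""
    return [r for r in records if start <= r["last_modified"] < end]


def sync_window(source, target, start, end):
    """Upsert every record changed in the window into the target, keyed by id."""
    changed = get_updated(source, start, end)
    for record in changed:
        target[record["id"]] = dict(record)
    return len(changed)


source = [
    {"id": "001", "name": "Rossi SpA", "last_modified": datetime(2017, 7, 1, 9, 0)},
    {"id": "002", "name": "Bianchi Srl", "last_modified": datetime(2017, 7, 1, 12, 30)},
    {"id": "003", "name": "Verdi Srl", "last_modified": datetime(2017, 6, 30, 18, 0)},
]
target = {}

window_start = datetime(2017, 7, 1)
window_end = window_start + timedelta(days=1)
synced = sync_window(source, target, window_start, window_end)
```

Using a half-open window means consecutive runs can pass the previous end time as the next start time without picking up the same change twice.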

Bi-directional integration means real-time synchronization between the two databases

There are some key issues to consider:

How similar are the schemas of the databases to be kept in sync (this helps with eventual JOIN operations)?

How often do the databases need to be synced (query performance…)?

How will we resolve situations in which the same data has been modified in both databases since the last sync session (conflict resolution based on a “record owner” or “last modified” rule, to be defined)?

How much effort and/or money are we willing to invest in developing our sync system (“keep project budget on track”)?
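
For the conflict question above, the two rules mentioned (“last modified” and “record owner”) can be sketched in Python as follows. The record layout and the `system` and `last_modified` fields are assumptions for illustration. In the first rule, a tie is resolved in favour of the first argument.

```python
from datetime import datetime


def resolve_last_modified(a, b):
    """'Last modified' rule: keep the most recently changed version."""
    return a if a["last_modified"] >= b["last_modified"] else b


def resolve_record_owner(a, b, owner_system):
    """'Record owner' rule: keep the version from the system that owns the record."""
    return a if a["system"] == owner_system else b


crm_version = {
    "system": "crm",
    "city": "Milano",
    "last_modified": datetime(2017, 7, 2, 10, 0),
}
mainframe_version = {
    "system": "mainframe",
    "city": "Milan",
    "last_modified": datetime(2017, 7, 1, 8, 0),
}

winner_by_time = resolve_last_modified(crm_version, mainframe_version)
winner_by_owner = resolve_record_owner(crm_version, mainframe_version, "mainframe")
```

The two rules can disagree, as they do for this sample pair, which is why the choice between them is a business decision to agree with the stakeholders before the sync is built.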

(3) Added value: technical perspective

External lookups against any other data source (supply chain, e-commerce, BI (analysis of ROI, deals/opportunities), DW, marketing, social network activity/engagement, distributed and cross-platform applications…)

Reusable jobs, thanks to repository metadata

Versioning of the generated Java code (GitHub, Maven…)

Statistical reports about job execution (performance)

Other applications can trigger the job (example: collecting data for reports and dashboards…)

Unified and scalable integration platform (Data Preparation, DI, Cloud integration, ESB, MDM, Big Data, Fabric…)

(3) Added value: business perspective

Give real value to the data asset (“enable data-driven organizations”)

Support decisions (“how to use the information obtained?”) and provide them in advance (apply rules automatically and review them regularly)

Remove data management risk when modernizing systems

Consolidate applications

Smooth subscription model (start with the free open-source tool and then upgrade in a predictable fashion depending on business needs – pay only for the number of developers…)

Optimize processes by keeping comprehensive, relevant and consistent data everywhere.

Real-time delivery and predictive analytics!

Big Data native suite of products

Thank you!