Orchestrate data with agility and responsiveness.
Learn how to manage a common data integration project
by SKENDER KOLLCAKU
Milan, 07/2017
keywords:
iPaaS, data integration, Talend, Salesforce, data-driven, use case, migration, cloud computing, SaaS, CRM, database, real-time, open-source, Java, professional services, on-premise, mainframe, data quality, hybrid, repository, metadata, reusable job, data validation, bi-directional sync, design pattern, agile, business, ETL, project management, customer
The scenario: Manage a typical data integration project
Consider the following business requirements:
Manage the project successfully and keep it on track with respect to budget, cost,
time and stakeholders’ concerns.
(1) Migrate customer data from a mainframe to a Cloud SaaS
CRM (Salesforce: https://www.salesforce.com), respecting the Address
format/values set by the business requirements
(2) Set up a bi-directional integration between the two systems
(3) Identify the added value that the integration and a data-driven culture make
available
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
Agenda
Data availability, iPaaS and why a data-driven culture is the new norm for organizations
Data asset requires Governance, but also Agility and Responsiveness
Define a roadmap to manage and successfully close the project (business case)
How to identify business-related data and valuable Customer records
Talend (https://www.talend.com/) as the leading unified platform for the solution
Data validation and initial load (migration as the design pattern)
Bi-directional synchronization to automate jobs in real-time
Added values and future implementations
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
3
The importance of data availability
Data is one of the most important assets an organization has because it defines each organization’s uniqueness.
Being a data-driven organization is not the final objective, but it represents a crucial process in the innovation challenge.
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
4
Data requires Governance, but also Agility and Responsiveness
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
5
Agility
BE AGILE AND ADAPT TO CHANGES
Collaborate in an open manner
FAST TIME TO MARKET
Start with business-related data
Short and fast deliveries
SHARE, DEMOCRATIZE AND INSPIRE FOR THE FUTURE
Share process to engage
Inspire through Talend
The 3-step project plan starts by communicating with the stakeholders
(1) Communicate with decision-making players
(2) Identify candidate data with business-related value
(3) Model and implement a design pattern for the specific process
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
6
Talend is the leading open source integration software provider to data-driven enterprises
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
7
Open-Source leader
Eclipse-like IDE on a unified platform
Java-based code generator
Visual job design
Graphical business process modeller (100% graphical)
Smart product subscription
Big Data native in real-time
Reusable metadata elements
1000+ built-in drag & drop connectors and components
Determine customers containing business-related value
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
8
Prospects/Leads (potential Customers)
Filter by fastest closed deals
Particular industry (life science, manufacturing or finance...)
Recent closed deals (filter by time range)
Largest revenue-generating streams
Geographical area of interest
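The selection criteria above can be read as a set of filters applied to the customer records. The following is a minimal sketch only: the record layout, field names and sample values are hypothetical and do not come from the project.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    # Hypothetical record layout, for illustration only.
    name: str
    industry: str
    region: str
    revenue: float
    closed_on: date


def select_candidates(customers, industries, regions, closed_after, top_n):
    """Keep customers in the target industries and regions whose deal closed
    on or after `closed_after`, then return the `top_n` by revenue."""
    matching = [
        c for c in customers
        if c.industry in industries
        and c.region in regions
        and c.closed_on >= closed_after
    ]
    return sorted(matching, key=lambda c: c.revenue, reverse=True)[:top_n]


# Invented sample data.
customers = [
    Customer("Acme", "manufacturing", "EMEA", 120_000.0, date(2017, 5, 2)),
    Customer("Bio One", "life science", "EMEA", 300_000.0, date(2017, 6, 11)),
    Customer("Old Corp", "finance", "EMEA", 900_000.0, date(2014, 1, 20)),
    Customer("Far Ltd", "finance", "APAC", 500_000.0, date(2017, 3, 9)),
]

candidates = select_candidates(
    customers,
    industries={"life science", "manufacturing", "finance"},
    regions={"EMEA"},
    closed_after=date(2017, 1, 1),
    top_n=2,
)
print([c.name for c in candidates])
```

Each criterion stays a separate condition, so the business side can tighten or relax one of them without touching the others.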
(1) Project phase: initial load, preceded by ETL operations
Once the input flat files from the mainframe are available, the ETL (Extract, Transform and Load) operations to be executed could be the following:
Cleanse
Validate
Format
Unify
Standardize
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
9
pull data from the mainframe (MF)
cleanse, validate, format
unify or standardize
provision DB schema compatibility
upload into the SaaS CRM
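The steps above can be sketched as a small pipeline over a flat-file extract. This is an illustration under assumptions, not the Talend job itself: the semicolon-separated layout, the field names and the country mapping are hypothetical, and the final upload step is left out.

```python
import csv
import io

# Hypothetical flat-file extract; the real mainframe layout is not given.
RAW = """id;name;city;country
1;  acme spa ;milano;it
2;Beta Ltd;;GB
3;gamma gmbh;BERLIN;de
"""

# Assumed mapping used to standardize country codes.
COUNTRY_NAMES = {"IT": "Italy", "GB": "United Kingdom", "DE": "Germany"}


def cleanse(row):
    # Trim stray whitespace from every field.
    return {key: value.strip() for key, value in row.items()}


def is_valid(row):
    # Assumed rule: a record needs at least a name and a city to be loaded.
    return bool(row["name"]) and bool(row["city"])


def standardize(row):
    # Unify casing and expand country codes to a single naming scheme.
    code = row["country"].upper()
    return {
        "id": int(row["id"]),
        "name": row["name"].title(),
        "city": row["city"].title(),
        "country": COUNTRY_NAMES.get(code, code),
    }


def run_pipeline(text):
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    cleansed = (cleanse(row) for row in reader)
    return [standardize(row) for row in cleansed if is_valid(row)]


records = run_pipeline(RAW)
for record in records:
    print(record)
```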
Data quality includes data validation
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
10
DATA VALIDATION
NULL HANDLING
STRING HANDLING
DATE HANDLING
THIRD-PARTY VALIDATION LIBRARIES
Talend Data Preparation: a free self-service tool
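Null, string and date handling can each be reduced to a small helper. The sketch below assumes two accepted date formats and a whitespace-collapsing rule for strings; both are illustrative choices, not requirements taken from the project.

```python
from datetime import datetime


def clean_string(value, default=""):
    # Null handling: treat None as the default value.
    # String handling: trim and collapse internal runs of whitespace.
    if value is None:
        return default
    return " ".join(value.split())


def parse_date(value, formats=("%Y-%m-%d", "%d/%m/%Y")):
    # Date handling: try each accepted format (assumed list);
    # return None when the value is missing or matches none of them.
    if not value:
        return None
    for fmt in formats:
        try:
            return datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            continue
    return None


print(clean_string("  Via   Roma  1 "))
print(parse_date("31/07/2017"))
print(parse_date("not a date"))
```

Returning None for an unparseable date, rather than raising, lets the calling job decide whether the record is rejected or loaded with an empty field.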
Business process model definition before we start implementing the job
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
11
The Talend DI canvas is used to model the business process. The data flow satisfies the following business requirement: only matched/validated Customer address records will be loaded into the SaaS CRM.
Use Talend to set up the data migration between On-Premise input files and the target SaaS CRM object (Account in Salesforce)
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
12
prior to Address validation
Simplified job that uses the “magical” tMap component to validate the Address
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
13
Simplified job that uses the tMap component to validate the Customer address. The outputs are (1) valid records loaded into the Salesforce Account object and (2) rejected Customers with invalid addresses, written to an Excel spreadsheet for future analysis
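The two outputs of the job (accepted records and rejects) follow a simple routing pattern. The sketch below is not generated Talend code: the validation rule, the field names and the sample data are assumptions, and the reject file is written as CSV rather than Excel to keep the example dependency-free.

```python
import csv
import io
import re

# Assumed rule for illustration: an address is valid when street and city
# are non-empty and the postal code is exactly five digits.
POSTAL_CODE = re.compile(r"\d{5}")


def is_valid_address(customer):
    return (
        bool(customer["street"].strip())
        and bool(customer["city"].strip())
        and POSTAL_CODE.fullmatch(customer["postal_code"].strip()) is not None
    )


def route(customers):
    """Split customers into (accepted, rejected), like a main and a reject output."""
    accepted, rejected = [], []
    for customer in customers:
        (accepted if is_valid_address(customer) else rejected).append(customer)
    return accepted, rejected


def rejects_as_csv(rejected):
    # Stand-in for the Excel reject file: CSV opens in any spreadsheet tool.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "street", "city", "postal_code"])
    writer.writeheader()
    writer.writerows(rejected)
    return buffer.getvalue()


# Invented sample data.
customers = [
    {"name": "Acme", "street": "Via Roma 1", "city": "Milano", "postal_code": "20121"},
    {"name": "Beta", "street": "", "city": "Torino", "postal_code": "10121"},
    {"name": "Gamma", "street": "Corso Italia 5", "city": "Roma", "postal_code": "0018"},
]

accepted, rejected = route(customers)
print([c["name"] for c in accepted])
print(rejects_as_csv(rejected))
```

Only the `accepted` list would go on to the CRM load; the rejects keep every original field so they can be corrected and re-submitted later.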
(2) Project phase: bi-directional synchronization between mainframe and SaaS
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
14
The Talend built-in component tSalesforceGetUpdated_1 is used to track changes (update, insert, upsert) in the Salesforce Account object and propagate them in real time to a table in the mainframe’s DB2 database. This component can run in the background given a past Start and End time range. Another mechanism is CDC (Change Data Capture).
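The idea of picking up changes inside a start/end time range and pushing them to the target table can be sketched as follows. This does not reproduce the Talend component or the Salesforce API: the in-memory lists and dictionaries, the field names and the function names are hypothetical stand-ins.

```python
from datetime import datetime


def get_updated(source_records, start, end):
    """Return records whose last modification falls in [start, end)."""
    return [r for r in source_records if start <= r["last_modified"] < end]


def propagate(changes, target_table):
    """Upsert each changed record into the target, keyed by id."""
    for record in changes:
        target_table[record["id"]] = dict(record)
    return len(changes)


# Hypothetical in-memory stand-ins for the CRM object and the mainframe table.
crm_accounts = [
    {"id": "A1", "city": "Milano", "last_modified": datetime(2017, 7, 1, 9, 0)},
    {"id": "A2", "city": "Roma", "last_modified": datetime(2017, 7, 1, 10, 30)},
    {"id": "A3", "city": "Torino", "last_modified": datetime(2017, 7, 1, 12, 15)},
]
db2_table = {"A1": {"id": "A1", "city": "Monza", "last_modified": datetime(2017, 6, 1)}}

window_start = datetime(2017, 7, 1, 10, 0)
window_end = datetime(2017, 7, 1, 13, 0)
synced = propagate(get_updated(crm_accounts, window_start, window_end), db2_table)
print(synced, sorted(db2_table))
```

Using a half-open window means consecutive runs can reuse the previous end time as the next start time without processing a record twice.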
Bi-directional integration means real-time synchronization between the two databases
There are some key issues to consider:
How similar are the schemas of the databases to be kept in sync (this helps for
eventual JOIN operations)?
How often do the databases need to be synched (performance query…)?
How will we resolve situations in which the same data has been modified in both
databases since the last sync session (conflict resolution based on the “record owner” or
“last modified” solution to be described)?
How much effort and/or money are we willing to invest in developing our sync
system (“keep project budget on track”)?
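For the conflict question above, one possible reading of the “last modified” solution is a last-write-wins rule. The sketch below is an assumption about how such a rule could look, not the project's actual design; the two dictionaries stand in for the CRM and the mainframe table, and all names are hypothetical.

```python
from datetime import datetime


def resolve(record_a, record_b):
    """Return the version modified most recently; ties keep record_a."""
    if record_b["last_modified"] > record_a["last_modified"]:
        return record_b
    return record_a


def sync(db_a, db_b):
    """Make both stores hold the winning version of every record id."""
    for record_id in set(db_a) | set(db_b):
        in_a, in_b = db_a.get(record_id), db_b.get(record_id)
        if in_a is None:
            winner = in_b
        elif in_b is None:
            winner = in_a
        else:
            winner = resolve(in_a, in_b)
        db_a[record_id] = dict(winner)
        db_b[record_id] = dict(winner)


# Invented sample data: C1 was changed on both sides since the last sync.
crm = {
    "C1": {"city": "Milano", "last_modified": datetime(2017, 7, 3, 8, 0)},
    "C2": {"city": "Roma", "last_modified": datetime(2017, 7, 1, 8, 0)},
}
mainframe = {
    "C1": {"city": "Monza", "last_modified": datetime(2017, 7, 2, 8, 0)},
    "C3": {"city": "Bari", "last_modified": datetime(2017, 6, 30, 8, 0)},
}

sync(crm, mainframe)
print(crm == mainframe, crm["C1"]["city"])
```

A “record owner” rule would replace only the `resolve` function, for example by always preferring the system designated as owner of that record.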
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
15
(3) Added values: technical perspective
External lookup with any other data sources (supply chain, e-commerce, BI (analysis of ROIs, deals/opportunities), DW, Marketing, social networks activity/engagement, distributed and cross-platform applications… )
Reusable jobs, thanks to repository metadata
Versioning of the generated Java code (GitHub, Maven…)
Statistical reports about job execution (performance)
Other applications can trigger the job (example: collecting data for reports and dashboards…)
Unified and scalable integration platform (Data Preparation, DI, Cloud integration, ESB, MDM, Big Data, Fabric…)
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
16
(3) Added values: business perspective
Give real value to the data asset (“enable data-driven organizations”)
Support for decisions (“how to use the information obtained?”) and provide them in advance (apply automatically and review rules regularly)
Remove data management risk when modernizing systems
Consolidate applications
Smooth subscription model (start with free open-source tool and then upgrade in a predictable fashion depending on business needs – pay only for the number of developers…)
Optimize processes by keeping comprehensive, relevant and consistent data everywhere.
Deliveries in real-time and analytics prediction!
Big Data native suite of products
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
17
Thank you!
"Orchestrate data with agility and responsiveness" - by Skender Kollcaku
18