2933 West Germantown Pike, Building 2, Suite 204, Fairview Village, PA 19409. Toll Free: 800 618-0836. Fax: 610 666-1006. Email: [email protected]. June 2011. Data Integration Study and Results: ETL Versus Cloud Based Data Integration. Christopher C. Biddle

IdealNet Data Integration ETL vs Cloud

DESCRIPTION

This presentation reviews existing and upcoming Data Integration technologies

Page 1: IdealNet Data Integration ETL vs Cloud

2933 West Germantown Pike
Building 2, Suite 204
Fairview Village, PA 19409
USA

Toll Free: 800 618-0836
Fax: 610 666-1006
Email: [email protected]

June 2011

Data Integration Study and Results: ETL Versus Cloud Based Data Integration

Christopher C. Biddle

Page 2: IdealNet Data Integration ETL vs Cloud

© 2011 IdealNet, Inc. Page 2

Taking Your Business to the Next Level™

Agenda

• IdealNet Corporate Overview

• Data Integration Study – Background & Drivers

• State of Data Integration Market

• Customer Challenges

• Comparison Categories
  – Business Application
  – Platform Deployment
  – Connectivity to Data Sources
  – Synchronization
  – Transformation
  – Data Movement
  – Test, Development and Operations Environments
  – Data Modeling
  – Data Quality & Data Governance
  – Architecture and Standards

• Conclusions

Page 3: IdealNet Data Integration ETL vs Cloud

IdealNet Corporate Overview

• Ideal Systems, Inc. was formed in 1994 and provided Enterprise Software and consulting

• IdealNet, Inc. was formed in 2002 and absorbed the Ideal Systems, Inc. consulting practice

• Provides expert Business and Technical consulting services to Life Sciences and Financial Institution customers

• Clients include all of the top 10 Tier 1 Pharmaceutical Manufacturers, Leading Financial Institutions including Investment Banks and Commercial Banks, Biotech, Medical Device, Hospital Group Purchasing Organizations (GPOs), and Distributors

• Areas of expertise include Commercial and Regulatory Contracting, Finance, Merger & Acquisition Support, Trading Strategies, Business Intelligence & Analytics, and Sales Force Automation

• Application Development and Customization, Master Data Management and Application Integration

• Many years of experience in implementing, integrating, and upgrading Enterprise Software solutions

• Based on the East Coast in a Philadelphia, PA suburb, with clients throughout the world

Page 4: IdealNet Data Integration ETL vs Cloud

IdealNet Partial Client List

Page 5: IdealNet Data Integration ETL vs Cloud

Background & Drivers

• User Issues
  – Applications are growing & complexity is increasing
  – Bridging these islands is becoming increasingly important
  – Number of “Bridges” required is growing in leaps and bounds – it’s a mess
  – Need for unifying key data across applications such as customer, product, human relations and vendor data
  – On-premise and Cloud are very different
  – Cloud is coming to large enterprise (with the right security) as Salesforce® leads the
  – Compliance is always an issue with any public company – ETL makes it hard
  – Strong ROI associated with the results – so users go forward

Page 6: IdealNet Data Integration ETL vs Cloud

Data Integration Technology Curve

Figure: technology advances over time (axis marks: 1985, 2000, 2010), with the annotation “Dominant Technology in Use Today”.

• 1st Generation ETL
  – Relational
  – Manual Discovery
  – “A” to “B” Pipe
  – 4:1
• 2nd Generation ETL with GUI
  – Wire Diagrams
  – Relational
  – Manual Discovery
  – Simple Connectors
  – “A” to “B” Pipe(s)
  – MDM Hub
  – 4:1
  – Some Cloud Apps
• 3rd Generation Persistent Metadata Server – Hub & Spoke
  – Data Virtualization Technology
  – Automated Discovery
  – Automated Operations
  – Relational, Object, XML, NoSQL
  – No Programming
  – Virtual MDM™
  – 2:1 to perhaps 1:1
  – Full Cloud & On-Premise

Page 7: IdealNet Data Integration ETL vs Cloud

Terminology

• Terminology Varies Considerably
  – ETL (extract/transform/load) – this is the most consistent term
  – Data Integration, EAI (transaction level), EII (failed in the market - extinct), ESB
  – Data Virtualization (instance), Advanced Data Virtualization (persistent metadata server)
  – Data Alignment, Data Synchronization, Data Harmonization
  – Master Data Management, Virtual Master Data Management
  – Cloud: IaaS (integration as a service), IaaS (information as a service), iPaaS (integration platform as a service)

Page 8: IdealNet Data Integration ETL vs Cloud

Market Size - Billions of Dollars

• IDC® - Data Integration and Access Software 2009
  – Published December 2010
  – 2010 Projected - $3.450 billion
  – Overall 7.8% CAGR through 2014, reaching total market size of $4.759 billion
• Gartner®
  – Magic Quadrant for Data Integration Tools
  – Published November 2010
  – 2010 Projected - $1.431 billion (up 6%)
  – Overall 9.4% CAGR through 2014, reaching total market size of $2.000 billion
• Forrester®
  – Information As A Service - Q1 2010
  – Published February 2010
  – “The current IaaS market size is $3.3 billion” “and is likely to grow to more than $6.7 billion by 2012, a 40% annual growth rate”
• MDM market tracked separately by most
  – $1–2 billion perhaps – this market suffered a major setback over last 24 months
  – Virtual MDM™ is coming
• Gartner® DataQuest – April 30 Report on IT Services
  – $73 billion in IT services dollars spent worldwide in 2009 … how much is data integration?

Page 9: IdealNet Data Integration ETL vs Cloud

Market Metrics and Trends

• Gartner®
  – “35% of all large and midsize organizations worldwide will be using one or more iPaaS (cloud based integration Platform as a Service) offerings in some form by 2016.” – 2011
• Enterprise Strategy Group®
  – “Data integration platforms must be able to integrate various types of data in the cloud, along with data on-premises and in remote locations.” – June 2011
• Forrester®
  – “Integration by ETL creates data quality problems and delays information delivery.” – June 15, 2011

Page 10: IdealNet Data Integration ETL vs Cloud

Business Application

• Key Parameters
  – Data Integration – regular movement of data
  – Data Migration – one time movement of data
  – Business Intelligence - loading the BI vendor “domain” – Data Warehouse, Data Mart
  – Master Data Management – complex and difficult movement of data
  – Virtual Master Data Management – new version of MDM based upon advanced data virtualization technology
• Cautions
  – “One shoe does not fit all”
  – Cloud, Object Applications in the Cloud don’t fit well with ETL based technologies
• Best Practice
  – Get a metadata discovery tool (note the free Queplix offering amongst others)
  – Before selecting a technology, define clearly the data sources, the amount of data to be moved, the keys that will help make this work, the frequency of update, and the exact minimum of information to be moved/synchronized/federated

Page 11: IdealNet Data Integration ETL vs Cloud

Platform Deployment

• Key Parameters
  – On-premise
  – Cloud (Public and/or Private)
  – Software as a service (SaaS)
• Cautions
  – ETL tools are not designed for cloud models or software as a service
  – This includes large enterprise private cloud - problems are not limited to public cloud
  – No basic security mechanisms in ETL for cloud deployment
  – Technologies such as QueCloud support VPN integration and optionally a special security module that implements an enterprise model for cloud to enterprise connectivity
• Best Practice
  – Understand the full potential for architectural expansion up front
  – ETL will not adapt to expanded horizons and this will create large problems for you later

Page 12: IdealNet Data Integration ETL vs Cloud

Connectivity to Data Sources

• Key Parameters
  – Direct database connectivity (insert/update/delete)
  – Application program interface (access only through API)
  – Flat file (EDI), Standards
• Cautions
  – Programming to proprietary vendor APIs on legacy software such as SAP®, Siebel®, PeopleSoft® requires extensive application specific skills and may substantially increase the cost of a project
  – Stick with SOA
  – Custom fields – how will you know they exist? How will your vendor handle this?
  – Cloud based applications are object oriented – not relational – find a toolset that works well with objects – not just relational tables
• Best Practices
  – Use a metadata discovery tool (2nd time)
  – Use intelligent interfaces that eliminate the need to know a vendor API (Application Software Blades™ - Queplix)
  – No vendor has every interface or ever will – understand their strategy to add your application and ask them to do it!
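The custom-field caution above can be made concrete with a small sketch. It assumes a SQLite source purely for illustration and reads the database's own catalog to list columns that the integration mapping does not know about. The table, the column names and the helper functions are hypothetical; this is not the Queplix tool or any vendor's API.

```python
import sqlite3


def discover_columns(conn, table):
    """Return {column_name: declared_type} read from SQLite's catalog.

    The table name is interpolated directly, so it must come from a
    trusted list of tables, never from user input.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return {row[1]: row[2] for row in rows}


def find_custom_fields(conn, table, expected):
    """List columns present in the source but absent from the mapping."""
    return sorted(set(discover_columns(conn, table)) - set(expected))


# Hypothetical source table with one field the mapping does not cover.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, region_code TEXT)"
)
custom = find_custom_fields(conn, "customer", expected={"id", "name"})
print(custom)  # ['region_code']
```

Running a check like this before each load is one way to learn that a custom field exists before it silently drops out of the integration.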

Page 13: IdealNet Data Integration ETL vs Cloud

Synchronization

• Key Parameters
  – Batch
  – Real-time or Near Real-time
  – Other
• Cautions
  – Real-time may mean an ACID transaction – you are in a transaction flow and this is a very different problem than customer data alignment – this is really enterprise application integration, as this sort of integration requires application level changes, not just data integration at the database level
  – ETL often requires you set triggers in a database – database administrators don’t like this
• Best Practice
  – Update (synchronize or harmonize) when a “record of truth” changes and then only update the fields that need updating – you don’t need to update an entire “row” if you choose the right technology set
  – Understand strategies for real-time and near real-time that don’t require explicit database invasive triggers
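The field-level update described above can be sketched as follows. Records are assumed to be plain dictionaries keyed by field name, and the function names and sample data are hypothetical; a real product would apply the resulting delta through its own connectors.

```python
_MISSING = object()  # sentinel so that a stored None still compares correctly


def changed_fields(truth, target):
    """Return only the fields whose value in the record of truth
    differs from (or is absent in) the target record."""
    return {k: v for k, v in truth.items() if target.get(k, _MISSING) != v}


def synchronize(truth, target):
    """Apply just the changed fields to the target and return that delta."""
    delta = changed_fields(truth, target)
    target.update(delta)
    return delta


# Hypothetical customer record held in two applications.
truth = {"id": 7, "name": "Acme Corp", "phone": "555-0100"}
target = {"id": 7, "name": "Acme", "phone": "555-0100"}

delta = synchronize(truth, target)
print(delta)  # {'name': 'Acme Corp'}
```

Only the `name` field is written; the rest of the row is left untouched, which is the point of the best practice.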

Page 14: IdealNet Data Integration ETL vs Cloud

Transformation

• Key Parameters
  – String, Math, Boolean, Fuzzy Logic
  – Complex, Summarization, Statistical
  – Custom Transforms
  – 1st Generation: Programming and SQL
  – 2nd Generation: GUI Programming and SQL
  – 3rd Generation: Automation, Check Box and Configuration, Excel®-like formula builder
• Cautions
  – 1st and 2nd Generation products don’t mix well with business users – expect to be programming and working with lots of details
• Best Practice
  – Find a 3rd Generation product – this should work like Excel® - transforms are just simple formulas
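To illustrate the idea of transforms as simple formulas, here is a minimal sketch in which each target field is defined by one short expression over the source record, much like a spreadsheet cell. The field names and the `apply_transforms` helper are assumptions made for the example, not any product's formula builder.

```python
# One formula per target field; each takes the source record `r`.
TRANSFORMS = {
    "full_name": lambda r: f"{r['first_name']} {r['last_name']}".strip(),
    "country": lambda r: r["country"].upper(),
    "net_price": lambda r: round(r["list_price"] * (1 - r["discount"]), 2),
}


def apply_transforms(record, transforms):
    """Build the target record by evaluating every formula on the source."""
    return {field: formula(record) for field, formula in transforms.items()}


# Hypothetical source record.
source = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "country": "gb",
    "list_price": 200.0,
    "discount": 0.15,
}

result = apply_transforms(source, TRANSFORMS)
print(result)
```

Keeping the formulas in a table rather than in procedural code is what lets a configuration screen, rather than a programmer, own them.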

Page 15: IdealNet Data Integration ETL vs Cloud

Data Movement

• Key Parameters
  – Bulk Data Movement
  – Data federation for BI deployment or equivalent
  – Record level synchronization
• Cautions
  – Map the problem to the technology solutions – once again fit is key
• Best Practice
  – ETL is the workhorse file mover for batch windows
  – Record level manipulation for data integration is best achieved by Advanced Data Virtualization based products

Page 16: IdealNet Data Integration ETL vs Cloud

Test, Development and Operations

• Key Parameters
  – Target User, Dashboard, Workflow
  – Security, Data Quality
  – Data dictionary – metadata repository
  – Shared library of transforms
  – Test, Development and Production environment
  – Reporting, Analytics, Graphics
  – Disaster recovery
• Cautions
  – How will your vendor support flow from test & development into production operations? Automated back-up?
  – Integration with a software instance is significantly more limiting than a server based product with persistence
  – Hub and spoke architectures require LDAP support and more – ETL doesn’t do this
• Best Practice
  – Understand the difference between test, development and production operations environments - SMB customers need to understand this better

Page 17: IdealNet Data Integration ETL vs Cloud

Data Modeling

• Key Parameters
  – Connection to data sources
  – Discovery of metadata structure in data sources
  – Representation of metadata structures in data sources
  – Automatic update of metadata in data sources
  – Semantic discovery support
  – Search of metadata across multiple sources
  – Direct access to underlying data – in a useful and navigable format – from metadata
  – Ability to model all data sources
  – Virtual structures in metadata to facilitate mapping
  – Lineage of metadata and Metadata export
• Cautions
  – This should all be integrated – metadata discovery, data dictionary, semantic discovery, data catalog access, data integration transforms
• Best Practice
  – Get a metadata discovery tool (3rd time!)

Page 18: IdealNet Data Integration ETL vs Cloud

Data Quality and Data Governance

• Key Parameters
  – Basic data quality deeply integrated with data integration process
    • Development and source clean-up
    • Production environment – ongoing automation?
    • The Cloud ***MUST*** support data quality – or you don’t have a solution
  – Data governance – implementation by business rule
• Cautions
  – Data quality problems derail business intelligence and data integration all the time – don’t let it happen to you
  – Some 1st and 2nd Generation and most 3rd Generation products integrate data quality – separate products, except for the largest organizations, probably don’t make sense
  – Gartner® has noted the convergence of data quality with data integration and other toolsets
• Best Practice
  – Data integration proposals that do not completely address data cleansing, data quality and associated data governance guidelines won’t produce the results you expect (timeframe, quality, return on investment)
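One way to picture data quality that is integrated with the integration process, with governance implemented by business rule, is a quality gate that every record passes before it is loaded. The sketch below is an illustration under assumed rules and field names; it is not taken from any product discussed in this study.

```python
import re

# Hypothetical business rules: one predicate per governed field.
RULES = {
    "customer_id": lambda v: isinstance(v, int) and v > 0,
    "email": lambda v: isinstance(v, str)
    and re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "country": lambda v: isinstance(v, str) and len(v) == 2,
}


def validate(record, rules):
    """Return the names of fields that are missing or fail their rule."""
    return [f for f, rule in rules.items() if f not in record or not rule(record[f])]


def load_with_quality_gate(records, rules):
    """Split records into those safe to load and those needing clean-up."""
    accepted, rejected = [], []
    for rec in records:
        failures = validate(rec, rules)
        if failures:
            rejected.append((rec, failures))
        else:
            accepted.append(rec)
    return accepted, rejected


records = [
    {"customer_id": 1, "email": "ann@example.com", "country": "US"},
    {"customer_id": 2, "email": "not-an-email", "country": "US"},
    {"customer_id": 3, "country": "USA"},
]
accepted, rejected = load_with_quality_gate(records, RULES)
print(len(accepted), [failures for _, failures in rejected])
```

The rejected list, with the failing field names attached, is what a source clean-up effort or an ongoing production report would work from.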

Page 19: IdealNet Data Integration ETL vs Cloud

Architecture and Standards

• Key Parameters
  – Standalone
  – Networked
  – Scale from simple integration to virtual master data management
  – SOA, JDBC, ODBC-.Net, Web Services
  – Other
• Cautions
  – ETL links are standalone “A to B” instances of connectivity – as these multiply how will your vendor “manage” all of this?
  – Standards are standards – if SOA costs more, that’s a big red flag
• Best Practice
  – Use standards
  – Embrace enterprise strategies for data integration “management” if it applies to you
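The concern about “A to B” links multiplying can be quantified with a short calculation. It assumes one link per pair of applications in the point-to-point case and one link per application to a central hub in the hub and spoke case; directional pipes would double the first figure. The function names are illustrative.

```python
def point_to_point_links(n):
    """Links needed if every pair of n applications is bridged directly."""
    return n * (n - 1) // 2


def hub_and_spoke_links(n):
    """Links needed if each of n applications connects once to a hub."""
    return n


for n in (2, 4, 10, 20):
    print(n, point_to_point_links(n), hub_and_spoke_links(n))
# 2 1 2
# 4 6 4
# 10 45 10
# 20 190 20
```

Pairwise links grow roughly with the square of the number of applications, while hub connections grow linearly, so the gap widens quickly past a handful of systems.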

Page 20: IdealNet Data Integration ETL vs Cloud

Conclusions

• Established ETL Vendors
  – Informatica® is the leader of the pack for ETL – see release 9.1
  – Clearly wins and defines ETL for batch oriented bulk file transfer
  – What does Hadoop® file level integration support actually deliver? Missing Hadoop® resident database support so there is not much today.
  – Talend® and Pentaho® are the open source leaders with associated ETL and BI strategies – these are excellent choices.
  – ETL doesn’t really scale – hubs for MDM are all manually programmed, expensive to set up and error prone – each connection is essentially standalone
• Data Virtualization Based Data Integration
  – Queplix technology is a solid 3rd Generation product which merits your review – they are clearly the leader in data virtualization today
  – QueCloud (or the on-premise product, Virtual Data Manager™) integrates 2, 3, 4 or more sources as uniquely enabled by advanced data virtualization – this reduces risk, lowers cost and provides much more capability
  – Lower costs by 50% or more and increase savings as you add more application integrations
  – Data virtualization is the future of data integration – persistent metadata servers do things that ETL technology cannot do – learn more about it

Page 21: IdealNet Data Integration ETL vs Cloud

Sponsored Message From Queplix

• Sponsored Message (Queplix, Inc.)
  – Please go to www.queplix.com and download the Free Metadata Discovery Tool or email [email protected] and they will set you up for free
  – Queplix will email out a copy of my report or follow this link to obtain a copy directly from my website: http://www.idealnetinc.com/IdealNet_Analysis_of_Data_Integration_Technologies.pdf
  – Queplix is running promotions for free QueCloud implementation turnkey – essentially all setup and 1 year free – please contact [email protected] for the terms and conditions or see www.netsuite.com for more about that promotion in the partner section
  – Queplix has free video, without registration, accessible from their home page and from their collateral page (under “About”)
  – Queplix has other video and white papers in a registration section
• This Presentation
  – You can request a copy via email from me [email protected] and my team will follow up promptly

Page 22: IdealNet Data Integration ETL vs Cloud

2933 West Germantown Pike
Building 2, Suite 204
Fairview Village, PA 19409
USA

Toll Free: 800 618-0836
Fax: 610 666-1006
Email: [email protected]

Contact Christopher C. Biddle at [email protected]

Thank You

Visit us online at www.idealnetinc.com

Page 26: IdealNet Data Integration ETL vs Cloud

Business Application & Platforms