Data Virtualization leads the CIO agenda
Dr. Anastasio Molano, SVP
Denodo Strategic Technical Office
Agenda
1. Data Virtualization Overview
2. Data Virtualization Use Cases
   1. Logical Data Warehouse
   2. Big Data Lakes
   3. Single Views
   4. Digital Transformation
3. Customer Success Stories
Data Virtualization
The Business Need: Ready Access to Critical Information to Support Business Processes
Business users: Marketing, Sales, Executive, Support
Business entities: Customers, Invoices, Products, Service Usage
Access to complete information: business entities and pre-integrated views
Access to related information: discovery and self service
Access in real-time from different apps and devices
The Challenge: Data Is Siloed Across Disparate Systems
Information spread across different systems
IT responds with point-to-point data integration
Takes too long to get answers to business users
Business users: Marketing, Sales, Executive, Support
Data sources: Database, Apps, Warehouse, Cloud, Big Data, Documents, NoSQL
“Data bottlenecks create business bottlenecks.”
– Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester Research, Dec 16, 2015
The Solution: Data Abstraction Layer
Abstracts access to disparate data sources
Acts as a single repository (virtual)
Makes data available in real-time to consumers
DATA ABSTRACTION LAYER
“Enterprise architects must revise their data architecture to meet the demand for fast data.”
– Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester Research, Dec 16, 2015
Data Virtualization – 3 Simple Steps
Data virtualization combines disparate data sources into a single “virtual” data abstraction layer (aka information fabric) that provides unified access and integrated data services to consuming applications in real-time (right-time).
1. Connect & Virtualize
2. Combine & Integrate
3. Publish as a Data Service
Base View (Local Database): Name | Description | Price
Base View (XML Documents): Name | Description | Price
Base View: Name | Manufacturer | Score
Derived View (union of the first two base views, joined with the third): Name | Description | Manufacturer | Price | Score
Data Service (SQL, SOAP, REST, Widgets…)
Consumers: Application, Browser (SQL Query in, Results out)
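The three steps can be sketched in a few lines of Python. This is only an illustration that uses an in-memory SQLite database as a stand-in for the virtual layer; it is not how the Denodo Platform is implemented. The table names, the view name `product_catalog`, the helper `query_catalog`, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical stand-ins for the three sources on the slide; in a real
# deployment each base view would wrap a different external system.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE local_db_products (name TEXT, description TEXT, price REAL);
    CREATE TABLE xml_products      (name TEXT, description TEXT, price REAL);
    CREATE TABLE product_scores    (name TEXT, manufacturer TEXT, score REAL);

    INSERT INTO local_db_products VALUES ('Widget', 'Small widget', 9.5);
    INSERT INTO xml_products      VALUES ('Gadget', 'Handy gadget', 24.0);
    INSERT INTO product_scores    VALUES ('Widget', 'Acme', 4.2),
                                         ('Gadget', 'Globex', 3.8);

    -- Steps 1 and 2: union the two product sources, then join the scores.
    CREATE VIEW product_catalog AS
    SELECT p.name, p.description, s.manufacturer, p.price, s.score
    FROM (
        SELECT name, description, price FROM local_db_products
        UNION ALL
        SELECT name, description, price FROM xml_products
    ) AS p
    JOIN product_scores AS s ON s.name = p.name;
""")


def query_catalog(sql: str) -> list[tuple]:
    """Step 3: consumers query the derived view as if it were one table."""
    return conn.execute(sql).fetchall()


rows = query_catalog("SELECT * FROM product_catalog ORDER BY name")
for row in rows:
    print(row)
```

The consumer never sees which source a row came from; it only sees the five columns of the derived view.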
Benefits of Data Virtualization: “Get it Real-time and Get it Fast!”
Better Data Integration
Lower integration costs by 80%.
Flexibility to change.
Real-time (on-demand) data services.
Complete Information
Focus on business information needs.
Include web / cloud, big data, unstructured, streaming.
Bigger volumes, richer/easier access to data.
Better Business Outcome
Projects in 4-6 weeks.
ROI in <6 months.
Adds new IT and business capabilities
“Benefits of Data Virtualization: get it real-time and get it fast!” – William McKnight, President, McKnight Consulting Group
Data Virtualization: Real-time Data Integration
“Data virtualization integrates disparate data sources in real time or near-real time to meet demands for analytics and transactional data.”
– Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester Research, Dec 16, 2015
1. Connects to disparate data sources
2. Combines related data into views
3. Publishes the data to applications
Data Virtualization Use Cases
Enterprise DV @ Confluence of Ecosystems
Key Enabler to Large Projects in an Increasingly Data-Driven World
Logical Data Warehouse for agile BI
Big Data Lakes for advanced analytics
Digital Transformation for legacy modernization & data marketplaces
Data Virtualization
“You Need A Data Virtualization Strategy To Succeed; Without One, You Risk Falling Behind. Without data virtualization, you risk knowing less about your customers. You’ll get fewer real-time business insights, lose your competitive advantage, and spend more to address data challenges. Firms that invest in data virtualization technologies will respond more quickly, deliver more and better products, and grow faster than their competitors.”
Forrester: Information Fabric 3.0
Single Views for Operational Efficiency
Common Data Virtualization Use Cases
BIG DATA LAKES
Advanced Analytics
Data Warehouse Offloading
Big Data for Enterprise
Cloud / SaaS Integration
LOGICAL DATA WAREHOUSE
BI Semantic Layers
Virtual Data Marts
Self-Service BI
Operational BI / Analytics
SINGLE VIEW APPLICATIONS
Single Customer View - Call Centers, Portals
Single Product View - Catalogs
Single Inventory View - Inventory Reconciliation
Vertical Specific - Single View of Wells
DIGITAL TRANSFORMATION
Unified Data Services Layer
Logical Data Abstraction
Agile Application Development
Linked Data Services
Logical Data Warehouse: From the EDW to the Logical Data Warehouse
The Logical Data Warehouse: Data Virtualization as Data Integration/Semantic Layer
Analytics/BI
Data Virtualization
EDW ODS
• Initially identified by Gartner as a Best Practice in 2012
• Move data integration and semantic layer to independent Data Virtualization platform
• Purpose built for supporting data access across multiple heterogeneous data sources
• Separate layer provides semantic models for underlying data
• Physical to logical mapping
• Speeds up development on big data sources
• Offers an evolutionary adoption of big data
• Enforces common and consistent security and governance policies
• Makes big data available to everyone – higher big data ROI
Data Warehouse Architectural Options
Source: Mark Beyer at the Gartner BI & Analytics Summit, London, March 2016
The Next Generation of EDW: Data Virtualization is Key to Integrating, Abstracting and Delivering Value to Users
Source: Forrester, Information Fabric 3.0 Report
The Logical Data Warehouse: Integrated View of a Plurality of Systems: Hadoop, EDW, Streaming, In-memory, ...
How can I abstract consuming applications from technology change and requirements evolution?
How can I enforce consistent security and governance policies across all data sources?
How can I combine data from several systems ensuring good performance?
Architecture of the Logical Data Warehouse
[Figure: layered architecture of the Logical Data Warehouse.
Consumers: Real-Time Decision Management, Alerts, Scorecards, Dashboards, Reporting, Data Discovery, Self-Service, Search, Predictive Analytics, Statistical Analytics (R), Text Analytics, Data Mining.
Data Virtualization layer: Real-Time Data Access (On-Demand / Streaming), Data Caching, Data Services, Data Search & Discovery, Governance, Security, Optimization, Data Abstraction, Data Transformation, Data Federation.
Data stores: Data Warehouse, EDW, ODS, In-Memory (SAP Hana, …), Analytical Appliances, Cloud DW (Redshift, ...), NoSQL.
Hadoop: YARN / Workload Management, HDFS, Hive, Spark, Drill, Impala, Storm, HBase, Solr, Hunk, Tez, MapRed.; workloads: Batch, SQL, DW, Streams, NoSQL, Search.
Ingestion: ETL, CDC, Sqoop, (Flume, Kafka, …).
Sources: Big Data (Sensor Data, Machine Data (Logs), Social Data, Clickstream Data, Internet Data, Image and Video, Enterprise Content (Unstructured)); Traditional Enterprise Data (Enterprise Applications); Cloud (Cloud Applications).]
Performance Comparison: Access through Data Virtualization versus Direct Access to Data Source
TPC-DS benchmark tests using JDBC, with IBM Netezza as the data source, over a 10 Gbps LAN. Results in seconds.
When queries hit only a single source, the data virtualization layer pushes the processing completely down to the source, with minimal overhead.
Note: since data needs to flow through the DV layer, the network between the sources and the DV layer should have enough bandwidth to avoid network bottlenecks.
Performance Comparison: Logical Data Warehouse vs. Physical Data Warehouse
Denodo DV query across 3 data sources vs. ETL all data into Netezza and run the full query there
Sales Facts: 290 M rows
Items Dim.: 400 K rows
Customer Dim.: 2 M rows
Query Description | Returned Rows | Avg. Time Physical (Netezza) | Avg. Time Logical (Denodo) | Optimization Technique (automatically chosen)
Total sales by customer | 1.99 M | 21.0 sec | 21.5 sec | Full aggregation push-down
Total sales by customer and year between 2000 and 2004 | 5.51 M | 52.3 sec | 59.1 sec | Full aggregation push-down
Total sales by item brand | 31.4 K | 4.7 sec | 5.3 sec | Partial aggregation push-down
Total sales by item where sale price less than current list price | 17.1 K | 3.5 sec | 5.2 sec | On the fly data movement
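To read the logical-versus-physical timings as relative overhead, the arithmetic can be sketched as follows. The numbers are copied from the comparison above; the helper name `overhead_pct` is hypothetical, and the derived percentages are a reading aid rather than figures from the original benchmark.

```python
# (query, physical seconds, logical seconds) copied from the comparison above.
timings = [
    ("Total sales by customer", 21.0, 21.5),
    ("Total sales by customer and year between 2000 and 2004", 52.3, 59.1),
    ("Total sales by item brand", 4.7, 5.3),
    ("Total sales by item where sale price < current list price", 3.5, 5.2),
]


def overhead_pct(physical: float, logical: float) -> float:
    """Relative slowdown of the logical run versus the physical run, in percent."""
    return (logical - physical) / physical * 100.0


overheads = {q: round(overhead_pct(p, l), 1) for q, p, l in timings}
for query, pct in overheads.items():
    print(f"{query}: +{pct}%")
```

Under this reading, the two full push-down queries and the partial push-down query stay within roughly 13% of the physical warehouse, while the query that requires on-the-fly data movement shows the largest relative gap.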
Performance Comparison: Logical Data Warehouse vs. Data Blending in a BI Tool
SELECT c.id, SUM(s.amount) as total
FROM customer c JOIN sales s
ON c.id = s.customer_id
GROUP BY c.id
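System | Execution Time | Data Transferred | Optimization Technique (automatically selected)
Denodo | 9 sec. | 4 M | Aggregation push-down
Tableau | 125 sec. | 292 M | None: full scan
[Figure: two query plans. Without push-down, Sales (290 M rows) and Customer (2 M rows) are transferred in full, joined, and then grouped. With aggregation push-down, Sales is grouped at the source down to 2 M rows and then joined with Customer (2 M rows).]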
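The effect of aggregation push-down on the number of rows transferred can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the Denodo optimizer: the data, the function names, and the row-counting convention are hypothetical, and the rewrite is only valid because `customer.id` is assumed to be unique.

```python
from collections import defaultdict

# Toy data standing in for the two sources (hypothetical values).
customers = [{"id": 1}, {"id": 2}, {"id": 3}]
sales = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 5.0},
    {"customer_id": 2, "amount": 7.5},
    {"customer_id": 3, "amount": 1.0},
    {"customer_id": 3, "amount": 2.0},
    {"customer_id": 3, "amount": 3.0},
]


def blend_without_pushdown():
    """Pull every row from both sources, then join and group locally."""
    transferred = len(sales) + len(customers)
    customer_ids = {c["id"] for c in customers}
    totals = defaultdict(float)
    for s in sales:
        if s["customer_id"] in customer_ids:
            totals[s["customer_id"]] += s["amount"]
    return dict(totals), transferred


def blend_with_pushdown():
    """Let the sales source group by customer_id first, then join locally."""
    # This loop stands in for a GROUP BY executed inside the sales source,
    # so only one row per customer crosses the network.
    partial = defaultdict(float)
    for s in sales:
        partial[s["customer_id"]] += s["amount"]
    transferred = len(partial) + len(customers)
    customer_ids = {c["id"] for c in customers}
    totals = {cid: total for cid, total in partial.items() if cid in customer_ids}
    return totals, transferred


naive_totals, naive_rows = blend_without_pushdown()
pushed_totals, pushed_rows = blend_with_pushdown()
print(naive_totals == pushed_totals, naive_rows, pushed_rows)
```

Both strategies return the same totals; only the amount of data moved differs, which is the same pattern as the 292 M versus 4 M rows reported above.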
Data Virtualization Scalability
SQL Cluster: Denodo1:9999, Denodo2:9999, Denodo3:9999, Denodo4:9999
Web Container Cluster: Denodo1:9090, Denodo2:9090, Denodo3:9090, Denodo4:9090
Virtual Server: SQL Cluster 192.168.0.10:9999; Web Container Cluster 192.168.0.10:9090
Load Balancer, Shared Cache Server
Denodo can be deployed in a cluster for HA and horizontal scaling
“Shared-nothing” execution engine ensures linear scalability
Based on the use of an external load balancer
Supports auto-scaling for cloud deployments (like AWS)
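As a rough illustration of what the external load balancer does in front of the SQL cluster, the sketch below rotates incoming connections across the four nodes. The node addresses come from the slide; the round-robin policy and the function name `round_robin` are assumptions, since the slide does not say which balancing policy is used.

```python
from itertools import cycle

# Node addresses as listed on the slide; the balancing policy is an assumption.
SQL_NODES = ["Denodo1:9999", "Denodo2:9999", "Denodo3:9999", "Denodo4:9999"]


def round_robin(nodes):
    """Return an endless iterator that hands out nodes in a fixed rotation."""
    return cycle(nodes)


balancer = round_robin(SQL_NODES)
assignments = [next(balancer) for _ in range(6)]
print(assignments)
```

Because the nodes share nothing except the cache server, adding a fifth node in this model only means appending one more address to the list.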
T-Mobile: Integrated Data Warehouse
Vision
“The Integrated Data Warehouse (IDW) is a scalable BI platform that can adapt to the speed of the business by providing relevant, accessible, timely, connected, and accurate data.”
Autodesk: Big Data, Self-Service Analytics, Logical Data Warehouse
Benefits
Successfully transitioned to subscription-based licensing
Uniform data environment for access
Transition out of physical data warehouse in future
Big Data Lakes: Semantic Layer, Data Governance, Security for Data Lakes
Big Data Lakes
Source: Mark Beyer at the Gartner BI & Analytics Summit, London, March 2016
Big Data Lakes
[Figure: data lake architecture, from Raw Data to Useful Information, across three stages: Acquisition, Management, Delivery.
Components: Systems of Record (Oracle, SQL, Exadata, Other), Data Aggregators, HCO, XYZ, EDI, Posts, Data Gateway, Landing Zone, Data Lake, Data Warehouse, Discovery Zone, DM, Future Capabilities, Business Access Layer, Data Access Layer, Applications, Reports, Dashboards, Queries.
Roles: Data Owners, Technical Support; Data Stewards, Data SMEs, Data Scientists, Analysts, Advisors, Data QA; Advisors, Analysts, Members, Collaboratives.]
Caterpillar: Self-service / Predictive Analytics – IoT Integration
Benefits
Improved asset performance and proactive maintenance
Increased revenue from sale of services and parts
Reduced warranty costs of parts failure
Dealer
Maintenance
Parts Inventory
OSI PI Hadoop Cluster
Tableau: Dealer / Customer Dashboard
Big Data Lake and Data Virtualization: Agile Data Access to Traditional Data, Data Lake, External Data Sources
Data Virtualization
Simplified data access
Less training for business users
Faster data discovery
Augmented discovery process (adding new sources)
Data Lake approach on Hadoop
Simplifies data management
Reduces data costs
Scalable
Flexibility
“The Denodo Platform will provide 350% ROI over 5 years and break even within 1.5 years of our initial project and will continue to deliver additional savings every year in our data lake project.”
– Chuck DeVries, VP Architecture and Development, Vizient
Single Views: Single View of Customer, Product, etc., Virtual MDM
Data Virtualization Drives Agility Across EA
Agile BI & Agile Apps
Data Services
Any Data, Anywhere
Integrate New Sources
Optimized Performance
“You Need A Data Virtualization Strategy To Succeed; Without One, You Risk Falling Behind. Without data virtualization, you risk knowing less about your customers. You’ll get fewer real-time business insights, lose your competitive advantage, and spend more to address data challenges. Firms that invest in data virtualization technologies will respond more quickly, deliver more and better products, and grow faster than their competitors.”
Forrester: Information Fabric 3.0 (August 2013)
Single Views for Customer Service: Single View of the Customer for a Unified Customer Service Desk
Benefits
Significant reduction of FCR which led to a 50% reduction of workload in back-office
Used across all business functions: customer claims, customer activation, loyalty programs, upselling, order fulfilment, provisioning, logistics, …
Benefits
Virtual Customer Golden Record
Unified Data Services Layer
Integrates MDM basic data with transactional data
Provides data services to multiple consuming applications: agents, distributors, sales, customer service
Single Customer Source of Truth across business lines
Integrated with ESB through Web Services
[Figure: Customer Golden Record architecture.
Sources: MDM Basic Customer Reference Data (■ Customer Data ■ Keys to other systems); LoB Databases (■ Health, Cars, House, Enterprise, Retirement Funds, etc.); Marketing Database (■ Customer indicators ■ Churn index ■ Segmentation); CRM (Customer contact history); Claims Database (■ Customer claims).
Integration: Customer Golden Record, SOA - ESB.
Consumers: Insurance Agents, Partners / Distributors, Sales, Customer Service.]
Benefits
2020 Information Architecture
ODATA based Enterprise Data Services Layer
Real-time 360º views of business
Faster Business Decision Making
Faster Time-to-Value for Deals, Claims Management and other departments
[Figure: Swiss Re architecture.
Consumers: Excel/MS BI, Tableau, Power BI, Employee Composite Desktop, 360 Views, Swiss Re Cockpit, Other Applications.
Denodo Data Virtualization Layer, exposing OData, SQL, Other.
Sources: CMP DWH (Oracle), Case Hub (OData), OIS (RESTful), SICS (DB2), TM1 (Excel), eBes (SOAP), MDM (Oracle).]
Digital Transformation: Data Service Layers, Enterprise Data Marketplaces
Boeing: Integrated Product Systems Data Across Major Divisional Companies
64 cores
64 cores
84,000 Base Views, 100’s Derived Views
Intel: Data Service Layer for streamlining business processes in the value chain
Issues
“Lack of consistent capability to integrate data from disparate data sources and deliver using agile standardized methods.
• Intel’s data is globally distributed across heterogeneous tools & technologies
• New data sources (ex: big data) & consumers (ex: emergence of SaaS)
• New information exchange channels (ex: mobility)”
Source: Intel EDW 2015
Solution Architecture
“Use of logical model in the Data Virtualization layer as canonical models.”
Source: Intel EDW 2015
Use Case: Supplier Master Data
Supplier Master Data: information about the companies that Intel purchases from, pays, or outsources manufacturing to.
Choosing a Supplier is the point of entry to many business processes. If it fails or is slow, it impacts all 70+ downstream consumers.
Source: Intel EDW 2015
Use Case: MySamples
Need to show the latest status of samples requests.
• Customer information from the MySamples app
• Samples request information (if requested) from ERP system
• Samples shipment status (if shipped) from Event Management system
Source: Intel EDW 2015
Use Case: Cloud CRM Integration
Integrate Customer data from various on premise data sources & expose it as service to Cloud CRM.
Source: Intel EDW 2015
Use Case: Extreme Data Warehouse
Intel: Logical Data Warehouse
Logical Data Warehouse over the EDW and Hadoop.
Sales & marketing analytics hub.
Source: Intel EDW 2015
Data Virtualization
“An agile method that simplifies information access.”
Source: Intel EDW 2015
Benefits
“Reduced time-to-market, increased agility, end-to-end manageability”
Source: Intel EDW 2015
Metrics Detail
Value Driver | Metric | Goal | Actual
Time to Develop | Time to develop web service in days | 50% | 90%
Time to Deploy | Time to deploy web service in days | 50% | 90%
TTM | Overall time it takes to make web service available for use | 60% | 90%
Time to Engage | Time it takes for business to engage with IT | 75% | 75%
Performance | Performance of web services | 50% | 60%
Impact Analysis | How fast can we perform impact analysis | 50% | 90%
Enterprise Architectural Alignment | Ease at which data from disparate sources can be integrated | Security, data classification | High
Summary
“An agile data integration method that simplifies information access
–Agility, time-to-market, Manageability & Reuse
Created enterprise standards to promote understandability, reduce chaos, and help drive consistency”
Source: Intel EDW 2015
Data Virtualization in the Cloud
Hybrid On-premise - Cloud
Data Marketplace and Self-Service
About Denodo
Denodo: The Leader in Data Virtualization
HEADQUARTERS
Palo Alto, CA.
DENODO OFFICES, CUSTOMERS, PARTNERS
Global presence throughout North America, EMEA, APAC, and Latin America.
LEADERSHIP
Longest continuous focus on data virtualization and data services.
Product leadership.
Solutions expertise.
CUSTOMERS
250+ customers, including many F500 and G2000 companies across every major industry, have gained significant business agility and ROI.
THE LEADER IN DATA VIRTUALIZATION
Denodo provides agile, high performance data integration and data abstraction across the broadest range of enterprise, cloud, big data and unstructured data sources, and real-time data services at half the cost of traditional approaches.
Marquee Customers Across Multiple Verticals: 250+ customers across 30+ industries
Public Sector
Financial Services
Telecommunications
Healthcare
Technology
Manufacturing
Insurance
Retail
Pharma / Biotech
Energy
Technology and Service Partners: Partner Network
TECHNOLOGY PARTNERS
SYSTEMS INTEGRATORS & SOLUTIONS CONSULTANTS
Awards and Recognitions: From leading analysts and companies
Categories: Analyst Recognition, Denodo Product Awards, Customer Awards
2015 Magic Quadrant for Data Integration Tools
2015 Leader in Forrester Wave: Enterprise Data Virtualization.
2015 Technology Innovation Award for Information Management
2015 #1 Readers Choice Awards For Data Virtualization Platforms
2015 Rank Companies that Matters Most in Data
2015 Big Data 50 – Companies Driving Innovation
2015 Leadership Award in Big Data, for Denodo Customer Autodesk
Trend-Setting Products in Data and Information Management for 2016
2016 Premier 100 Technology Leader, for Denodo Customer CIT
Award-Winning Data Virtualization Leader
Forrester Wave: Enterprise Data Virtualization, Q1 2015
Denodo is a Recognized Leader in DV
“Customers appreciate Denodo’s easy-to-use, tight integration with non-structured data, simple yet sophisticated data modeling capabilities, search, data life-cycle management, and support for comprehensive lists of data sources including big data, IoT, and cloud sources. In addition, customers often mention lower cost and faster time-to-value are top reasons for selecting Denodo.”
– Analyst Noel Yuhanna, Forrester Research
Source: Forrester Wave: Enterprise Data Virtualization, Q1 2015
Q&A
Thanks!
www.denodo.com [email protected]
© Copyright Denodo Technologies. All rights reserved. Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization of Denodo Technologies.