DATA VIRTUALIZATION & INFORMATION AS A SERVICE (IAAS)
By Anil Allewar, Senior Solutions Architect - Synerzip
About Me!!
Anil Allewar
Senior Solutions Architect @ Synerzip
Technology Evangelist & speaker
Core interests: JEE, EAI, EII
Agenda
• Use cases
• What does it mean?
• Architecture explained
• Implementation Frameworks
• Demo
• Questions?
Why it makes sense?
Use Cases
[Diagram labels: Data Warehouse, Data Mart, Financial Data, OLTP Data, 3rd Party Data, Legacy Data, Excel files, Web Service 1, Web Service 2, Custom Program, ETL (three times)]
Traditional Data Integration
[Diagram labels: Source System (two), ETL (two), Enterprise Information System, Business Applications]
Problems with ETL
More than 1 copy of data for staging
Intermediate data => Errors
Lead time to add new source
Domain knowledge for mapping
Batch Process => No real time data
Problems with DBMS consolidation
Alternate approach => Single EIS (say RDBMS)
Extensive changes to existing apps
Might not satisfy everyone’s requirements
Data Virtualization & Federation
Single API to access data
Only metadata stored at virtualization layer
Real time access without copying/moving data
Federate data across heterogeneous/homogeneous sources
Data Virtualization
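The federation idea can be sketched in a few lines. This is a minimal illustration, not how any product implements it: an in-memory SQLite database stands in for an RDBMS, a list of dicts stands in for a document store, and the names `customer`, `orders` and `customer_orders` are hypothetical.

```python
import sqlite3

# Source 1: a relational store (in-memory SQLite stands in for an RDBMS).
rdbms = sqlite3.connect(":memory:")
rdbms.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
rdbms.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])

# Source 2: a document store (a list of dicts stands in for a NoSQL collection).
orders = [
    {"customer_id": 1, "item": "laptop"},
    {"customer_id": 2, "item": "phone"},
    {"customer_id": 1, "item": "mouse"},
]


def customer_orders():
    """Join both sources at query time; no staging copy is persisted anywhere."""
    # Read the relational side on demand and index it by primary key.
    names = dict(rdbms.execute("SELECT id, name FROM customer"))
    # Combine with the document side, keeping only orders with a known customer.
    return [
        (names[order["customer_id"]], order["item"])
        for order in orders
        if order["customer_id"] in names
    ]


result = customer_orders()
```

The caller sees one function (the "single API") and never learns that two different kinds of store sit behind it.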
Architecture
[Diagram labels: User Application, Common Access API, RUNTIME & QUERY ENGINE, Virtual Database, Translator 1, Translator 2, Connector 1, Connector 2]
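A rough sketch of how the boxes in the architecture diagram could relate to each other. All class and method names here (`Connector`, `Translator`, `QueryEngine`, `register`, `select`) are hypothetical and chosen only to mirror the diagram labels; they are not the API of any product.

```python
class Connector:
    """Hypothetical connector: owns access to one physical source."""

    def __init__(self, rows):
        self.rows = rows  # stands in for a live connection to the source

    def fetch(self):
        return list(self.rows)


class Translator:
    """Hypothetical translator: maps source-specific fields to view columns."""

    def __init__(self, field_map):
        self.field_map = field_map  # view column -> source field

    def to_view(self, row):
        return {view_col: row[src] for view_col, src in self.field_map.items()}


class QueryEngine:
    """Runtime: routes a request for a view to its translators and connectors."""

    def __init__(self):
        # The virtual database holds metadata only: which sources back a view.
        self.virtual_db = {}

    def register(self, view, translator, connector):
        self.virtual_db.setdefault(view, []).append((translator, connector))

    def select(self, view):
        rows = []
        for translator, connector in self.virtual_db[view]:
            rows.extend(translator.to_view(r) for r in connector.fetch())
        return rows


engine = QueryEngine()
engine.register("people", Translator({"name": "full_name"}), Connector([{"full_name": "Asha"}]))
engine.register("people", Translator({"name": "nm"}), Connector([{"nm": "Ravi"}]))
people = engine.select("people")
```

The user application only calls `select` (the common access API); adding a new source means registering another translator and connector pair, without touching the caller.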
Vendors
Commercial Products
• Composite Software http://www.compositesw.com/data-virtualization/
• Denodo http://www.denodo.com/en/product/overview.php?n=h
• IBM http://www-03.ibm.com/software/products/en/ibminfofedeserv
• Informatica http://www.informatica.com/us/data-virtualization/
• Red Hat http://www.redhat.com/products/jbossenterprisemiddleware/data-virtualization/
Open Source
• JBoss Teiid http://teiid.jboss.org/
Selected Platform – JBoss Teiid
Open Source
Number of relational/NoSQL/ERP/CRM data stores
JEE standards
Add custom EIS support using JEE components
Active & responsive community
Synerzip contribution: Defect discovery, root cause analysis, feature verification
Teiid Components
Virtual Database
• Container for components used to integrate data from multiple data sources
Source Models
• Structure and characteristics of physical data sources
View Models
• Structure and characteristics of abstract structures you want to expose to your applications
Teiid Designer
• Eclipse based UI to dynamically discover data source objects and apply data federation
• Generate virtual database from 1 or more sources
Teiid Components
Translator
• Provides abstraction layer between Teiid Query Engine and source system
• Converts Teiid SQL commands to source specific execution commands
• Converts result data from source system to Teiid specific format
Resource Adapter
• Provides connectivity to the physical data source
• Integration provided through Java Connector Architecture (JCA) API
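The translator's two jobs (command conversion and result conversion) can be illustrated with a small sketch. This is not Teiid's translator API; the functions `to_sql`, `to_document_query` and `from_document` are hypothetical, and table and column names are assumed to come from trusted metadata rather than user input.

```python
def to_sql(table, columns, where):
    """Translate a neutral query into a parameterized SQL string plus arguments."""
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in where)
    return sql, list(where.values())


def to_document_query(table, columns, where):
    """Translate the same query into a (collection, filter, projection) triple."""
    return table, dict(where), {col: 1 for col in columns}


def from_document(doc, columns):
    """Convert a source document into a common row format (a tuple)."""
    return tuple(doc.get(col) for col in columns)


sql_command = to_sql("customer", ["id", "name"], {"id": 1})
doc_command = to_document_query("orders", ["item"], {"customer_id": 1})
row = from_document({"_id": 7, "item": "laptop"}, ["item"])
```

The same neutral query yields a SQL string for a relational source and a filter document for a document store, while `from_document` brings results back to one row shape for the engine.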
Teiid – Supported EIS
• Amazon SimpleDB
• Apache Accumulo
• Apache SOLR
• Cassandra
• File
• Google Spreadsheet
• JPA
• LDAP
• Excel – as file
• SalesForce
• JDBC: MS access, DB2, derby, excel-odbc, greenplum, h2, hive (for accessing Hadoop), oracle, teradata and most RDBMS
• MongoDB
• Object
• OData
• OLAP
• Web Services
• SAP Netweaver Gateway
Performance Characteristics
Access same data using Oracle and Teiid drivers
Retrieval times comparable when accessing tables having no Blobs
[Chart: No. of rows Vs Time: No Blobs; axes: No. of rows, ms; series: Oracle-JDBC, Teiid-JDBC]
Performance Characteristics
Teiid slower when accessing Blob data
Can be tuned
[Chart: No. of rows Vs Time: Blobs; axes: No. of rows, ms; series: Oracle-JDBC, Teiid-JDBC; values recovered from the chart: 0, 0, 2, 42, 21,804, 32,531, 185,454]
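A comparison like the one charted above needs a small timing harness that runs the same query through each driver. The sketch below is an assumption about how such a measurement could be set up, not the harness used for the slides: `time_fetch` and `make_stand_in` are hypothetical helpers, and an in-memory SQLite database stands in for both the Oracle and Teiid connections.

```python
import sqlite3
import time


def time_fetch(connection, query, row_count):
    """Return (rows_fetched, elapsed_ms) for reading up to row_count rows."""
    start = time.perf_counter()
    cursor = connection.cursor()
    cursor.execute(query)
    rows = cursor.fetchmany(row_count)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return len(rows), elapsed_ms


def make_stand_in(total_rows):
    """Build an in-memory table; a stand-in for a real driver connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        ((i, "x" * 10) for i in range(total_rows)),
    )
    return conn


conn = make_stand_in(100)
fetched, elapsed_ms = time_fetch(conn, "SELECT id, payload FROM t", 42)
```

Calling `time_fetch` with the same query and row counts against each driver's connection gives one point per driver for a rows-versus-time plot; repeated runs help smooth out caching effects.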
Demo
[Diagram labels: JDBC Client, JDBC API, TEIID RUNTIME & QUERY ENGINE, Federated VDB, mySQL Translator, MongoDB Translator, RDBMS Resource Adapter, MongoDB Resource Adapter, mySQL]
Demo - Steps
Pre-requisites
• mySQL server 5.5+ installed
• MongoDB 2.4.x+ installed
Steps
• Load the mySQL and MongoDB databases with sample data
• Setup environment – JBoss, Eclipse
• Create Teiid project in Eclipse using Teiid Designer
• Import source model using JDBC
• Create the virtual model and federate data from the source model
• Create a virtual database (VDB) and deploy to JBoss
• Access data using JDBC client or through browser using OData
Demo – Scenario
[Diagram label: Federated Data]
Demo – Connection Profile
Demo – Source Model
Demo – Source Model Generation
Demo – Map Source To View
Demo – Association
Demo – Data Federation
Demo – Source Code
Source code: https://github.com/anilallewar/JBoss-Teiid
Contains
• Configuration files
• Instructions
• “How-to” videos
• VDBs, source models and view models
Conclusion
Data Virtualization and Federation is a rapidly emerging technology that solves traditional BI/ETL problems.
It provides lower time to market, distributes data across the enterprise as a service and provides real time access to enterprise data.