View
668
Download
4
Embed Size (px)
DESCRIPTION
Citation preview
GAIN BETTER INSIGHTS FROM BIG DATA
USING RED HAT JBOSS DATA VIRTUALIZATION
Red Hat CorporationJanuary 5, 2014
2 RED HAT Confidential
Red Hat is…
“By running tests and executing numerous examples for specific teams, we were able to prove […] not only would the solution work, but it will perform better & at a fraction of the costs.”
MICHAEL BLAKE, Director, Systems & Architecture
“By running tests and executing numerous examples for specific teams, we were able to prove […] not only would the solution work, but it will perform better & at a fraction of the costs.”
MICHAEL BLAKE, Director, Systems & Architecture
3 RED HAT Confidential
Agenda
● Data challenges getting bigger● Red Hat Big Data Strategy and Platform● Data Virtualization Overview ● Customer Use Case for Big Data integration using Data
Virtualization● Demo● Q&A
4 RED HAT Confidential
Data Driven Economy
Data is becoming the new raw material of business: an economic input almost on a par with capital and labor. “Every day I wake up and ask, ‘how can I flow data better, manage data better, analyze data better?”
CIO - Wal-Mart
Data is becoming the new raw material of business: an economic input almost on a par with capital and labor. “Every day I wake up and ask, ‘how can I flow data better, manage data better, analyze data better?”
CIO - Wal-Mart
5 RED HAT Confidential
Data Challenges Getting BiggerBig Data, Cloud, and MobileExisting Data Integration approaches are not sufficient● Extracting and moving data adds latency and cost
● Every project solves data access and integration in a different way
● Solutions are tightly coupled to data sources
● Poor flexibility and agility
BI Reports Operational Reports
Enterprise Applications
SOA Applications
Mobile Applications
Hadoop NoSQL Cloud Apps Data Warehouse & Databases
Mainframe XML, CSV& Excel Files
Enterprise Apps
Integration Complexity
Constant Change
Siloed &Complex
How to align?
6 RED HAT Confidential
Business ObjectiveTurn Data into Actionable Information
Over 70%BI project efforts lies in the integration of source data
Only 28%Users have any meaningful
data access Reduce costs for finding and accessing highly fragmented data
Improve time to market for new products and services by simplifying data access and integration
Deliver IT solution agility necessary to capitalize on constantly changing market conditions
Transform fragmented data into actionable information that delivers competitive advantage
7 RED HAT Confidential
Capture Process Integrate
Red Hat’s Big Data Strategy
● Reduce Information Gap thru cost effectively making ALL data easily consumable for analytics
Data
Anal ytics
Data to Actionable Information Cycle
8 RED HAT Confidential
Red Hat Big Data Platform
HadoopIntegration
JBoss DataVirtualization
RHEL Platform Integration
& Optimization
Hadoop On Red Hat Storage
Storage
Cloud /VirtualizationMiddleware
Hadoop On
OpenStack
Platform
Hadoop
on
FedoraApache Hadoop
FedoraBig Data SIG
Hadoop Distributions
9 RED HAT Confidential
Red Hat Big Data Platform
RHEL Platform Integration
& Optimization
Hadoop On Red Hat Storage
Storage
Cloud /Virtualization
MiddlewareHadoop
On OpenStack
HadoopIntegration
JBoss DataVirtualization
Platform
Hadoop
on
FedoraApache Hadoop
FedoraBig Data SIG
Hadoop Distributions
10 RED HAT Confidential
What does Data Virtualization software do?Turn Fragmented Data into Actionable Information
Data Virtualization software virtually unifies data spread across various disparate sources; and makes it available to applications as a single consolidated data source.
The data virtualization software implements 3 steps process to bridge data sources and data consumers:
● Connect: Fast access to data from diverse data sources
● Compose: Easily create unified virtual data models and views by combining and transforming data from multiple sources.
● Consume: Expose consistent information to data consumers in the right form thru standard data access methods.
Virtual Consolidated Data SourceVirtual Consolidated Data Source
BI Reports
Data Virtualization Software• Consume• Compose• Connect
SAP Salesforce.comOracle DW Hadoop
Siloed & Complex
VirtualizeAbstractFederate
Easy,Real-time
InformationAccess
SOA Applications
DATA CONSUMERS
DATA SOURCES
11 RED HAT Confidential
Turn Fragmented Data into Actionable Information
ConnectConnect
ComposeCompose
ConsumeConsume
Unified Customer View
Unified Customer View
Unified Product View
Unified Product View
Unified Supplier View
Unified Supplier View
BI Reports & AnalyticsMobile Applications
SOA Applications & Portals
Unified Virtual Database / Common Data Model
ESB, ETL
Native Data ConnectivityNative Data Connectivity
Standard based Data ProvisioningJDBC, ODBC, SOAP, REST, OData
Standard based Data ProvisioningJDBC, ODBC, SOAP, REST, OData
JBoss Data Virtualization
Data Consumers
Data Sources
Design ToolsDesign Tools
DashboardDashboard
OptimizationOptimization
CachingCaching
SecuritySecurity
MetadataMetadata
Hadoop NoSQL Cloud Apps Data Warehouse & Databases Mainframe
XML, CSV& Excel Files
Enterprise Apps
Siloed & Complex
VirtualizeAbstractFederate
Easy,Real-time
InformationAccess
12 RED HAT Confidential
JBoss Data Virtualization:Supported Data SourcesEnterprise RDBMS:• Oracle • IBM DB2 • Microsoft SQL Server• Sybase ASE• MySQL• PostgreSQL• Ingres
Enterprise EDW:• Teradata • Netezza • Greenplum
Hadoop:• Apache• HortonWorks• Cloudera• More coming…
Office Productivity:• Microsoft Excel • Microsoft Access• Google Spreadsheets
Specialty Data Sources:• ModeShape
Repository• Mondrian• MetaMatrix• LDAP
NoSQL:• JBoss Data Grid• MongoDB • More coming…
Enterprise & Cloud Applications:• Salesforce.com• SAP
Technology Connectors:• Flat Files, XML Files,
XML over HTTP• SOAP Web Services• REST Web Services• OData Services
13 RED HAT Confidential
Key New Features and Capabilities● Data connectivity enhancements
– Hadoop Integration (Hive – Big Data),
– NoSQL (MongoDB – Tech Preview) and JBoss Data Grid
– Odata support (SAP integration)
● Developer Productivity improvements – New VDB Designer 8 and integration with JBoss Developer Studio v7
– Enhanced column level security,
– VDB import/reuse, and native queries
● Simplify deployment and packaging – Requires JBoss EAP only; included with subscription
– Remove dependency with SOA Platform
● Business Dashboard– New rapid data reporting/visualization capability
14 RED HAT Confidential
Self-Service Business Intelligence
The virtual, reusable data model provides business-friendly representation of data, allowing the user to interact with their data without having to know the complexities of their database or where the data is stored and allowing multiple BI tools to acquire data from centralized data layer. Gain better insights from Big Data using JBoss Data Virtualization to integrate with existing information sources.
360◦ Unified View
Deliver a complete view of master & transactional data in real-time. The virtual data layer serves as a unified, enterprise-wide view of business information that improves users’ ability to understand and leverage enterprise data.
Agile SOA Data Services
A data virtualization layer deliver the missing data services layer to SOA applications. JBoss Data Virtualization increases agility and loose coupling with virtual data stores without the need to touch underlying sources and creation of data services that encapsulate the data access logic and allowing multiple business service to acquire data from centralized data layer.
Regulatory Compliance
Data Virtualization layer deliver the data firewall functionality. JBoss Data Virtualization improves data quality via centralized access control, robust security infrastructure and reduction in physical copies of data thus reducing risk. Furthermore, the metadata repository catalogs enterprise data locations and the relationships between the data in various data stores, enabling transparency and visibility.
● JBoss Data Virtualization – Use Cases
15 RED HAT Confidential
Retail Customer Use CaseGain Better Insight from Big Data for Intelligent Inventory Management
● Objective:
– Right merchandise, at right time and price
● Problem:
– Cannot utilize social data and sentiment analysis with their inventory and purchase management system
● Solution:
– Leverage JBoss Data Virtualization to mashup Sentiment analysis data with inventory and purchasing system data. Leveraged BRMS to optimize pricing and stocking decisions.
ConsumeComposeConnect
Analytical Apps
JBoss Data Virtualization
Hive
Inventory Databases
Purchase Mgmt Application
SentimentAnalysis
JBoss BRMS
Data Driven Decision
Management
Big Data integration use case
16 RED HAT Confidential
Better Together - Big Data and Data VirtualizationHadoop not another Silo - Customers Combine Multiple Technologies
● Combine structured and unstructured analysis– Augment data warehouse with additional external sources, such as
social media
● Combine high velocity and historical analysis– Analyze and react to data in motion; adjust models with deep historical
analysis
● Reuse structured data for analysis– Experimentation and ad-hoc analysis with structured data
17 RED HAT Confidential
● Better Together - Big Data and Data VirtualizationCapture, Process and Integrate Data Volume, Velocity, Variety
HadoopHadoop
Data IntegrationJBoss Data Virtualization
In-memory CacheJBoss Data Grid
BI Analytics (historical, operational, predictive)
BI Analytics (historical, operational, predictive) SOA Composite ApplicationsSOA Composite Applications
Messaging and Event Processing JBoss A-MQ and JBoss BRMS
J
Structured DataStructured DataStreaming
DataStreaming
DataSemi-Structured
DataSemi-Structured
Data
Red Hat Storage
Red Hat Enterpr ise Linux &
Virtu alizatio nCap
ture
& P
roce
ss I
nte
gra
te &
An
alyz
e
18 RED HAT Confidential
Inconsistent, Incomplete Information
Uninformed, Delayed Decisions
Costly Business Risk and Exposure
Consider...
How would your organization change…● If data were readily reusable in place rather than
requiring significant effort to build new intermediary data tiers?
● If data could be repurposed quickly into new applications and business processes?
● If all applications and business processes could get all of the information needed in the form needed, where needed and when needed?
19 RED HAT Confidential
● Red Hat JBoss Middleware
FoundationFoundation
Data IntegrationData Integration
Application IntegrationApplication Integration
Business ProcessManagementBusiness ProcessManagement
User InteractionUser Interaction
Develop ment
ToolshDevelop m
entToolsh
Manage m
entToolsM
anage ment
Tools
• JBoss EAP• JBoss Web Server• JBoss Data Grid
• JBoss Data Virtualization
• JBoss A-MQ• JBoss Fuse• JBoss Fuse Service Works
• JBoss BRMS• JBoss BPM Suite
• JBoss Portal
•JBoss D
eve loper Stud io
•JBoss O
per ations Ne tw
ork
ACCELERATE INTEGRATE AUTOMATE
DemoBig Data Integration using JBoss Data Virtualization
21 RED HAT Confidential
Demo Scenario
● Objective:– Determine if sentiment data from the
first week of the Iron Man 3 movie is a predictor of sales
● Problem:– Cannot utilize social data and
sentiment analysis with sales management system
● Solution:– Leverage JBoss Data Virtualization to
mashup Sentiment analysis data with ticket and merchandise sales data on MySQL into a single view of the data.
ConsumeComposeConnect
Excel Powerview and DV Dashboard to
analyze the aggregated data
JBoss Data Virtualization
Hive
SOURCE 1: Hive/Hadoop contains twitter data including sentiment
SOURCE 2: MySQL data that includes ticket and
merchandise sales
22 RED HAT Confidential
Demonstration System Requirements• JDK
– Oracle JDK 1.6, 1.7 or OpenJDK 1.6 or 1.7
• JBoss Data Virtualization v6 Beta– http://jboss.org/products/datavirt.html
• JBoss Developer Studio– http://jboss.org/products
• JBoss Integration Stack Tools (Teiid)– https://devstudio.jboss.com/updates/7.0-development/integration-stack/
• Slides, Code and References for demo– https://github.com/DataVirtualizationByExample/Mashup-with-Hive-and-MyS
QL• Hortonworks Data Platform (A VM for testing Hive/Hadoop)
– http://hortonworks.com/products/hdp-2/#install
• Red Hat Storage– http://www.redhat.com/products/storage-server/
23 RED HAT Confidential
24 RED HAT Confidential
25 RED HAT Confidential
26 RED HAT Confidential
27 RED HAT Confidential
28 RED HAT Confidential
29 RED HAT Confidential
30 RED HAT Confidential
31 RED HAT Confidential
32 RED HAT Confidential
33 RED HAT Confidential
34 RED HAT Confidential
35 RED HAT Confidential
36 RED HAT Confidential
37 RED HAT Confidential
38 RED HAT Confidential
39 RED HAT Confidential
40 RED HAT Confidential
41 RED HAT Confidential
42 RED HAT Confidential
43 RED HAT Confidential
44 RED HAT Confidential
45 RED HAT Confidential
46 RED HAT Confidential
47 RED HAT Confidential
48 RED HAT Confidential
49 RED HAT Confidential
50 RED HAT Confidential
51 RED HAT Confidential
52 RED HAT Confidential
53 RED HAT Confidential
54 RED HAT Confidential
55 RED HAT Confidential
56 RED HAT Confidential
57 RED HAT Confidential
58 RED HAT Confidential
59 RED HAT Confidential
Capture Process Integrate
Why Red Hat for Big Data?
● Transform ALL data into actionable information– Cost Effective, Comprehensive Platform
– Community based Innovation
– Enterprise Class Software and Support
Data
Infor mati on
Data to Actionable Information Cycle
60 RED HAT Confidential
● Red Hat Big Data Platform
HadoopIntegration
JBoss DataVirtualization
RHEL Platform Integration
& Optimization
Hadoop On Red Hat Storage
Storage
Cloud /VirtualizationMiddleware
Hadoop On
OpenStack
Platform
Hadoop
on
FedoraApache Hadoop
FedoraBig Data SIG
Hadoop Distributions
Thank YouQ&A