The Jupiter Platform aims to combine SAP HANA with other databases such as Sybase IQ and Sybase ASE, and other technologies such as Hadoop for unstructured data to provide a complete integrated solution for handling all data types and volumes.
© Copyright 2014 EMC Corporation. All rights reserved.
Reducing the Risk and Complexity of SAP Big Data Deployments - Jupiter Platform
Cenk Ersoy, Advisory Systems Engineer, EMC Corp
SAP SME – Middle East, Africa & Eastern Europe
Big Data Definition
HANA
Trad. DB
Unstructured Data
Hadoop
Variety of Applications
Media
BOBJ
SAS
Analytics
Cloud Data
Social
Not all data is created equal
SAP and Big Data
Combination of ERP data and big data enables powerful new use cases
Federation of structured and unstructured data is key
Myriad of platforms, tools, and skills is typically the challenge
BW ECC CRM
Hadoop Social
SMS, mobile records, sensor data, trade data
Media
Big Data
HANA
Etc…
The Time Value of Data
Current data is generally more valuable to the business (hot)
Older data is still useful (warm, cold)
The split varies by organization, time, and data type
Warm/cold data keeps growing and is the real management challenge
Big Data Analytics – Where Does Jupiter Fit In?
Analytics
Data Governance
Data Sources
Big Data Platform
Tools & Consulting
MASTER DATA MANAGEMENT
DATA QUALITY
PLANNING / WHAT-IF, PREDICTIVE, TEXT ANALYTICS
UNSTRUCTURED, STRUCTURED
DATA INTEGRATION
DATA SCIENTIST
VISUALIZATION, REPORTING, DASHBOARDS
COLLABORATION
TRAINING, SOLUTION DESIGN/DEVELOPMENT
OTHERS, TRANSACTIONAL DATA, MACHINE DATA, SOCIAL, EXTERNAL
Data Lake for SAP
Jupiter Integrates
Flexible and fast ETL
Federated Query
EMC/VMware software
BMMsoft software
EMC/VCE hardware
Data
Jupiter In Essence
• Big Data Solution for SAP HOT/WARM/COLD data
• Utilizing existing components and partners
• Big Data reference architecture including infrastructure
• Simplifies access to multiple data sources and types
• Excellent platform for very high data volume and velocity use cases
Hot: HANA (compute, storage)
Warm: Sybase IQ (compute, storage)
Cold: Hadoop (compute, storage)
Shared: network, data protection, disaster tolerance
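The hot/warm/cold placement on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the Jupiter software: the age thresholds and the function names `classify_tier` and `target_store` are assumptions made for the example, since the slides do not say where hot ends and warm begins. The fallback to Sybase IQ for cold data follows the deck's statements that Hadoop is optional and that Sybase IQ serves warm and cold data.

```python
from datetime import date, timedelta

# Hypothetical thresholds: the slides only say the split "varies by
# organization, time, and data type", so these numbers are placeholders.
HOT_MAX_AGE = timedelta(days=90)
WARM_MAX_AGE = timedelta(days=730)


def classify_tier(last_used: date, today: date) -> str:
    """Return 'hot', 'warm' or 'cold' based on how recently data was used."""
    age = today - last_used
    if age <= HOT_MAX_AGE:
        return "hot"
    if age <= WARM_MAX_AGE:
        return "warm"
    return "cold"


def target_store(tier: str, hadoop_available: bool = True) -> str:
    """Map a tier to the store named on the slide for that tier."""
    if tier == "hot":
        return "HANA"
    if tier == "warm":
        return "Sybase IQ"
    if tier == "cold":
        # Hadoop is optional in the deck; without it, cold data stays in IQ.
        return "Hadoop" if hadoop_available else "Sybase IQ"
    raise ValueError(f"unknown tier: {tier!r}")


if __name__ == "__main__":
    today = date(2014, 6, 1)
    for last_used in (date(2014, 5, 1), date(2013, 6, 1), date(2010, 1, 1)):
        tier = classify_tier(last_used, today)
        print(last_used, tier, target_store(tier))
```

In practice the thresholds would be set per data type, which is why they are kept as module-level constants rather than hard-coded in the functions.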
Jupiter Architecture
Data Access
EDMT Data Access & Analysis Layer: EDMT GUI, Web Services, Data Export, Proxy, Mobile GUI, eDiscovery/Audit/Fraud Modules, Social Net Analysis
Software & Database
EDMT API & Connectors: Data Management, Access Control, Alerts, Auto-Classification, Collaboration, Taxonomy, Data Retention, Connectivity, Search API
EDMT ETL (INGEST): Real-time ETL Parser, Metadata Manager, Parallel Loader
Databases: SAP IQ, SAP HANA, Netezza, Oracle, Exadata, Hadoop
Infrastructure
DB Servers: Cisco blades, EMC DB Storage
ETL & App Servers: Cisco blades, EMC ETL Storage
HANA scale out: Cisco blades, EMC HANA Storage
EMC unified Data Protection
EMC unified Disaster Tolerance
VMware, VMware & EMC Operational reporting
Packaged Solution – Big Data in a box
EDMT Visualization
EDMT Data Access & Analysis Layer: EDMT GUI, Web Services, Data Export, Proxy, Mobile GUI, eDiscovery/Audit/Fraud Modules, Social Net Analysis
Sybase IQ Storage, Sybase IQ Compute
HANA Storage, HANA Compute
Hadoop is optional, but can be used for cold storage
Unified Data Protection: EMC Data Domain
Unified Disaster Tolerance: EMC RecoverPoint (Synch & Asynch)
RSA and Archer GRC
Optional
Jupiter
Advanced SI services
Jupiter Class: 23 PB loaded in 7 days
23 PB Jupiter-class DW loaded in 7 days (FINAL)
Load time = 7 days
Data types (from 30 billion sources):
Smart meters, sensors (KV2/4) with EDMT UCM
Enterprise DB records (OLTP, DW, SAP ERP etc.)
Email/SMS/twt/blog (metadata + BLOB)
Files (metadata + BLOB)

One row per data type, followed by the TOTAL row:

Loading speed        Loading speed       Total rows loaded   # servers
(billion rows/hour)  (billion rows/day)  (billion)
2,298                55,164              386,148             11
6.5                  156                 1,089               2
0.7                  17                  122                 4
0.4                  9                   62                  2
2,306                55,346              387,421             19   (TOTAL)

Data loading speed (TOTAL): 136 TB per hour, 3.3 PB per day
Total data loaded (TOTAL): 22.9 PB
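The totals on this slide can be cross-checked with simple arithmetic. The sketch below assumes the fused figure "22.93.3136" reads as 22.9 PB total, 3.3 PB per day and 136 TB per hour, and assumes decimal units (1 PB = 1,000 TB). Neither assumption is stated in the deck, and the helper names are illustrative only.

```python
HOURS_PER_DAY = 24
TB_PER_PB = 1000  # assumption: decimal units, not stated on the slide


def per_day(per_hour: float) -> float:
    """Scale an hourly rate to a daily rate."""
    return per_hour * HOURS_PER_DAY


def total_over(per_hour: float, days: float) -> float:
    """Total volume accumulated at a constant hourly rate over `days`."""
    return per_day(per_hour) * days


if __name__ == "__main__":
    load_days = 7

    # Row counts from the TOTAL row, in billions of rows.
    rows_per_hour = 2306
    print("billion rows/day:", per_day(rows_per_hour))
    print("billion rows total:", total_over(rows_per_hour, load_days))

    # Data volume, assuming the slide's total rate is 136 TB per hour.
    tb_per_hour = 136
    print("PB/day:", per_day(tb_per_hour) / TB_PER_PB)
    print("PB total:", total_over(tb_per_hour, load_days) / TB_PER_PB)
```

The computed values land within rounding of the slide's figures (55,346 billion rows per day, 387,421 billion rows, 3.3 PB per day, 22.9 PB), which suggests the hourly numbers on the slide were rounded from the measured totals.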
Jupiter 3-Million channel ETL
“Event” occurs here
1,000,000-channel Ingest engine for SQL data, +100B_rows/day/channel (Node 1 to Node 1,000)
1,000,000-channel Ingest engine for Emails, SMS, IM etc., +1TB/day/channel (Node 1 to Node 1,000)
1,000,000-channel Ingest engine for Docs and Multimedia (aud/vid/img), +1TB/day/channel (Node 1 to Node 1,000)
DB Server (SW + DB_HW)
“Reaction” starts here
Event-to-Reaction Time = 0.2-2 sec
Federated Jupiter(*): HANA, IQ, Other DBs
HANA (MPP Shared Nothing)
IQ Multiplex (Multi-node MPP Shared disk)
Disk storage, SAN
EDMT® Federated ETL using federated data model
EDMT® Federated Query: (federated query, final merge, metadata management, retention, access control, partitioning, HA/DR, replication…)
Other DBs: Oracle, Exadata, Netezza, Sybase ASE, MySQL, Hadoop
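The "federated query, final merge" idea can be illustrated with a small, self-contained sketch. It is not the EDMT implementation. Two in-memory SQLite databases stand in for federation members such as HANA and IQ, and the `sales` table, the member names and the function names are all hypothetical. Each member computes a partial aggregate, and the federation layer merges the partial results.

```python
import sqlite3


def make_member(rows):
    """Create an in-memory SQLite database standing in for one member DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def federated_sum_by_region(members):
    """Run the same aggregate on every member, then do the final merge."""
    merged = {}
    for conn in members.values():
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )
        for region, partial_sum in cursor:
            # Final merge: partial sums from each member add up per region.
            merged[region] = merged.get(region, 0.0) + partial_sum
    return merged


# Stand-ins only: a real federation would connect to HANA, IQ and others.
members = {
    "hana_standin": make_member([("EMEA", 10.0), ("APAC", 5.0)]),
    "iq_standin": make_member([("EMEA", 2.5), ("AMER", 7.0)]),
}
result = federated_sum_by_region(members)

if __name__ == "__main__":
    print(result)
```

Pushing the `GROUP BY` down to each member keeps the data transferred to the merge step small, which matters when members sit across a WAN.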
Location Independent Jupiter Federation
Oracle (opt.)
Federated SQL Query Access
Flexible ETL
SAP IQ
MySQL
SAP HANA
Project Jupiter Benefits Customers
• Right sized HANA for HOT data
• Sybase IQ for warm and cold data
• Right investment for the right data
• Complete solution
• Measured by $/TB
• Low TCO
• Enables BI analysts to become data scientists
• Can still utilize SAP analytics tools (BOBJ) as well as Oracle, IBM, SAS, Microsoft…
• HANA for analytics introduces the least amount of risk compared to BW and Suite
Project Jupiter Services
• Business Alignment
• Value Discovery
• Value Realization
• Feasibility Study
• PoC
• Data Integration
• Data Migration
• Data Replication
• Landscape Integration
• End-to-end solution support
Plan Build Run
Federated Jupiter – The Benefits in Detail
1. Higher data scalability and ingest speed – when a single DB can't handle the data size (e.g. +2,000 PB) or ingest speed (+PB/hour)
2. Geo distances – Not limited by LAN/SAN/Ethernet distance – WAN is OK
3. Selective Replication for HA, DR – Controlled replication for HA, DR and B/R purposes
4. Hetero-DB – Include hetero-DB in the ETL and Fed_query
5. Operational requirements/benefits – Upgrades, life-cycle management etc.
6. Fed Jupiter ETL/Query knows the purpose/config. of ALL Fed “members”
What Problems Is Jupiter Solving?
1. Data Volume: Jupiter has Multi-PB scalability - Certified to 10 PB – new benchmark ongoing
2. Data Velocity: Jupiter has Multi-PB/day loading speed
3. Data Variety: Jupiter stores “std” SQL data and unstructured data (sensor, files, email/sms, social net, multimedia..) in single RDBMS(**)
4. Value (Low Cost): Jupiter is price competitive with Big Data solutions based on Hadoop
5. Jupiter is different from “new” solutions:
   1. Uses proven enterprise SW+HW: SAP DBs (HANA, IQ, ASE etc.) and VCE Vblock
   2. Jupiter is fully compatible with enterprise DBs, apps and reporting/analytic tools
   3. Jupiter is more reliable because it uses proven enterprise SW+HW
   4. Unifies structured and unstructured data, while “new” solutions process unstructured data only
   5. Much lower cost-per-TB than “std” enterprise apps and lower price than “new” solutions
Jupiter Platform - Summary
✔ Super Scale
✔ Extreme Speed
✔ Lower Cost
✔ Simplified
Thank you