juergenurbanski
A Reference Architecture for Big Data in the Enterprise
Copyright [email protected]
A Big Data Reference Architecture Needs to Consider the Following Domains
Core innovation
Enablers into an enterprise environment
Data Integration & Governance
Integrate with existing systems. Move data into, within and out of the environment, while minimizing duplication and data movement
Security
Provide layered approach to security that differentiates internal and external users
Operations
Deploy and manage a multi-tenant environment easily, using existing tools where possible
Environment & Deployment Model
Run within your environment and in a public cloud
Data Access
Unprecedented insights: Allow simultaneous access by, and timely insights for, all approved users across the entire data lake, using different processing engines and schema on read
Data Management
Petabyte scale: Acquire all data in its original format and store it in one place, cost-effectively and for very long time periods
Presentation & Application
Enable existing and new applications
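To make "schema on read" concrete, here is a minimal single-process sketch in Python. It is an illustration under stated assumptions, not part of the reference architecture itself: the sample records, the `read_with_schema` helper and the two "views" are hypothetical names chosen for this example. The point it shows is that data is stored in its original format, and each consumer applies its own schema only at read time.

```python
import json

# Raw records land in the "lake" unchanged, one JSON document per line.
# (Sample data is made up for illustration.)
RAW_EVENTS = [
    '{"user": "alice", "amount": "19.90", "country": "DE"}',
    '{"user": "bob", "amount": "5", "channel": "web"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema (field name -> converter) only when the data is read."""
    for line in raw_lines:
        record = json.loads(line)
        # Fields missing from a record become None instead of blocking ingestion.
        yield {
            field: (convert(record[field]) if field in record else None)
            for field, convert in schema.items()
        }


# Two consumers read the same raw data through different schemas.
billing_view = list(read_with_schema(RAW_EVENTS, {"user": str, "amount": float}))
marketing_view = list(read_with_schema(RAW_EVENTS, {"user": str, "channel": str}))
```

Because nothing is rejected or reshaped at ingestion, a new consumer can add a third schema later without reloading the data.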
Core Big Data Capabilities Required
* Includes key-value, document, graph and object databases.
Core innovations
Security & Privacy
Operations
Physical Infrastructure
Data Integration & Governance
Data Access
(Existing or New) Application
Geo-location
Web & Social Media
Machine Learning & Prediction
Presentation
Reports & Dashboards
Clients
Extract, Transform, Load
Real Time & Batch Ingestion
Data Connectors
Life Cycle Management
Data Management
Advanced Visualization
Real-Time Monitoring
Text & Semantics
Video & Audio
OLAP
Data Encryption
Data Isolation & Multi-tenancy
Identity & Access Management
Data Masking
Custodian Gateways
SQL
Streaming & Complex Event Processing
Search & Discovery
Graph Processing
Batch Processing
(MPP) Data Warehouse
NoSQL Database*
Relational Database
Distributed Storage
In-memory Computing
Parallel processing (MapReduce)
Commodity HW, cheap storage
Store first, ask questions later (HDFS)
Any data type, incl. unstructured
Google for Big Data
Predictions enable Prescriptions
Real-time reasoning on new data
Friends & family social NW analysis
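The "Parallel processing (MapReduce)" capability listed above can be sketched as a word count in plain Python. This is a hedged, single-process illustration of the programming model only: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and on a real cluster the map and reduce steps would run in parallel across many nodes rather than in one loop.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different node; here it is sequential.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Since each map call sees only one document and each reduce call sees only one key, both phases can be spread over commodity hardware without coordination beyond the shuffle.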
Hadoop and Spark Deliver Many of the Core Innovative Capabilities Required
Hadoop Provides The Enterprise-Wide Data Lake
Lets you acquire all data in its original format and store it in one place, cost-effectively and for very long time periods
Allows different processing engines and schema on read
Mature multi-tenancy, operations, security and integration
Note: Both are open source technologies supported and embedded by a wide range of software and services vendors
Spark Provides A Modern Development Environment On Top Of Hadoop
In-memory high-speed analytics engine
Advanced machine learning libraries
Unified programming model across all processing engines