
Reference Architecture Big Data


DESCRIPTION

A Reference Architecture for Big Data in the Enterprise


  • Copyright [email protected]

    A Big Data Reference Architecture Needs to Consider the Following Domains

    Core innovation

    Enablers into an enterprise environment

    Data Integration & Governance

    Integrate with existing systems. Move data into, within and out of the environment, while minimizing duplication and data movement

    Security

    Provide a layered approach to security that differentiates internal and external users

    Operations

    Deploy and manage a multi-tenant environment easily, using existing tools where possible

    Environment & Deployment Model

    Run within your environment and in a public cloud

    Data Access

    Unprecedented insights: Allow simultaneous access by, and timely insights for, all approved users across the entire data lake, using different processing engines and schema on read

    Data Management

    Petabyte scale: Acquire all data in its original format and store it in one place, cost effectively and for very long time periods

    Presentation & Application

    Enable existing and new applications
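The "original format" and "schema on read" ideas above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the sample records, the field names and the `read_with_schema` helper are all hypothetical and do not come from the slides or from any Hadoop API. Raw records are kept untouched, and each consumer applies its own schema when it reads them.

```python
import json
from datetime import datetime

# Hypothetical raw events, stored exactly as they arrived (no upfront schema).
RAW_EVENTS = [
    '{"user": "u1", "ts": "2015-03-01T10:00:00", "amount": "19.50", "geo": "52.5,13.4"}',
    '{"user": "u2", "ts": "2015-03-01T10:05:00", "amount": "5.00"}',
    '{"user": "u1", "ts": "2015-03-02T08:30:00", "amount": "7.25", "note": "refund?"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: select fields and cast them.

    The raw lines are never modified; fields missing from a record become None.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            name: cast(record[name]) if name in record else None
            for name, cast in schema.items()
        }


# Two consumers read the same raw data through different schemas.
BILLING_SCHEMA = {"user": str, "amount": float}
ACTIVITY_SCHEMA = {"user": str, "ts": datetime.fromisoformat}

billing_rows = list(read_with_schema(RAW_EVENTS, BILLING_SCHEMA))
activity_rows = list(read_with_schema(RAW_EVENTS, ACTIVITY_SCHEMA))

# Example query on the billing view: total amount per user.
total_by_user = {}
for row in billing_rows:
    total_by_user[row["user"]] = total_by_user.get(row["user"], 0.0) + row["amount"]
```

Because the schema lives with the reader rather than with the stored data, a new consumer can be added later without reloading or reshaping what is already in the lake.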

  • Core Big Data Capabilities Required

    * Includes key-value, document, graph and object databases.

    Core innovations

    Security & Privacy

    Operations

    Physical Infrastructure

    Data Integration & Governance

    Data Access

    (Existing or New) Application

    Geo-location

    Web & Social Media

    Machine Learning & Prediction

    Presentation

    Reports & Dashboards

    Clients

    Extract, Transform, Load

    Real-Time & Batch Ingestion

    Data Connectors

    Life Cycle Management

    Data Management

    Advanced Visualization

    Real-Time Monitoring

    Text & Semantics

    Video & Audio

    OLAP

    Data Encryption

    Data Isolation & Multi-tenancy

    Identity & Access Management

    Data Masking

    Custodian Gateways

    SQL

    Streaming & Complex Event Processing

    Search & Discovery

    Graph Processing

    Batch Processing

    (MPP) Data Warehouse

    NoSQL Database*

    Relational Database

    Distributed Storage

    In-memory Computing

    Parallel processing (MapReduce)

    Commodity HW, cheap storage

    Store first, ask questions later (HDFS)

    Any data type, incl. unstructured

    Google for Big Data

    Predictions enable Prescriptions

    Real-time reasoning on new data

    Friends & family social network analysis

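The "Parallel processing (MapReduce)" capability listed above follows a map, shuffle, reduce pattern. The following is a minimal single-machine sketch of that pattern, not Hadoop's actual API: the function names and the sample chunks are hypothetical, and a thread pool merely stands in for the cluster workers that would each process one chunk of input.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by key across every mapper's output."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values that share one key."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk is mapped independently, which is what makes the map step
    # easy to spread over many machines. Threads are only a stand-in here.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, chunks))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input, split into chunks as a distributed file system might.
CHUNKS = [
    "store first ask questions later",
    "ask questions on any data",
    "store any data",
]
counts = word_count(CHUNKS)
```

The map and reduce functions hold no shared state, so the same logic scales from three strings in memory to many file blocks on commodity hardware; only the scheduling and shuffle machinery changes.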

  • Hadoop and Spark Deliver Many of the Core Innovative Capabilities Required

    Hadoop Provides The Enterprise-Wide Data Lake

    Lets you acquire all data in its original format and store it in one place, cost effectively and for very long time periods

    Allows different processing engines and schema on read

    Mature multi-tenancy, operations, security and integration

    Note: Both are open source technologies supported and embedded by a wide range of software and services vendors

    Spark Provides A Modern Development Environment On Top Of Hadoop

    In-memory high-speed analytics engine

    Advanced machine learning libraries

    Unified programming model across all processing engines
