Industrial IoT is a Big Data Problem!

Architectural Perspective for Industrial IoT… with a focus on data management.

M. Ahmad Shahzad

Principal IT Architect

LinkedIn: www.linkedin.com/in/ahmads

Twitter: shaxami

IoT is a Big Data Problem!

What is Industrial IoT and why it’s a Big Data problem

Detailed Architecture

Architectural Principles

Conceptual Architecture

So, what is Big Data?

Industrial IoT

• Internet of Things – data perspective

Connectivity of physical-world objects as an integrated network

Next gen of M2M applications / RFID-based systems

Each node in the network can be uniquely identified

End nodes may be dumb sensors or active computation devices

Information captured without the need for a human agent

Data / Information flow is two-way

Industrial IoT is about Business focus
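
To make the points above concrete (uniquely identified nodes, capture without a human agent, two-way flow), here is a minimal sketch. The class names `Reading`, `Command`, and `Node`, the `temperature_c` metric, and the `throttle` action are all hypothetical and chosen only for illustration; a real deployment would use whatever device model and protocol it already has.

```python
from dataclasses import dataclass
import time


@dataclass(frozen=True)
class Reading:
    """Upstream message: data captured by a node without human input."""
    node_id: str      # unique identifier of the originating node
    metric: str
    value: float
    timestamp: float


@dataclass(frozen=True)
class Command:
    """Downstream message: an instruction sent back to a node."""
    node_id: str
    action: str


class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.received = []          # commands delivered to this node

    def sample(self, value):
        # Hypothetical metric name; the value would come from real hardware.
        return Reading(self.node_id, "temperature_c", value, time.time())

    def handle(self, command):
        self.received.append(command)


def control_loop(node, value, limit):
    """Send one reading up; send a command down if the limit is exceeded."""
    reading = node.sample(value)
    if reading.value > limit:
        node.handle(Command(reading.node_id, "throttle"))
    return reading
```

The reading travels from the node to the platform and the command travels back, which is the two-way flow the slide refers to.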

Industrial IoT and Big Data symbiotic relationship

Industrial Internet of Things presents a perfect use case for Big Data

Internet of Things characteristic / Big Data dimension

Volume: Billions of sensors, node computers, video/audio/thermal capturing devices… generating large data sets

Velocity: Very high sampling rate for sensors and end-devices

Variety: Different devices and protocols for capturing and transmitting information
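
A back-of-envelope calculation shows how quickly sensor count and sampling rate turn into volume. The figures below (10,000 sensors, 100 samples per second, 16 bytes per sample) are assumptions picked for illustration, not numbers from the presentation.

```python
def daily_volume_bytes(sensors, samples_per_second, bytes_per_sample):
    """Raw data generated per day, before compression or protocol overhead."""
    seconds_per_day = 24 * 60 * 60
    return sensors * samples_per_second * bytes_per_sample * seconds_per_day


# Assumed plant: 10,000 sensors, 100 Hz sampling, 16-byte samples.
volume = daily_volume_bytes(10_000, 100, 16)
print(f"{volume / 1e12:.2f} TB per day")  # prints "1.38 TB per day"
```

Even this modest assumed setup produces more than a terabyte of raw data per day, before any video, audio, or thermal imagery is added.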

Big Data

• What is Big Data

Variety, Velocity, and Volume

Datasets too big for traditional technologies

Structured and unstructured data sets

Gigabytes, Terabytes, Petabytes, Exabytes, and whatever comes next

There is no single technology

Big Data Technologies

[Diagram: NoSQL, Relational Database, Enterprise Data Warehouse, Data Marts, Cubes]

Evolution of Big Data

Timeline: 1950, 1960, 1970, 1980, 1990, 2000, 2010, …

• Computer Systems w/ Punch Cards

• Mainframes

• Network Databases (IMS)

• Relational Database

• Data Warehouses

• Data Marts

• OLAP/MOLAP Systems

• Hadoop / NoSQL / Columnar / Graph etc.

Big Data Technologies

• Relational Databases:

Rows and columns

SQL for data manipulation and access

Database layer separate from application layer

Transactional systems like ERP

Examples are Oracle, IBM DB2, SQL Server, Teradata
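
As a small illustration of rows, columns, and SQL access, the sketch below uses Python's built-in `sqlite3` module as a stand-in for the enterprise products named above. The `work_order` table and its contents are hypothetical.

```python
import sqlite3

# In-memory database: a stand-in for a transactional system's data store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work_order ("
    " id INTEGER PRIMARY KEY,"
    " machine TEXT NOT NULL,"
    " quantity INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO work_order (machine, quantity) VALUES (?, ?)",
    [("press-07", 120), ("press-07", 80), ("lathe-02", 45)],
)
conn.commit()

# SQL is used both to manipulate the data and to read it back.
rows = conn.execute(
    "SELECT machine, SUM(quantity) FROM work_order"
    " GROUP BY machine ORDER BY machine"
).fetchall()
print(rows)  # [('lathe-02', 45), ('press-07', 200)]
```

The application only issues SQL; how the rows are stored is the database layer's concern, which is the separation the slide mentions.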

Big Data Technologies

• Data Warehouses and Data Marts

Separate from Operations Database

Based on Relational database technology

Dimensional and Star Schema data models

Batch Integration (ETL)

Primary consumers were reporting, dashboards, scorecards

Other storage mechanisms included: OLAP and MOLAP

BI tools like Cognos, BusinessObjects (BobJ), and Tableau expose the data
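
A star schema keeps measurements in a central fact table and descriptive attributes in surrounding dimension tables. The sketch below again uses `sqlite3` as a stand-in; the table names (`fact_output`, `dim_machine`, `dim_date`) and the sample rows are assumptions made for the example.

```python
import sqlite3

dw = sqlite3.connect(":memory:")
dw.executescript("""
CREATE TABLE dim_machine (machine_key INTEGER PRIMARY KEY, name TEXT, plant TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, iso_date TEXT);
CREATE TABLE fact_output (
    machine_key INTEGER REFERENCES dim_machine(machine_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    units       INTEGER
);
""")

# In practice a batch ETL job would load these rows from operational systems.
dw.executemany("INSERT INTO dim_machine VALUES (?, ?, ?)", [
    (1, "press-07", "Plant A"), (2, "lathe-02", "Plant A"), (3, "press-11", "Plant B"),
])
dw.executemany("INSERT INTO dim_date VALUES (?, ?)", [
    (20240101, "2024-01-01"), (20240102, "2024-01-02"),
])
dw.executemany("INSERT INTO fact_output VALUES (?, ?, ?)", [
    (1, 20240101, 120), (2, 20240101, 45), (3, 20240101, 60), (1, 20240102, 100),
])

# A typical reporting query: join the fact table to a dimension and aggregate.
units_by_plant = dw.execute("""
    SELECT m.plant, SUM(f.units)
    FROM fact_output f
    JOIN dim_machine m ON m.machine_key = f.machine_key
    GROUP BY m.plant
    ORDER BY m.plant
""").fetchall()
print(units_by_plant)  # [('Plant A', 265), ('Plant B', 60)]
```

Reports and dashboards typically issue queries of this shape, slicing the same facts by different dimensions.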

Big Data Technologies

• NoSQL:

Not a relational database

Key-Value Pairs

Document-stores

Graph-based

Strict ACID compliance is often relaxed in favor of availability and scale

Interaction is based on non-SQL languages

Horizontally scalable

Examples are MongoDB, Cassandra, Couchbase, Neo4j, and many more
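
Horizontal scalability in key-value and document stores usually rests on partitioning keys across nodes. The sketch below shows one simple way to do that, hash-based sharding, in plain Python. `ShardedKeyValueStore` is a hypothetical class written for this example; it does not reflect how any of the named products is implemented.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store that spreads documents over several shards."""

    def __init__(self, shard_count):
        # Each dict stands in for a separate server.
        self.shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash keeps a key on the same shard across runs.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, document):
        self.shards[self._shard_for(key)][key] = document

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)


store = ShardedKeyValueStore(shard_count=4)
store.put("sensor:press-07", {"type": "temperature", "unit": "C"})
print(store.get("sensor:press-07"))  # {'type': 'temperature', 'unit': 'C'}
```

Simple modulo sharding reshuffles most keys when the shard count changes; production systems commonly use consistent hashing or range partitioning to limit that movement.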

Big Data Technologies

• Hadoop:

Massively parallel system based on cheap commodity servers

Hadoop = HDFS + MapReduce

Historically batch-oriented operations… lately, real-time processing is becoming a reality

Hadoop Ecosystem: Spark, Storm, YARN, Flume, Kafka, Sqoop, HBase, etc.
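
The MapReduce programming model can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce phases only; it does not use Hadoop's API, and the function names and the "maximum temperature per sensor" job are assumptions for the example.

```python
from collections import defaultdict


def map_phase(record):
    """Emit (key, value) pairs from one input record."""
    sensor_id, temperature = record
    yield sensor_id, temperature


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values for one key into a single result."""
    return key, max(values)


def run_job(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


readings = [("s1", 20.5), ("s2", 30.0), ("s1", 25.0)]
print(run_job(readings))  # {'s1': 25.0, 's2': 30.0}
```

On a real cluster the map and reduce calls run in parallel on the machines that hold the data blocks in HDFS, and the shuffle moves data between them over the network.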

Big Data Technologies

[Diagram: components of the Hadoop stack]

Distributed storage (HDFS)

Distributed processing (MapReduce)

Metadata Services (HCatalog)

Batch processing (Hive, MapReduce, Spark)

Data Integration Services (API, Sqoop, Flume, NFS)

Management and Monitoring (ZooKeeper, Cloudera Manager)

Workflow and Scheduling (Oozie)

Interactive Query (Impala, Spark SQL)

Stream Processing (Spark, Storm)

Workload Management (YARN)

Real-time processing (Kafka)

Non-Relational Database (HBase)

Architectural Guiding Principles

Data and Process locality

Industry standards and protocols

Multiple tools and technologies

Data quality issues will exist… handle them

Importance of MDM and Data Governance

Data Plan is important

Security Plan and Architecture
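
One way to act on the data quality principle is to validate readings at ingestion and quarantine the bad ones rather than silently dropping them. The sketch below is a minimal illustration; the dictionary layout (`node_id`, `value`) and the range limits are assumptions, not part of the presentation.

```python
import math


def validate(reading, low, high):
    """Return a list of problems found in one reading (empty if clean)."""
    problems = []
    if not reading.get("node_id"):
        problems.append("missing node_id")
    value = reading.get("value")
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if not is_number or math.isnan(value):
        problems.append("value is not a number")
    elif not low <= value <= high:
        problems.append("value out of range")
    return problems


def split_by_quality(readings, low, high):
    """Separate clean readings from rejected ones, keeping the reasons."""
    clean, rejected = [], []
    for reading in readings:
        problems = validate(reading, low, high)
        if problems:
            rejected.append((reading, problems))
        else:
            clean.append(reading)
    return clean, rejected
```

Keeping the rejected readings together with the reasons gives data governance something to measure and gives engineers a trail to the faulty device.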

Big Data supports Industrial IoT

• Big Data technologies to support data processing needs

• Moore’s law for semiconductors and reduction in cost for computation devices

• Advancements in network connectivity through WiFi, Bluetooth and ZigBee protocols

[Diagram: Big Data & Analytics, Semiconductor Advancements, Networking and Connectivity]

Network and SOA Consideration

[Diagram layers: Data Origination, Connectivity Layer, Data Persistence and Processing, Consumption and Apps Layer]

[Diagram components: Sensors, Cameras, Computer Terminals, Robotics Operators, Network, Big Data Technologies, SOA Integration Layer, ERP System, Operations Management, Engineering System]

Conceptual / Simplistic View

[Diagram components: Business Systems, Relational Database, NoSQL, Enterprise Hadoop Cluster, Consumption]

Detailed View

[Diagram, source systems: ERP Systems, Supplier Systems, CRM, Mainframe / Legacy Systems]

[Diagram, processing and storage: Real-time Proc., Batch Proc., NoSQL, Log DB, Enterprise Data Warehouse, Data Marts, In Memory & Columnar Database]

[Diagram, consumption: Reports & Dashboards, Ad-hoc Analysis and Advanced Analytics, Real-time Visualization, Data Services]

[Diagram, cross-cutting: Enterprise Data Governance and MDM]
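
The detailed view lists both real-time and batch processing. A minimal sketch of how one incoming event can feed both paths follows. The `Pipeline` class, the alert rule, and the per-node average are hypothetical stand-ins chosen for illustration; the presentation does not specify how the two paths are wired together.

```python
from statistics import mean


class Pipeline:
    """Toy ingestion pipeline with a real-time path and a batch path."""

    def __init__(self, alert_limit):
        self.alert_limit = alert_limit
        self.alerts = []   # real-time path output (e.g. live visualization)
        self.log = []      # append-only store read later by batch jobs

    def ingest(self, node_id, value):
        # Every event is persisted for later batch processing...
        self.log.append((node_id, value))
        # ...and also inspected immediately on the real-time path.
        if value > self.alert_limit:
            self.alerts.append((node_id, value))

    def run_batch(self):
        """Batch job: average value per node over everything logged so far."""
        by_node = {}
        for node_id, value in self.log:
            by_node.setdefault(node_id, []).append(value)
        return {node_id: mean(values) for node_id, values in by_node.items()}


pipeline = Pipeline(alert_limit=90.0)
for node_id, value in [("a", 80.0), ("a", 100.0), ("b", 50.0)]:
    pipeline.ingest(node_id, value)
print(pipeline.alerts)       # [('a', 100.0)]
print(pipeline.run_batch())  # {'a': 90.0, 'b': 50.0}
```

The real-time path answers "what needs attention now", while the batch path answers "what happened overall"; reports and dashboards would typically read from the latter.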

Concluding Thoughts…

Questions / Discussion…
