Industrial IoT is a Big Data Problem!
Architectural Perspective for Industrial IoT… with a focus on data management.
M. Ahmad Shahzad
Principal IT Architect
LinkedIn: www.linkedin.com/in/ahmads
Twitter: shaxami
IoT is a Big Data Problem!
What is Industrial IoT and why it's a big data problem
Detailed Architecture
Architectural Principles
Conceptual Architecture
So, what is Big Data?
Industrial IoT
• Internet of Things: data perspective
Connectivity of physical-world objects as an integrated network
Next gen of M2M applications / RFID-based systems
Each node in the network can be uniquely identified
End nodes may be dumb sensors or active computation devices
Information is captured without the need for a human agent
Data / Information flow is two-way
Industrial IoT applies these ideas with a business focus
Industrial IoT and Big Data symbiotic relationship
Industrial Internet of Things presents a perfect use case for Big Data
Internet of Things | Big Data
Billions of sensors, node computers, and video/audio/thermal capturing devices… generating large data sets | Volume
Very high sampling rate for sensors and end devices | Velocity
Different devices and protocols for capturing and transmitting information | Variety
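A rough back-of-the-envelope estimate shows how sensor count and sampling rate turn into data volume. All figures below (number of sensors, samples per second, bytes per sample) are hypothetical assumptions chosen for illustration; they are not numbers from this deck.

```python
def daily_volume_bytes(sensors, samples_per_second, bytes_per_sample):
    """Estimate raw bytes generated per day by a fleet of sensors."""
    seconds_per_day = 24 * 60 * 60
    return sensors * samples_per_second * bytes_per_sample * seconds_per_day


# Hypothetical plant: 10,000 sensors, 100 samples/s each, 16 bytes per sample.
volume = daily_volume_bytes(10_000, 100, 16)
print(f"{volume / 1e12:.2f} TB per day")  # prints "1.38 TB per day"
```

Even with these modest assumed figures, the raw feed exceeds a terabyte per day before any video or audio is counted.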
Big Data
• What is Big Data
Variety, Velocity, and Volume
Datasets too big for traditional technologies
Structured and unstructured data sets
Gigabytes, Terabytes, Petabytes, Exabytes, and whatever comes next
There is no single technology
Evolution of Big Data
1950 1960 1970 1980 1990 2000 2010 ….
• Computer systems w/ punch cards
• Mainframes
• Network databases (IMS)
• Relational database
• Data warehouses
• Data marts
• OLAP/MOLAP systems
• Hadoop / NoSQL / Columnar / Graph etc.
Big Data Technologies
• Relational Databases:
Rows and columns
SQL for data manipulation and access
Database layer separate from application layer
Transactional systems like ERP
Examples are Oracle, IBM DB2, SQL Server, Teradata
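The points above can be sketched with Python's built-in sqlite3 module, used here only as a small stand-in for the enterprise products listed. The `work_order` table and its contents are hypothetical.

```python
import sqlite3

# In-memory database: the data lives in rows and columns with a fixed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work_order ("
    "id INTEGER PRIMARY KEY, machine TEXT NOT NULL, status TEXT NOT NULL)"
)

# The application layer talks to the database layer only through SQL.
conn.executemany(
    "INSERT INTO work_order (machine, status) VALUES (?, ?)",
    [("press-01", "open"), ("press-02", "closed"), ("press-01", "closed")],
)
conn.commit()

rows = conn.execute(
    "SELECT machine, COUNT(*) FROM work_order "
    "WHERE status = 'closed' GROUP BY machine ORDER BY machine"
).fetchall()
print(rows)  # [('press-01', 1), ('press-02', 1)]
```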
Big Data Technologies
• Data Warehouses and Data Marts
Separate from operational databases
Based on Relational database technology
Dimensional and Star Schema data models
Batch Integration (ETL)
Primary consumers were reporting, dashboards, scorecards
Other storage mechanisms included: OLAP and MOLAP
BI tools like Cognos, BusinessObjects (BobJ), and Tableau expose the data
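A star schema can be illustrated with a minimal sketch: one fact table joined to dimension tables and aggregated for a report. The table names, columns, and values are hypothetical, and sqlite3 stands in for a real warehouse platform.

```python
import sqlite3

dw = sqlite3.connect(":memory:")
dw.executescript("""
-- Dimension tables describe the "who/what/when" of each measurement.
CREATE TABLE dim_machine (machine_key INTEGER PRIMARY KEY, machine_name TEXT, plant TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT, month TEXT);

-- The fact table holds the measures, keyed by the dimensions.
CREATE TABLE fact_production (
    machine_key INTEGER REFERENCES dim_machine(machine_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    units_produced INTEGER
);

INSERT INTO dim_machine VALUES (1, 'press-01', 'Plant A'), (2, 'press-02', 'Plant B');
INSERT INTO dim_date VALUES (20240101, '2024-01-01', '2024-01'),
                            (20240102, '2024-01-02', '2024-01');
INSERT INTO fact_production VALUES (1, 20240101, 120), (1, 20240102, 130), (2, 20240101, 90);
""")

# A typical reporting query: aggregate the facts by dimension attributes.
report = dw.execute("""
    SELECT m.plant, d.month, SUM(f.units_produced)
    FROM fact_production f
    JOIN dim_machine m ON m.machine_key = f.machine_key
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY m.plant, d.month
    ORDER BY m.plant
""").fetchall()
print(report)  # [('Plant A', '2024-01', 250), ('Plant B', '2024-01', 90)]
```

In a real warehouse the inserts would arrive through a batch ETL job rather than inline statements.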
Diagram labels: Enterprise Data Warehouse, Data Marts, Cubes
Big Data Technologies
• NoSQL:
Not a relational database
Key-Value Pairs
Document-stores
Graph-based
Strict ACID compliance is often relaxed in favor of scale and availability
Interaction is based on non-SQL languages and APIs
Horizontally scalable
Examples are MongoDB, Cassandra, Couchbase, Neo4j, and many more
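The three data models named above can be compared side by side using plain Python structures. This is only a sketch of the shapes of the data; it does not use or imitate the client API of any of the products listed, and all keys and values are hypothetical.

```python
# Key-value: an opaque value looked up by a single key.
kv_store = {"sensor:42:last_reading": "71.3"}

# Document store: self-describing records; fields may differ per document.
documents = [
    {"_id": 1, "type": "sensor", "tags": ["temperature", "line-3"], "location": {"plant": "A"}},
    {"_id": 2, "type": "camera", "resolution": "1080p"},
]
sensors = [doc for doc in documents if doc["type"] == "sensor"]

# Graph: nodes and the relationships between them (adjacency list).
graph = {"sensor-42": ["gateway-7"], "gateway-7": ["plant-A"], "plant-A": []}


def reachable(graph, start):
    """Return every node reachable from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


print(kv_store["sensor:42:last_reading"])
print([doc["_id"] for doc in sensors])
print(sorted(reachable(graph, "sensor-42")))
```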
Big Data Technologies
• Hadoop:
Massively parallel system based on cheap commodity servers
Hadoop = HDFS + MapReduce
Historically batch-oriented; lately real-time processing is becoming a reality
Hadoop ecosystem: Spark, Storm, YARN, Flume, Kafka, Sqoop, HBase etc.
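The MapReduce idea can be sketched in a few lines of single-process Python. This is a toy illustration of the map, shuffle, and reduce phases, not Hadoop's actual API; on a real cluster each phase runs in parallel across many servers. The sensor readings are hypothetical.

```python
from collections import defaultdict


def map_phase(records):
    """Emit (key, value) pairs; here the key is the sensor id."""
    for sensor_id, value in records:
        yield sensor_id, value


def shuffle(pairs):
    """Group all values that share a key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's values to one result; here, the maximum reading."""
    return {key: max(values) for key, values in groups.items()}


readings = [("s1", 20.5), ("s2", 18.0), ("s1", 22.1), ("s2", 17.4)]
max_by_sensor = reduce_phase(shuffle(map_phase(readings)))
print(max_by_sensor)  # {'s1': 22.1, 's2': 18.0}
```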
Big Data Technologies
• Distributed storage (HDFS)
• Distributed processing (MapReduce)
• Metadata services (HCatalog)
• Batch processing (Hive, MapReduce, Spark)
• Data integration services (API, Sqoop, Flume, NFS)
• Management and monitoring (ZooKeeper, Cloudera Manager)
• Workflow and scheduling (Oozie)
• Interactive query (Impala, Spark SQL)
• Stream processing (Spark, Storm)
• Workload management (YARN)
• Real-time processing (Kafka)
• Non-relational database (HBase)
Architectural Guiding Principles
Data and Process locality
Industry standards and protocols
Multiple tools and technologies
Data quality issues will exist… plan to handle them
Importance of MDM and Data Governance
Data Plan is important
Security Plan and Architecture
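As one small illustration of the data quality principle, incoming readings can be checked before they reach downstream systems. This is a minimal sketch under assumed rules (a missing value or an out-of-range value is rejected); the field names and range limits are hypothetical.

```python
def clean_readings(readings, low, high):
    """Split readings into valid and rejected lists based on a value range."""
    valid, rejected = [], []
    for reading in readings:
        value = reading.get("value")
        if value is None or not (low <= value <= high):
            rejected.append(reading)  # keep rejects for later inspection
        else:
            valid.append(reading)
    return valid, rejected


raw = [
    {"sensor": "t1", "value": 21.5},
    {"sensor": "t1", "value": None},   # dropped sample
    {"sensor": "t2", "value": 999.0},  # outside the assumed physical range
]
valid, rejected = clean_readings(raw, low=-40.0, high=125.0)
print(len(valid), len(rejected))  # 1 2
```

Keeping the rejected records, rather than silently discarding them, leaves room for governance and root-cause analysis.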
Big Data supports Industrial IoT
• Big Data technologies to support data processing needs
• Moore's law for semiconductors and reduction in cost for computation devices
• Advancements in network connectivity through WiFi, Bluetooth and ZigBee protocols
Diagram labels: Big Data & Analytics, Semiconductor Advancements, Networking and Connectivity
Network and SOA Consideration
Diagram layers: Consumption and Apps Layer, Data Persistence and Processing, Connectivity Layer, Data Origination
Diagram components: SOA Integration Layer, ERP System, Operations Management, Big Data Technologies, Engineering System, Sensors, Cameras, Computer Terminals, Robotics Operators, Network
Conceptual / Simplistic View
Diagram components: NoSQL, Business Systems, Relational Database, Consumption, Enterprise Hadoop Cluster
Detailed View
Diagram components: Real-time Proc., NoSQL, Log DB, ERP Systems, Supplier Systems, CRM, Mainframe / Legacy Systems, Batch Proc., Enterprise Data Warehouse, Data Marts, Enterprise Data Governance and MDM, In Memory & Columnar Database, Reports & Dashboards, Ad-hoc Analysis and Advanced Analytics, Real-time Visualization, Data Services
Questions / Discussion…