LEVERAGING DATA LAKES TO MAXIMIZE IOT...


1 © Copyright 2015 EMC Corporation. All rights reserved.

LEVERAGING DATA LAKES TO MAXIMIZE IOT VALUE
IOT: WHERE DOES ALL THE DATA GO?

MY BACKGROUND

• I'm an Engineer
– I like to solve problems by building things
– BS in Computer Science
• 12 years industry experience
– 12 years writing code for money
– 8 years developing Data Lake storage systems
• Expertise
– Storage
– Network Protocols
– Computer Security

BIG DATA IS GETTING BIGGER

1.8 trillion gigabytes of data was created in 2011*
• More than 90% is unstructured data
• Quantity doubles every 2 years

[Chart: GB of data (in billions), axis from 0 to 10,000, for 2005, 2010, and 2015; series: structured data, unstructured data]

* Source: IDC 2011

©2014 Cloudera, Inc. All rights reserved.
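
The "doubles every 2 years" claim can be turned into a quick projection. This is a minimal sketch that assumes steady doubling from the slide's 2011 figure (1.8 trillion GB, which is 1.8 zettabytes); the function name and the constant-rate assumption are illustrative and do not come from IDC.

```python
def projected_data_zb(year, base_year=2011, base_zb=1.8, doubling_years=2.0):
    """Project data created per year, assuming steady doubling.

    base_zb: 1.8 trillion GB = 1.8 zettabytes (the slide's 2011 figure).
    doubling_years: assumed constant doubling period (an idealization).
    """
    return base_zb * 2 ** ((year - base_year) / doubling_years)


if __name__ == "__main__":
    for year in (2011, 2013, 2015, 2020):
        print(f"{year}: ~{projected_data_zb(year):.1f} ZB")
```

Under this assumption the yearly volume quadruples in four years, which is why capacity planning based on today's footprint goes stale quickly.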


THE INTERNET OF THINGS IS EXPLODING

The impact of the IoT is already visible in the digital universe. Data just from embedded systems – the sensors and systems that monitor the physical universe – already accounts for 2% of the digital universe. By 2020 that will rise to 10%.


IOT CREATING NEW OPPORTUNITIES FOR BUSINESSES

ASSUMPTIONS

1. IoT produces a lot of data
2. Which will continue to grow exponentially
3. Which comes from lots of things in lots of forms
4. And has tremendous value, but it must be analyzed
5. And the full value is not known up front

Where does all the data go?

WHAT IS A DATA LAKE?

“A single scalable repository, storing high fidelity data in its native format, that can be arbitrarily queried.”

CAPACITY
SINGLE SCALABLE REPOSITORY

• Scalable is more important than big or small
– You have some data today; you will need more capacity in the future
• Migrating data is terrible!
– Wasted time and effort
• There will always be upper limits
– But limits should be infinite-in-practice for your workflow
– Capacity should be a matter of budget, not of capability

INGEST
HIGH FIDELITY NATIVE FORMAT

• IoT means lots of devices, from different vendors
– Multiple data sources
– Multiple data formats
• Different operating systems
– Linux, Windows, iOS, Android, QNX, VxWorks, custom

INGEST
HIGH FIDELITY NATIVE FORMAT

• Different file access protocols
– SMB, NFS, FTP: files and directory trees
– Object, HTTP, REST: buckets, containers, and objects
• Authentication
– Local users
– Active Directory
– LDAP
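
One way to picture multi-protocol ingest is a single stored copy that is reachable both by a file-style path and by an object-style bucket and key. The sketch below is a toy in-memory model, not the API of any real storage product; `MultiProtocolStore`, its method names, and the "bucket equals top-level directory" mapping are all assumptions made for illustration.

```python
class MultiProtocolStore:
    """Toy model: one copy of the data, two naming schemes (hypothetical)."""

    def __init__(self):
        self._blobs = {}  # canonical path -> bytes

    # File-style access (the view SMB/NFS/FTP clients have): directory trees.
    def write_file(self, path, data):
        self._blobs[self._normalize(path)] = bytes(data)

    def read_file(self, path):
        return self._blobs[self._normalize(path)]

    # Object-style access (the view HTTP/REST clients have): bucket + key.
    # Assumption: a bucket maps to a top-level directory.
    def put_object(self, bucket, key, data):
        self.write_file(f"/{bucket}/{key}", data)

    def get_object(self, bucket, key):
        return self.read_file(f"/{bucket}/{key}")

    @staticmethod
    def _normalize(path):
        # Collapse duplicate slashes so both views resolve to the same entry.
        return "/" + "/".join(part for part in path.split("/") if part)


store = MultiProtocolStore()
store.write_file("/sensors/thermostat-17/2015-06-01.csv", b"ts,temp\n0,21.5\n")
# The same bytes are visible through the object view without a second copy.
data = store.get_object("sensors", "thermostat-17/2015-06-01.csv")
```

The point of the sketch is that devices can write with whatever protocol they already speak, while consumers read the same data through a different one.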

ANALYSIS (TRADITIONAL)
ARBITRARILY QUERIED

Use a Database
1. Build a database with a rigid schema
2. Build an application to write data to that schema
3. Run queries

Problems
• Tight coordination needed between all actors
• Full understanding of your goals needed up front
• Limited data fidelity
– Very structured, but not very broad
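
The three steps and the fidelity problem can be sketched with Python's built-in `sqlite3` module. This is an illustrative example only; the table, column names, and sample reading are made up for the purpose.

```python
import sqlite3

# Step 1: a rigid schema decided up front (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "device_id TEXT NOT NULL, ts INTEGER NOT NULL, temp_c REAL NOT NULL)"
)


def ingest(reading):
    """Step 2: the application writes only the fields the schema knows."""
    conn.execute(
        "INSERT INTO readings (device_id, ts, temp_c) VALUES (?, ?, ?)",
        (reading["device_id"], reading["ts"], reading["temp_c"]),
    )


# A device that also reports humidity: the extra field has nowhere to go.
ingest({"device_id": "t-17", "ts": 1433116800, "temp_c": 21.5, "humidity": 0.43})

# Step 3: queries can only ask about what was kept.
rows = conn.execute("SELECT device_id, temp_c FROM readings").fetchall()
columns = [col[1] for col in conn.execute("PRAGMA table_info(readings)")]
```

The humidity value is silently dropped at write time, so a question about humidity that comes up later cannot be answered from this store.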

ANALYSIS (NEXT GENERATION)
ARBITRARILY QUERIED

Hadoop
• THE way to do Big Data Analytics
• Parallel processing of multiple data sets / formats
• Define schema as the data is queried
• Analyze anything, across all your data
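
"Define schema as the data is queried" can be illustrated without a cluster. The sketch below uses plain Python rather than any Hadoop API; the file names, field names, and sample records are invented for illustration. Raw data stays in its native format, and each query decides how to interpret it.

```python
import csv
import io
import json

# Raw data kept in its native format (sample contents are made up).
RAW_FILES = {
    "vendor_a.jsonl": '{"id": "a1", "temp_c": 21.5, "humidity": 0.43}\n'
                      '{"id": "a2", "temp_c": 19.0}\n',
    "vendor_b.csv": "device,temperature_f\nb1,72.5\n",
}


def read_records(name, text):
    """Parse each file according to its own format, at query time."""
    if name.endswith(".jsonl"):
        for line in text.splitlines():
            if line.strip():
                yield json.loads(line)
    elif name.endswith(".csv"):
        yield from csv.DictReader(io.StringIO(text))


def temperatures_c():
    """Schema applied on read: map each source's fields to (device, temp_c)."""
    for name, text in RAW_FILES.items():
        for rec in read_records(name, text):
            if "temp_c" in rec:
                yield rec["id"], float(rec["temp_c"])
            elif "temperature_f" in rec:
                yield rec["device"], (float(rec["temperature_f"]) - 32) * 5 / 9


result = dict(temperatures_c())
```

Because nothing was discarded at ingest, a later query about humidity only needs a new reader function, not a schema migration.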

WHAT IS A DATA LAKE?

“A single scalable repository, storing high fidelity data in its native format, that can be arbitrarily queried.”

EMC ISILON: SCALE-OUT NAS ARCHITECTURE

[Architecture diagram labels]
• Client/Application Layer: Clients & Applications
• Ethernet Layer: Gig-E / 10 Gig-E network
• Multi-Protocol: SMB, NFS, FTP, HTTP, HDFS for Hadoop, REST for Object
• RESTful API: GET, PUT, POST, DELETE
• OneFS Operating Environment
• Intra-cluster Communication

EMC ISILON: MASSIVELY SCALABLE
MORE SCALABLE THAN TRADITIONAL STORAGE SYSTEMS

• Isilon scales from 16 TB to 50 PB in a single file system, single volume cluster
• Under 60 seconds to scale with no downtime

EMC ISILON: INGEST

[Diagram labels: FILE; HPC, Backup/Archive, Analytics, Mobile, File Shares, IoT]

EMC ISILON: ANALYSIS
SCALE-OUT STORAGE WITH NATIVE HADOOP INTEGRATION

• Only scale-out storage platform with native Hadoop integration
• In-place analytics
– Native integration speeds time to insight
• Certified
– Hortonworks commercial Hadoop vendor integration
• Consulting services
– Map IoT data into storage and devise Hadoop analytics jobs


EMC Public Safety Data Lake

PUBLIC SAFETY DATA TRENDS
NEED FOR SCALABLE DATA REPOSITORIES

• Increased camera counts & longer video retention times
• Body Camera Proliferation
• Expanding City Wide Surveillance Systems
• Evidence Management
• Video Content Analysis has become essential

PUBLIC SAFETY SYSTEMS
STORAGE CAPACITIES PER CAMERA RESOLUTION

Capacity (TB) for 100 cameras, continuous record @ 15 fps

Resolution   Avg. bandwidth   15 days   30 days   45 days   60 days
4CIF         1.5 Mbps         23.3      46.6      70        94
1080p        6 Mbps           97.2      190       290       390
3 MPixel     7.7 Mbps         120       240       360       480
5 MPixel     9.6 Mbps         160       320       480       640
10 MPixel    21 Mbps          350       700       1050      1400

Camera Resolutions and Average Bandwidths
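
The capacities on this slide follow from bitrate × cameras × retention time. The sketch below reproduces that arithmetic under stated assumptions: decimal units (1 Mbps = 1,000,000 bits/s, 1 TB = 10^12 bytes), a constant average bitrate, and no file system or protection overhead. The function name is illustrative. Results land close to the slide's figures, which appear to be rounded.

```python
def storage_tb(bitrate_mbps, cameras, days):
    """Raw capacity in decimal terabytes for continuous recording."""
    bytes_per_second = bitrate_mbps * 1_000_000 / 8
    seconds = days * 24 * 60 * 60
    return bytes_per_second * seconds * cameras / 1e12


if __name__ == "__main__":
    profiles = [("4CIF", 1.5), ("1080p", 6), ("3 MPixel", 7.7),
                ("5 MPixel", 9.6), ("10 MPixel", 21)]
    for label, mbps in profiles:
        row = [round(storage_tb(mbps, 100, d), 1) for d in (15, 30, 45, 60)]
        print(label, row)
```

For example, 100 cameras at 6 Mbps retained for 15 days come to 97.2 TB, matching the 1080p entry; doubling either the retention period or the camera count doubles the requirement.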

Public Safety Data Lake

Pools of Data
• Body Cameras
• CCTV
• Drones
• Satellite Images
• License Plate Capture
• Audio
• In-Car Video
• Internet of Things
• Evidence

Public Safety Data Lake

• Analytics
• Security
• Application Integration

Data sources
• Body Cameras
• CCTV
• Drones
• Satellite Images
• License Plate Capture
• Audio
• In-Car Video
• Internet of Things
• Evidence

Brisbane City Council
Australia’s largest council turns to EMC to protect employees and visitors to City Hall

“We needed a scalable storage architecture to support the CitySafe project, and the single point of management and load balancing capabilities of Isilon made it a perfect fit for this project.”
PAUL RISHMAN, Corporate Security Manager

Challenge
• Store and make available evidence-quality video from cameras at Brisbane City Council’s restored City Hall building

Solution
• EMC Isilon NL Series
• EMC VNXe

Results
• Delivered 100% availability
• Provided the council and police with high-resolution video and images which can be used in court
• Standardized on a world-class enterprise storage platform with robust support

Applications
• Genetec Security Centre

Norman Oklahoma Police Department
Oklahoma police force fights crime with EMC Isilon and MediaSolv Evidence Management

“The challenge for IT is making sure investigators can quickly get their video no matter how big the data stores have become. A solution that scales without losing performance is imperative, and Isilon has definitely met both those needs.”
KARI MADDEN, Network Support Supervisor

Challenge
• Growing use of video surveillance driving massive data growth
• Legacy storage nearing maximum capacity

Solution
• EMC Isilon X-series
• EMC Isilon SmartQuotas
• EMC Isilon SmartConnect
• VMware vSphere

Results
• Gained scalability and performance to pursue leading-edge video projects
• Improved law enforcement with fast, reliable access to evidentiary video
• Simplified control over storage usage across different video systems
• Increased efficiency of managing fast-growing video assets

Applications
• Genetec
• MediaSolv
• Microsoft SQL Server
