Deep Learning using Spark and DL4J for fun and profit

Page 1: Deep Learning using Spark and DL4J for fun and profit

Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved

Deep Learning using Spark and DL4J for fun and profit

Adam Gibson and Dhruv Kumar

2015, Version 1.0

Page 2: Deep Learning using Spark and DL4J for fun and profit

Who are we?

Adam Gibson
- Co-founder of Skymind
- Wrote Deeplearning4j, ND4J

Dhruv Kumar
- Sr Solutions Architect, HWX
- MS UMass, Mahout, ASF

Page 3: Deep Learning using Spark and DL4J for fun and profit

In this talk

- What’s Deep Learning?
- Architectures
- Implementation and Libraries in Real Life
- Demo!

Page 4: Deep Learning using Spark and DL4J for fun and profit

Deep Learning

• One of the many pattern recognition techniques in Data Science

• Excels at rich media applications:
  • Image recognition
  • Speech translation
  • Voice recognition

• Loosely inspired by human brain models

• Synonymous with Artificial Neural Networks, Multi Layer Networks

Page 5: Deep Learning using Spark and DL4J for fun and profit

Enterprise use cases

Page 6: Deep Learning using Spark and DL4J for fun and profit

Doing this in real life for enterprise

Page 7: Deep Learning using Spark and DL4J for fun and profit

Modern Data Applications in Enterprise: Connected, Fast, Intelligent

[Diagram labels: Internet of Anything; HDF for data in motion; HDP for data at rest; perishable insights; historical insights; actionable intelligence; modern data apps]

Page 8: Deep Learning using Spark and DL4J for fun and profit

How do we realize MDA in a Hadoop Centric World?

[Architecture diagram. Inputs: raw network stream, network metadata stream, data stores, syslog, raw application logs, other streaming telemetry. Components: HDF, Storm, Spark, Hadoop (HDFS, HBase, Hive, SOLR, YARN), service management / workflow, SIEM]

Page 9: Deep Learning using Spark and DL4J for fun and profit

[Cluster layout diagram. Labels: Source 1 to Source N; Clients 1 and 2; NiFi nodes (NiFi 1 and 2); edge nodes; Storm 1 to 3 and Kafka 1 to 3; master nodes (Master 1 to 5); worker nodes (DataNodes numbered up to 32, HBase 1 to 8 alongside DataNodes 1 to 8); groupings HDF and HDP; World; Azure]

Page 10: Deep Learning using Spark and DL4J for fun and profit

Detailed Reference Architecture

[Architecture diagram. Source data: server logs, application logs, firewall logs, CRM/ERP, sensor. High speed ingest: HDF, Flume, Kafka, SQOOP (stream to HDF, forward to Storm, sink to HDFS). Real-time: Storm / Spark Streaming, transform, HBase event enrichment, bolt to HDFS, real time storage (HBase/Phoenix), alerts, JMS alerts. Batch and interactive: HDFS, Hive, Pig, Spark, Spark-ML, Spark-Thrift, HiveServer, iterative ML, machine learning models. Outputs: dashboard (Silk), UI framework, reporting, BI tools]

Page 11: Deep Learning using Spark and DL4J for fun and profit

For Model Building: Typical Workflow

1. Ingest training data and store it
2. Split data set into: training, testing and validation sets
3. Vectorize and extract features to go into next step
4. Architect multi layer network, initialize
5. Feed data and train
6. Test and validate
7. Repeat steps 4 and 5 until the desired results are reached
8. Store model
9. Put model in app, start generalizing on real data
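Step 2 of the workflow above can be sketched in a few lines. This is a minimal illustration in plain Python, not DL4J code: the function name `split_dataset`, the 70/15/15 percentages and the fixed seed are all assumptions chosen for the example.

```python
import random


def split_dataset(records, train_pct=70, test_pct=15, seed=42):
    """Shuffle records, then cut them into training, testing and validation sets.

    The percentages and the seed are illustrative assumptions. Whatever is
    left after the training and testing slices becomes the validation set.
    """
    if train_pct <= 0 or test_pct < 0 or train_pct + test_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_test = len(shuffled) * test_pct // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


# Hypothetical data set: 100 record ids standing in for real training records.
train, test, validation = split_dataset(range(100))
print(len(train), len(test), len(validation))
```

Shuffling before slicing matters when the stored data is ordered (for example by time or by class), since an unshuffled cut could leave one of the sets unrepresentative.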

Page 12: Deep Learning using Spark and DL4J for fun and profit

So what do you get?

1. Ingest training data and store it using NiFi or other ingest tools
2. Split data set into: training, testing and validation sets
3. Vectorize and extract features to go into next step
4. Architect multi layer network, initialize
5. Feed data and train
6. Test and validate
7. Repeat steps 4 and 5 until the desired results are reached
8. Store model
9. Put model in app, start generalizing on real data

Steps 2, 3, 4 and 5: Use libraries such as Deeplearning4j

Page 13: Deep Learning using Spark and DL4J for fun and profit

Deeplearning4j Architecture

Page 14: Deep Learning using Spark and DL4J for fun and profit

DL4J: Canova for Vectorization and Ingest

• Canova uses an input/output format system (similar to the InputFormat/OutputFormat system in Hadoop MapReduce)

• Supports all major types of input data (text, CSV, audio, image and video)

• Can be extended for specialized input formats

• Connects to Kafka
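To make "vectorization" concrete, here is a rough sketch of what turning CSV records into numeric vectors involves. It uses only the Python standard library and does not reproduce Canova's API. The function `vectorize_csv`, the column names and the sample rows are hypothetical.

```python
import csv
import io


def vectorize_csv(text, label_column):
    """Turn CSV text into (feature_vector, one_hot_label) pairs.

    Assumes a header row, numeric feature columns and a single
    categorical label column.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    # Sort the distinct labels so each class gets a stable index.
    labels = sorted({row[label_column] for row in rows})
    label_index = {name: i for i, name in enumerate(labels)}
    vectors = []
    for row in rows:
        features = [float(v) for k, v in row.items() if k != label_column]
        one_hot = [0.0] * len(labels)
        one_hot[label_index[row[label_column]]] = 1.0
        vectors.append((features, one_hot))
    return labels, vectors


# Hypothetical sample input, for illustration only.
SAMPLE = """sepal_length,sepal_width,species
5.1,3.5,setosa
7.0,3.2,versicolor
6.3,3.3,virginica
"""

labels, vectors = vectorize_csv(SAMPLE, "species")
for features, one_hot in vectors:
    print(features, one_hot)
```

Text, audio, image and video inputs need their own feature extraction, but the output has the same shape: one numeric vector per record plus an encoded label.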

Page 15: Deep Learning using Spark and DL4J for fun and profit

ND4J:

• N-dimensional vector library
• Scientific computing for JVM
• DL4J uses it to do linear algebra for backpropagation
• Supports GPUs via CUDA and native via jblas
• Deploys on Android
• DL4J code remains unchanged whether using GPU or CPU
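The "linear algebra for backpropagation" bullet can be illustrated with a tiny network written in plain Python lists. This is a sketch under stated assumptions, not ND4J or DL4J code: the network shape (one tanh hidden layer, one linear output), the squared-error loss, the starting weights and the learning rate are all chosen for the example.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def forward(w1, w2, x):
    """Forward pass: tanh hidden layer, then a linear scalar output."""
    hidden = [math.tanh(z) for z in matvec(w1, x)]
    output = sum(w * h for w, h in zip(w2, hidden))
    return hidden, output


def loss(w1, w2, x, target):
    """Squared-error loss for a single example."""
    _, output = forward(w1, w2, x)
    return 0.5 * (output - target) ** 2


def backprop(w1, w2, x, target):
    """Return the gradients of the loss with respect to w1 and w2."""
    hidden, output = forward(w1, w2, x)
    d_output = output - target
    grad_w2 = [d_output * h for h in hidden]
    # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
    d_hidden = [d_output * w * (1.0 - h * h) for w, h in zip(w2, hidden)]
    # Outer product of the hidden-layer error and the input vector.
    grad_w1 = [[d * xi for xi in x] for d in d_hidden]
    return grad_w1, grad_w2


# Illustrative weights, input and target (assumptions for this example).
w1 = [[0.5, -0.3], [0.2, 0.8]]
w2 = [0.7, -0.4]
x, target = [1.0, 2.0], 1.0
learning_rate = 0.1

loss_before = loss(w1, w2, x, target)
grad_w1, grad_w2 = backprop(w1, w2, x, target)

# One gradient-descent step on both weight sets.
new_w1 = [[w - learning_rate * g for w, g in zip(row, grad_row)]
          for row, grad_row in zip(w1, grad_w1)]
new_w2 = [w - learning_rate * g for w, g in zip(w2, grad_w2)]
loss_after = loss(new_w1, new_w2, x, target)
print(loss_before, loss_after)
```

Every operation here is a matrix-vector product, an element-wise function or an outer product. An N-dimensional array library runs the same operations on whole arrays, which is why the network code itself does not need to know whether the backend is a CPU or a GPU.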

Page 16: Deep Learning using Spark and DL4J for fun and profit

How to choose a Neural Net in DL4J core?

Page 17: Deep Learning using Spark and DL4J for fun and profit

Demo!

Page 18: Deep Learning using Spark and DL4J for fun and profit

Thank You

hortonworks.com