DIGITS DEEP LEARNING GPU TRAINING SYSTEM Allison Gray

PRACTICAL DEEP LEARNING EXAMPLES

Image Classification, Object Detection, Localization, Action Recognition, Scene Understanding

Speech Recognition, Speech Translation, Natural Language Processing

Pedestrian Detection, Traffic Sign Recognition

Breast Cancer Cell Mitosis Detection, Volumetric Brain Image Segmentation

WHAT IS DEEP LEARNING?

Image “Volvo XC90”

Image source: “Unsupervised Learning of Hierarchical Representations with Convolutional Deep Belief Networks” ICML 2009 & Comm. ACM 2011. Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Ng.

IMAGE CLASSIFICATION WITH DNN

Training: labeled images (Tree, Cat, Dog) are fed to the deep learning framework. Forward propagation produces a prediction (“turtle”); backward propagation computes a weight update to nudge the prediction from “turtle” towards “dog”. Repeat.

Classification: the trained model labels a new image (“dog”).
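The forward and backward propagation cycle described above can be sketched in plain Python. This is a minimal illustration, not how any particular framework implements training: the single-layer softmax classifier, the three-number feature vector, the starting weights, and the learning rate are all made-up assumptions, chosen so that the first prediction is "turtle" for an image whose true label is "dog".

```python
import math

LABELS = ["turtle", "dog", "cat"]

def softmax(scores):
    # Subtract the max score for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(weights, features):
    # Forward propagation: one score per class, turned into probabilities.
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)

def backward(weights, features, probs, target, lr):
    # Backward propagation for softmax + cross-entropy:
    # the gradient of the loss w.r.t. each class score is (prob - one_hot).
    updated = []
    for c, row in enumerate(weights):
        grad = probs[c] - (1.0 if c == target else 0.0)
        updated.append([w - lr * grad * x for w, x in zip(row, features)])
    return updated

def predict(weights, features):
    probs = forward(weights, features)
    return LABELS[probs.index(max(probs))]

# Hypothetical image features and starting weights (assumptions for illustration).
features = [1.0, 0.5, -0.2]
weights = [[0.3, 0.1, 0.0], [0.0, 0.0, 0.0], [-0.1, 0.0, 0.1]]
target = LABELS.index("dog")

first_guess = predict(weights, features)
dog_prob = []
for _ in range(50):  # "Repeat"
    probs = forward(weights, features)
    dog_prob.append(probs[target])
    weights = backward(weights, features, probs, target, lr=0.5)
final_guess = predict(weights, features)
print(first_guess, "->", final_guess)
```

Each pass nudges the weights so the probability assigned to "dog" rises, which is the "compute weight update" step in the diagram. A real DNN has many layers and millions of images, but the loop has the same shape.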


IMAGE CLASSIFICATION WITH DNNS

Typical training run

Pick a DNN design

Input thousands to millions of training images spanning 1,000 or more categories

Day to a week of computation

Test accuracy

If bad: modify DNN, fix training set or update training parameters

GPUs and Deep Learning

ImageNet Challenge accuracy: 72% (2010), 74% (2011), 84% (2012), 88% (2013), 93% (2014)

GPUs deliver:
- same or better prediction accuracy
- faster results
- smaller footprint
- lower power

Neural networks and NVIDIA CUDA GPUs: inherently parallel, matrix operations, FLOPS

NVIDIA DEEP LEARNING PLATFORM

DEVELOPMENT and DEPLOYMENT each span Hardware, Systems, and Software.

Products shown: Titan X, Tesla, DIGITS DevBox, cuDNN

Stack shown: Applications, DIGITS Tools, Deep Learning Frameworks, System Management

cuDNN

Accelerates key routines to improve performance of neural net training

Routines for convolution and cross-correlation as well as activation functions

Up to 1.8x faster on AlexNet than a baseline GPU implementation

Integrated into all major Deep Learning frameworks: Caffe, Theano, Torch
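Convolution and cross-correlation differ only in whether the kernel is flipped before it slides over the input. The sketch below shows this in one dimension with plain Python lists. It is not the cuDNN API; the function names and sample values are assumptions made up for illustration.

```python
def cross_correlate(signal, kernel):
    # Slide the kernel over the signal as-is ("valid" positions only).
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def convolve(signal, kernel):
    # Convolution is cross-correlation with the kernel reversed.
    return cross_correlate(signal, kernel[::-1])

signal = [1, 2, 3, 4]
kernel = [1, 0, -1]
xcorr_result = cross_correlate(signal, kernel)
conv_result = convolve(signal, kernel)
print(xcorr_result, conv_result)
```

For a symmetric kernel the two operations give the same result. For a learned kernel the distinction only changes how the weights are stored, which is why deep learning code often uses the two terms interchangeably.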

Speedup with cuDNN over the baseline (GPU):

Caffe (GoogLeNet): 1.0x to 1.6x

Torch (OverFeat): 1.0x to 1.2x

Images Trained Per Day (Caffe AlexNet), in millions of images:

16 Core CPU (E5-2698 v3 @ 2.3GHz / 3.6GHz Turbo): 2.5M

GTX Titan: 18M

Titan Black, cuDNN v1: 23M

Titan X, cuDNN v2: 43M
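The images-per-day figures can be turned into speedups over the CPU baseline with a little arithmetic. The pairing of each value with a device below is an assumption based on the order in which the chart labels and values appear.

```python
# Millions of images trained per day (Caffe AlexNet), as read from the chart.
# The device-to-value pairing is assumed from the order of the labels.
images_per_day_millions = {
    "16 Core CPU": 2.5,
    "GTX Titan": 18,
    "Titan Black, cuDNN v1": 23,
    "Titan X, cuDNN v2": 43,
}

baseline = images_per_day_millions["16 Core CPU"]

# Speedup of each configuration relative to the 16-core CPU.
speedup = {
    name: round(value / baseline, 1)
    for name, value in images_per_day_millions.items()
}

for name, factor in speedup.items():
    print(f"{name}: {factor}x")
```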


GPU-ACCELERATED DEEP LEARNING FRAMEWORKS

CAFFE
  Domain: Deep Learning Framework
  cuDNN: R2
  License: BSD-2
  Interface(s): Text-based definition files, Python, MATLAB

TORCH
  Domain: Scientific Computing Framework
  cuDNN: R2
  License: GPL
  Interface(s): Python, Lua, MATLAB

THEANO
  Domain: Math Expression Compiler
  cuDNN: R2
  License: BSD
  Interface(s): Python

CUDA-CONVNET2
  Domain: Deep Learning Application
  cuDNN: --
  License: Apache 2.0
  Interface(s): C++

KALDI
  Domain: Speech Recognition Toolkit
  cuDNN: --
  Multi-GPU: (nnet2)
  Multi-CPU: (nnet2)
  License: Apache 2.0
  Interface(s): C++, Shell scripts

Rows also compared: Multi-GPU, Multi-CPU, Embedded (TK1)

http://developer.nvidia.com/deeplearning

PRODUCTION PIPELINE

Development: Train (Network, Solver, Dashboard)

Deployment: Deploy (Model) for Classification, Detection, Segmentation

NVIDIA® DIGITS™

Interactive Deep Learning GPU Training System

Dashboard, real-time monitoring, network visualization

NVIDIA® DIGITS™

Interactive Deep Learning GPU Training System

Data Scientists & Researchers:

Quickly design the best deep neural network (DNN) for your data

Visually monitor DNN training quality in real time

Manage training of many DNNs in parallel on multi-GPU systems

Open source!

https://developer.nvidia.com/digits

DIGITS WORKFLOW

Main Console

CREATE YOUR DATABASE

Create your dataset

CONFIGURE YOUR MODEL

Choose your database

Configure your network: choose a default network, modify one, or create your own

Start training

CREATING THE DATABASE

Create your dataset

Image parameter options

Insert the path to your train and validation set

OR DIGITS can automatically create your training and validation set

OR use a URL list

CREATE THE DATABASE

Insert the path to your images here

Image directory on host machine, with images grouped by category: truck, cars, person, house, planes, dogs, cats, bikes

DIGITS creates your training and validation set for you.

CREATE THE DATABASE

Create Training and Validation Set

Images from each category (truck, cars, person, house, planes, dogs, bikes, cats) are divided between Training and Validation.
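Dividing a labeled image collection into training and validation sets can be sketched as below. This is not the DIGITS implementation: the function name, the 25% validation fraction, the fixed seed, and the file names are assumptions for illustration. The split is done per category so that every category appears in both sets.

```python
import random

def split_by_class(images_by_class, val_fraction=0.25, seed=0):
    """Return (train, val) lists of (path, label) pairs, split per category."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, val = [], []
    for label in sorted(images_by_class):
        paths = list(images_by_class[label])
        rng.shuffle(paths)
        n_val = int(len(paths) * val_fraction)
        val += [(p, label) for p in paths[:n_val]]
        train += [(p, label) for p in paths[n_val:]]
    return train, val

# Hypothetical directory listing: one folder of images per category.
images = {
    "cats": [f"cats/{i}.jpg" for i in range(8)],
    "dogs": [f"dogs/{i}.jpg" for i in range(8)],
}
train, val = split_by_class(images, val_fraction=0.25, seed=0)
print(len(train), "training images,", len(val), "validation images")
```

Keeping the validation images out of training is what makes the accuracy reported during training a fair estimate of how the network behaves on unseen data.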

CREATE THE DATABASE

Sets: train, val (example categories: planes, cats)

Apply rotation, color distortion, and noise to the training set
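The three augmentations named above can be sketched on a tiny grayscale image stored as nested lists. This is only an illustration under stated assumptions: the helper names are made up, a fixed 90-degree turn stands in for rotation, and brightness scaling stands in for color distortion. It does not show how DIGITS applies these transforms.

```python
import random

def clamp(value):
    # Keep pixel values inside the 8-bit range.
    return min(255, max(0, round(value)))

def rotate90(img):
    # Rotate a row-major pixel grid 90 degrees clockwise.
    return [list(row) for row in zip(*img[::-1])]

def distort_brightness(img, factor):
    # Simple stand-in for color distortion: scale every pixel.
    return [[clamp(p * factor) for p in row] for row in img]

def add_noise(img, sigma, rng):
    # Add Gaussian noise with standard deviation sigma to every pixel.
    return [[clamp(p + rng.gauss(0, sigma)) for p in row] for row in img]

def augment(img, rng):
    # Return extra training variants derived from one original image.
    return [
        rotate90(img),
        distort_brightness(img, 1.2),
        add_noise(img, 10.0, rng),
    ]

rng = random.Random(0)
image = [[10, 20], [30, 40]]
variants = augment(image, rng)
print(len(variants), "augmented variants")
```

Augmentation is applied only to the training set; the validation set stays untouched so that it still measures performance on unmodified images.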

CREATING THE DATABASE

Training and validation data set information

Job status information displays here: progress, completion, and errors

Category data information is posted


NETWORK CONFIGURATION

Insert your network here

Choose a preconfigured network

OR choose a previous configuration

Start training

Select training dataset

GPU Resources

• Load one of your pretrained networks

• Create your own network

• Customize a network

• Load a pretrained network and fine tune it


NETWORK CONFIGURATION

Select a standard network and start training

OR

Customize a Standard Network

Select training dataset

Make network changes here

Visualize your network

Start training


TRAINING

Download network files

Training status

Accuracy and loss values during training

Learning rate

Classification on the […] with the network snapshots

Classification

Visualize DNN performance in real time

Compare networks


COMPARE RESULTS


PRESENTATION LINEUP

Monday

4 - Introduction to Graph Analytics

Tuesday

2 - Performance Testing in Virtual Environments

3 - Leverage GPUs for Image Processing with ENVI

Wednesday

1 - Accelerating FMV PED Workflows with Real-Time Image Processing

3 - Legion: A CUDA-based Engine for Geospatial Analytics

4 - DL Open House

Thursday

1:30 - Accelerating Visualization and Analytics in Socet GXP with GPUs
