INTRODUCTION TO DEEP LEARNING

Dmytro Mishkin

Czech Technical University in Prague

Clear Research Corporation

ducha.aiki@gmail.com

MY BACKGROUND

CTO of Clear Research. Using deep learning at work since 2014.

PhD student at the Czech Technical University in Prague. Beat deep learning approaches at the VPRiCE Challenge 2015 with classical methods.

Now working fully in DL. The recent paper “All you need is a good init” was added to the Stanford CS231n course.

Kaggler. 9th out of 1049 teams at the National Data Science Bowl.

AGENDA

Why deep learning (DL)? Some applications and motivations

What is the core idea behind DL?

Basics of convolutional networks (CNN)

Practical recommendations for CNN-based image classification. State-of-the-art approaches

Deep Learning libraries overview

How to apply CNNs to different tasks

EC2 hands-on experience with the Cats-vs-Dogs competition. Homework


XKCD. NOT TRUE ANYMORE

DEEP LEARNING APPLICATIONS

AlphaGo :)

Image recognition

Speech Recognition. Cortana, Siri

Translation

Anomaly detection

Fraud detection

Video recognition

Robotics

Recommendation systems

DNA, biology, and more…

ALPHAGO

Mastering the game of Go with deep neural networks and tree search. Silver et al., 2016

IMAGE CLASSIFICATION

Select all dogs. Our assignment…almost :)

State of the art since 2012. Krizhevsky et al., 2012

Superhuman level on ImageNet classification since 2015. He et al., 2015; Szegedy et al., 2015


OBJECT DETECTION

SPEECH RECOGNITION

Cortana

Siri

OK, Google

Figure from Huang et al., 2015.


ANOMALY DETECTION

VIDEO CAPTIONING

Translating Videos to Natural Language Using Deep Recurrent Neural Networks. Venugopalan et al., 2015

TEXT TRANSLATION

From the [Bahdanau et al., 2015] slides at ICLR 2015.

DEEP LEARNING FRAMEWORKS FOR REGULATORY GENOMICS AND EPIGENOMICS

https://www.youtube.com/watch?v=2vpKB3j-OY0


ROBOTICS: NAVIGATION

https://www.youtube.com/watch?v=umRdt3zGgpU

FRAUD DETECTION

As simple classification.

http://www.slideshare.net/0xdata/paypal-fraud-detection-with-deep-learning-in-h2o-presentationh2oworld2014

AGENDA

Why deep learning (DL)? Some applications and motivations

What is the core idea behind DL?

Basics of convolutional networks (CNN)

Practical recommendations for CNN-based image classification. State-of-the-art approaches

Deep Learning libraries overview

How to apply CNNs to different tasks

EC2 hands-on experience with the Cats-vs-Dogs competition. Homework

DL IS NOT THE BEST CHOICE WHEN

You have a small number of heterogeneous (categorical, enumeration-like) features.

E.g., almost all Kaggle competitions:

Given browser, session id, and gender, determine if the customer wants revenge :)

Given some anonymized features, predict the price of a security

Given gender, profession, age, etc., predict insurance risk

How to farm rating on Habr (sorry :)

WHAT IS COMMON IN DL-FRIENDLY TASKS?

It is extremely hard to write an explicit algorithm.

Even if the features are obvious, how do you extract them?

Lots of structured, homogeneous data (images, speech, text).

You can, and have to, transform the input. Could you transform a browser version?

DEEP LEARNING IS HIERARCHICAL REPRESENTATION LEARNING

Quoc V. Le et al., 2011. Building high-level features using large scale unsupervised learning


DEPTH IS ESSENTIAL IN DEEP LEARNING


AGENDA

Why deep learning (DL)? Some applications and motivations

What is the core idea behind DL?

Basics of convolutional networks (CNN)

Practical recommendations for CNN-based image classification. State-of-the-art approaches

Deep Learning libraries overview

How to apply CNNs to different tasks

EC2 hands-on experience with the Cats-vs-Dogs competition. Homework

CONVOLUTIONS? WHY NOT JUST MLP?

http://cs231n.github.io/convolutional-networks/

[Figure: NN vs. CNN]

WHAT IS CONVOLUTION

https://developer.apple.com/library/ios/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations.html

A classical NN applied to an image is a convolution with an image-sized kernel.
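The statement above can be checked numerically. The following is a minimal sketch, assuming a toy 8x8 single-channel image and a single hidden unit (both hypothetical). It uses cross-correlation, which is what CNN libraries usually call "convolution". A kernel as large as the image fits in exactly one position, so the output is a single number equal to the fully connected dot product.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))     # toy single-channel "image"
weights = rng.standard_normal((8, 8))   # one hidden unit: one weight per pixel

# Fully connected view: flatten the image and take a dot product.
fc_output = image.ravel() @ weights.ravel()


def valid_correlation(img, kernel):
    """Slide the kernel over every position where it fits entirely ("valid")."""
    kh, kw = kernel.shape
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out


# Convolutional view: the image-sized kernel fits in one place only.
conv_output = valid_correlation(image, weights)
print(conv_output.shape, np.isclose(conv_output[0, 0], fc_output))
```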

CONVOLUTIONS? WHY NOT JUST MLP?

Let's look at the filters:

They are local.

Most values are at the mean (non-informative).

Wasted computation and memory!

Also, lots of parameters and little data -> overfitting.

http://www.cs.toronto.edu/~ranzato/research/projects.html

CONVOLUTIONS? WHY NOT JUST MLP?

Krizhevsky et al., 2012. conv1 filters, 11x11x3. Much less wasted space!
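To put a rough number on "much less wasted space", here is a small arithmetic sketch. It assumes a 227x227x3 input and 96 output units or filters (the filter count is an assumption taken from the usual AlexNet description, not from this slide) and counts weights only, ignoring biases.

```python
def fc_weight_count(height, width, channels, units):
    """Every unit is connected to every input value."""
    return height * width * channels * units


def conv_weight_count(kernel_h, kernel_w, channels, filters):
    """Every filter is shared across all spatial positions."""
    return kernel_h * kernel_w * channels * filters


fc = fc_weight_count(227, 227, 3, 96)     # fully connected first layer
conv = conv_weight_count(11, 11, 3, 96)   # conv1 with 11x11x3 filters
print(f"fully connected: {fc:,} weights")
print(f"convolutional:   {conv:,} weights")
print(f"ratio:           about {fc // conv}x")
```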


DO WE NEED MATH?

If yes, go to the whiteboard


POOLING

http://cs231n.github.io/convolutional-networks/


MAX POOLING

http://cs231n.github.io/convolutional-networks/
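As an illustration of max pooling, here is a minimal NumPy sketch for the common 2x2 window with stride 2 (the window size and the toy input are assumptions; the slide itself only shows a figure). Each non-overlapping 2x2 block is replaced by its largest value, which halves both spatial dimensions.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a single-channel (H, W) map."""
    h, w = feature_map.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # blocks[i, a, j, b] == feature_map[2 * i + a, 2 * j + b]
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


x = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]])
pooled = max_pool_2x2(x)
print(pooled)
```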

TYPICAL CNN STRUCTURE (LENET-5)

http://eblearn.sourceforge.net/lib/exe/lenet5.png

• (Conv-ReLU-Pool)xN-Softmax. Simple.

• (Conv-ReLU)xN-Pool-(Conv-ReLU)x2N-Pool-…-Softmax. Popular.

• Some Inception architecture. Have fun :)


NON-LINEARITIES


NON-LINEARITIES BENCHMARK

https://github.com/ducha-aiki/caffenet-benchmark/blob/master/Activations.md


NON-LINEARITIES

AGENDA

Why deep learning (DL)? Some applications and motivations

What is the core idea behind DL?

Basics of convolutional networks (CNN)

Practical recommendations for CNN-based image classification. State-of-the-art approaches

Deep Learning libraries overview

How to apply CNNs to different tasks

EC2 hands-on experience with the Cats-vs-Dogs competition. Homework

IMAGE PREPROCESSING

Subtract the mean pixel (computed on the training set), divide by the std.

RGB is the best colorspace for CNNs.

Do nothing more…

…unless you have a specific dataset.

Subtract the local mean pixel: B. Graham, 2015. Kaggle Diabetic Retinopathy Competition report
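The default recipe (subtract the training-set mean pixel, divide by the std) might look like the sketch below. The array layout (N, H, W, 3), the random stand-in data, and the function names are assumptions made for illustration. The point to notice is that the statistics come from the training set only and are reused unchanged for any other data.

```python
import numpy as np


def fit_pixel_stats(train_images):
    """Per-channel mean and std over all pixels of the training set."""
    mean = train_images.mean(axis=(0, 1, 2))
    std = train_images.std(axis=(0, 1, 2))
    return mean, std


def normalize(images, mean, std, eps=1e-8):
    """Subtract the mean pixel and divide by the std (broadcast over channels)."""
    return (images - mean) / (std + eps)


rng = np.random.default_rng(0)
train = rng.uniform(0, 255, size=(16, 32, 32, 3))   # stand-in training images
test = rng.uniform(0, 255, size=(4, 32, 32, 3))     # stand-in test images

mean, std = fit_pixel_stats(train)
train_norm = normalize(train, mean, std)
test_norm = normalize(test, mean, std)   # training statistics, not test ones
```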

TRAINING. SOLVERS AND REGULARIZATION

Use SGD with momentum.

Try learning rates 0.01, 0.005, 0.001.

Momentum values: 0, 0.5, 0.9, 0.95.

Try L2 weight decay 0.0005, 0.0001. It prevents overconfidence.

Fancy solvers (Adam, RMSProp, AdaDelta) sometimes work better, sometimes not.
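For reference, one step of SGD with momentum and L2 weight decay can be sketched as follows. This is one common formulation (velocity accumulates the scaled gradient, weight decay is added to the gradient); the toy quadratic objective and all names are assumptions for illustration, and the default hyperparameters are taken from the ranges listed above.

```python
import numpy as np


def sgd_momentum_step(weights, grad, velocity,
                      lr=0.01, momentum=0.9, weight_decay=0.0005):
    """One update step; returns the new (weights, velocity)."""
    # L2 weight decay contributes weight_decay * w to the gradient.
    total_grad = grad + weight_decay * weights
    velocity = momentum * velocity - lr * total_grad
    return weights + velocity, velocity


# Toy problem (hypothetical): minimize 0.5 * ||w - target||^2.
target = np.array([1.0, -2.0, 3.0])
w = np.zeros(3)
v = np.zeros(3)
for _ in range(500):
    grad = w - target          # gradient of the toy data loss
    w, v = sgd_momentum_step(w, grad, v)

print(w)   # slightly shrunk towards zero by the weight decay
```

With weight decay the minimum moves from `target` to `target / (1 + weight_decay)`, which is the shrinking effect that keeps weights (and therefore confidences) from growing without bound.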

ARCHITECTURE

Use filters that are as small as possible.

3x3 + ReLU + 3x3 + ReLU > 5x5 + ReLU.

Exception: the 1st layer. It is too computationally inefficient to use 3x3 there.

Convolutional Neural Networks at Constrained Time Cost. He and Sun, 2015
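One way to see why two 3x3 layers can replace a 5x5 layer is to compare receptive field and weight count. The sketch below assumes stride-1 convolutions and 64 input and output channels (a hypothetical width). The stack sees the same 5x5 neighborhood with fewer weights and one extra non-linearity.

```python
def conv_weight_count(kernel, channels_in, channels_out):
    """Weights of one kernel x kernel convolution layer (biases ignored)."""
    return kernel * kernel * channels_in * channels_out


def stacked_receptive_field(kernels):
    """Receptive field of a stack of stride-1 convolutions."""
    field = 1
    for k in kernels:
        field += k - 1
    return field


channels = 64  # hypothetical layer width
two_3x3 = 2 * conv_weight_count(3, channels, channels)
one_5x5 = conv_weight_count(5, channels, channels)

print("receptive field:", stacked_receptive_field([3, 3]), "vs", stacked_receptive_field([5]))
print("weights:", two_3x3, "vs", one_5x5)
```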

WEIGHTS INITIALIZATION

Preserve var=1 of the output of every layer.

How?

There are lots of papers with variants.

Mishkin and Matas. All you need is a good init. ICLR, 2016

WEIGHTS INITIALIZATION

Gaussian noise with some coefficient:

Xavier:

He (0.5 * Xavier for ReLU)

Orthonormal (Saxe et al., 2013)

Data-dependent: LSUV

Mishkin and Matas. All you need is a good init. ICLR, 2016
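The data-dependent idea can be sketched for a plain fully connected ReLU network: start from orthonormal weights, push a batch of data through, and rescale each layer until the variance of its output is close to 1. This is a simplified illustration of the LSUV idea, not the reference implementation; the layer sizes, tolerance, and random input batch are assumptions.

```python
import numpy as np


def orthonormal(rows, cols, rng):
    """Matrix with orthonormal columns (or rows), obtained from a QR decomposition."""
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, _ = np.linalg.qr(a)            # q has orthonormal columns
    return q if rows >= cols else q.T


def lsuv_like_init(layer_sizes, batch, rng, tol=0.01, max_iter=10):
    """Scale each layer so that its pre-activation output has variance ~1 on `batch`."""
    weights = []
    activations = batch
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = orthonormal(n_in, n_out, rng)
        for _ in range(max_iter):
            var = (activations @ w).var()
            if abs(var - 1.0) < tol:
                break
            w = w / np.sqrt(var)      # rescale towards unit output variance
        weights.append(w)
        activations = np.maximum(activations @ w, 0.0)   # ReLU
    return weights


rng = np.random.default_rng(0)
batch = rng.standard_normal((256, 64))            # stand-in input data
weights = lsuv_like_init([64, 128, 128, 32], batch, rng)
```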

BATCH NORMALIZATION

Ioffe and Szegedy, 2015
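As a reminder of what the layer computes at training time, here is a minimal sketch of the forward pass for a (batch, features) input: normalize each feature over the mini-batch, then apply a learned scale and shift. The names `gamma`, `beta`, and `eps` follow common usage and the input data is made up. At inference time, running averages of the statistics are typically used instead of the batch statistics; that part is omitted here.

```python
import numpy as np


def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization for x of shape (batch, features)."""
    mean = x.mean(axis=0)                      # per-feature batch mean
    var = x.var(axis=0)                        # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)    # zero mean, unit variance
    return gamma * x_hat + beta                # learned scale and shift


rng = np.random.default_rng(0)
x = 5.0 * rng.standard_normal((128, 4)) + 3.0  # badly scaled activations
bn_out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
```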



DROPOUT

Play with rates. 0.5 is rarely the optimal choice (but often a good one).

Dropout_rate * width = constant – doesn't work!
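For experimenting with different rates, a dropout layer can be sketched in a few lines. This is the "inverted" variant, which is an assumption about the implementation rather than something stated on the slides: units are zeroed with probability `rate` during training and the survivors are rescaled, so nothing needs to change at test time.

```python
import numpy as np


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: drop units with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
x = np.ones((1000, 100))
dropped = dropout(x, rate=0.5, rng=rng)
print(dropped.mean())   # stays close to the original mean of 1.0
```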

DATA AUGMENTATION

Common (helps in 99% of cases):

Random crop, e.g., 227x227 from 256x256 px (AlexNet)

Horizontal mirror

Dataset-dependent:

Random rotation

Affine transform

Random scale

Color augmentation

Input noise

Thin plate deformation

Unleash your imagination
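The two common augmentations from the list above, random crop and horizontal mirror, can be sketched as follows. The (H, W, C) array layout, the 50% mirror probability, and the function name are assumptions for illustration.

```python
import numpy as np


def random_crop_and_mirror(image, crop_size, rng):
    """Random crop_size x crop_size crop of an (H, W, C) image, mirrored half of the time."""
    height, width = image.shape[:2]
    top = rng.integers(0, height - crop_size + 1)    # upper bound is exclusive
    left = rng.integers(0, width - crop_size + 1)
    crop = image[top:top + crop_size, left:left + crop_size]
    if rng.random() < 0.5:
        crop = crop[:, ::-1]                         # reverse the width axis
    return crop


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 256, 3))     # stand-in 256x256 RGB image
augmented = random_crop_and_mirror(image, crop_size=227, rng=rng)
print(augmented.shape)
```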

PADDING. VALID AND SAME CONVOLUTION

http://www.johnloomis.org/ece563/notes/filter/conv/convolution.html

Same = padding with zeros by ½ of the kernel size on each side.

The most common choice.
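The effect of the two modes on the spatial size follows from the usual output-size formula. The sketch below assumes odd kernel sizes, so that "½ of the kernel size" means `kernel // 2` pixels on each side; the 32-pixel input is a made-up example.

```python
def conv_output_size(input_size, kernel, pad=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * pad - kernel) // stride + 1


valid_size = conv_output_size(32, kernel=5, pad=0)        # "valid": no padding
same_size = conv_output_size(32, kernel=5, pad=5 // 2)    # "same": pad by kernel // 2
print("valid:", valid_size, "same:", same_size)
```

With "valid" convolution every layer shrinks the feature map by `kernel - 1` pixels, while "same" padding keeps the size unchanged at stride 1.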

PADDING

Padding:

Preserves the spatial size, without “washing out” information

Dropout-like augmentation by zeros

Caffenet128

with conv padding: 47% top-1 acc.

w/o conv padding: 41% top-1 acc.

It is a huge difference.

SUMMARY FROM CS231N

AGENDA

Why deep learning (DL)? Some applications and motivations

What is the core idea behind DL?

Basics of convolutional networks (CNN)

Practical recommendations for CNN-based image classification. State-of-the-art approaches

Deep Learning libraries overview. Why Caffe.

How to apply CNNs to different tasks

EC2 hands-on experience with the Cats-vs-Dogs competition. Homework


DEEP LEARNING TOOLBOXES

Caffe

Torch

Theano

TensorFlow

MXNet

Nervana

DeepLearning4j

ConvnetJS

CNTK

Veles

H2O...sorry, guys :(


MAIN DEEP LEARNING TOOLBOXES

SPEED BENCHMARK ALEXNET

Library                   Class                      Time (ms)  forward (ms)  backward (ms)
CuDNN[R4]-fp16 (Torch)    cudnn.SpatialConvolution          71            25             46
Nervana-neon-fp16         ConvLayer                         78            25             52
CuDNN[R4]-fp32 (Torch)    cudnn.SpatialConvolution          81            27             53
Nervana-neon-fp32         ConvLayer                         87            28             58
fbfft (Torch)             fbnn.SpatialConvolution          104            31             72
TensorFlow                conv2d                           151            34            117
Chainer                   Convolution2D                    177            40            136
cudaconvnet2*             ConvLayer                        177            42            135
CuDNN[R2]*                cudnn.SpatialConvolution         231            70            161
Caffe (native)            ConvolutionLayer                 324           121            203
Torch-7 (native)          SpatialConvolutionMM             342           132            210
CL-nn (Torch)             SpatialConvolutionMM             963           388            574
Caffe-CLGreenTea          ConvolutionLayer                1442           210           1232

https://github.com/soumith/convnet-benchmarks

SPEED BENCHMARK. GOOGLENET

Library                   Class                      Time (ms)  forward (ms)  backward (ms)
Nervana-neon-fp16         ConvLayer                        230            72            157
Nervana-neon-fp32         ConvLayer                        270            84            186
CuDNN[R4]-fp16 (Torch)    cudnn.SpatialConvolution         462           112            349
CuDNN[R4]-fp32 (Torch)    cudnn.SpatialConvolution         470           130            340
Chainer                   Convolution2D                    687           189            497
TensorFlow                conv2d                           905           187            718
Caffe                     ConvolutionLayer                1935           786           1148
CL-nn (Torch)             SpatialConvolutionMM            7016          3027           3988
Caffe-CLGreenTea          ConvolutionLayer                9462           746           8716

https://github.com/soumith/convnet-benchmarks


CAFFE


AGENDA

Why deep learning (DL)? Some applications and motivations

What is the core idea behind DL?

Basics of convolutional networks (CNN)

Practical recommendations for CNN-based image classification. State-of-the-art approaches

Deep Learning libraries overview

How to apply CNNs to various tasks

EC2 hands-on experience with the Cats-vs-Dogs competition. Homework.

HOW TO DO IT? LET'S GO TO THE WHITEBOARD

Image retrieval: Babenko et al., 2014

Person identification: Chopra et al., 2006

Ranking: Wang et al., 2014

Playing games: Atari (2013), Go (2016)

Text generation: https://github.com/karpathy/char-rnn

Image generation: Radford et al., 2016

Action recognition: Simonyan et al., 2014

Anomaly detection: https://www.youtube.com/watch?v=ds73ULGjnpc&feature=youtu.be

Translation: Cho et al., 2014

Fraud detection at PayPal: http://university.h2o.ai/cds-lp/cds02.html

AGENDA

Why deep learning (DL)? Some applications and motivations

What is the core idea behind DL?

Basics of convolutional networks (CNN)

Practical recommendations for CNN-based image classification. State-of-the-art approaches

Deep Learning libraries overview

How to apply CNNs to different tasks

EC2 hands-on experience with the Cats-vs-Dogs competition. Homework.

IMAGE RETRIEVAL

Figure from Babenko et al., 2014

1. Pass the image through an ImageNet-pretrained CNN.

2. Use the activations of some layer as the descriptor.

3. L2-normalize (a must!)

4. Put the descriptors into some fast nearest-neighbor (NN) search structure, like a kd-tree.
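Steps 3 and 4 can be sketched without a real network. In the example below, random vectors stand in for the CNN activations (an assumption, they are not real features), and the kd-tree is swapped for a brute-force search to keep the code dependency-free. For L2-normalized vectors, the largest dot product corresponds to the smallest L2 distance, which is one reason the normalization step matters.

```python
import numpy as np


def l2_normalize(descriptors, eps=1e-12):
    """Scale every row to unit L2 norm."""
    norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
    return descriptors / np.maximum(norms, eps)


def nearest_neighbors(query, database, k=3):
    """Brute-force search: indices of the k most similar database rows."""
    similarities = database @ query
    return np.argsort(-similarities)[:k]


rng = np.random.default_rng(0)
# Stand-in for layer activations of 100 database images (hypothetical data).
database = l2_normalize(rng.standard_normal((100, 256)))
# Query: a slightly perturbed copy of database image 42.
query = l2_normalize(database[42:43] + 0.01 * rng.standard_normal((1, 256)))[0]

top = nearest_neighbors(query, database)
print(top)
```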

EMBEDDINGS WITH SIAMESE NETWORKS

1. Put 2 images through copies of the same network.

2. L2 distance < 1 if same person, > 1 if different.

https://www.cs.nyu.edu/~sumit/research/research.html

WHAT ABOUT 3 COPIES? TRIPLETS

http://arxiv.org/abs/1412.6622

1. Put 3 images (x, x+, x-) through copies of the same network.

2. D(x, x+) < D(x, x-)

Drawbacks: 1) Slow training :(

2) You have to select hard triplets. Random ones easily satisfy the inequality above.
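The inequality is often enforced with a hinge-style triplet loss, sketched below. This is a widely used formulation and not necessarily the loss of the linked paper; the margin value and the toy embeddings are assumptions. The sketch also shows why hard triplets matter: a triplet that already satisfies the inequality with a margin has zero loss and contributes nothing to training.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Per-triplet hinge loss: max(0, D(x, x+) - D(x, x-) + margin)."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin)


def hard_triplet_mask(losses):
    """Triplets that still violate the margin, i.e. the ones worth training on."""
    return losses > 0.0


anchor = np.array([[0.0, 0.0], [0.0, 0.0]])
positive = np.array([[1.0, 0.0], [1.0, 0.0]])
negative = np.array([[3.0, 0.0], [1.1, 0.0]])   # easy negative, hard negative

losses = triplet_loss(anchor, positive, negative)
print(losses, hard_triplet_mask(losses))
```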

GENERATING IMAGES WITH GANS

http://soumith.ch/eyescream/

http://arxiv.org/abs/1511.06434

• The generator tries to generate images indistinguishable from natural ones.

• The discriminator tries to distinguish them.

• Both learn simultaneously.


AUTO-ENCODERS

http://deeplearning4j.org/deepautoencoder.html

DE-NOISING AUTO-ENCODERS

Clean input. Corrupted input (what the net sees). Reconstruction.

If we compare the input with its reconstruction, we can detect anomalies.

http://www.cs.toronto.edu/~ranzato/research/projects.html
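The anomaly-detection idea can be illustrated with a tiny stand-in model. The sketch below uses a linear auto-encoder obtained from PCA instead of a trained de-noising network (a deliberate simplification), and synthetic data that lies near a 2-dimensional subspace (also an assumption). Samples that the model reconstructs poorly, compared with the worst error seen on normal data, are flagged as anomalies.

```python
import numpy as np


def fit_linear_autoencoder(data, code_size):
    """PCA as a stand-in for a trained auto-encoder (illustration only)."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:code_size]        # principal directions, (code_size, dim)


def reconstruction_error(samples, mean, components):
    codes = (samples - mean) @ components.T          # encode
    reconstructed = codes @ components + mean        # decode
    return np.linalg.norm(samples - reconstructed, axis=1)


rng = np.random.default_rng(0)
basis = rng.standard_normal((2, 20))
# "Normal" data: points near a 2-D subspace of a 20-D space, plus small noise.
normal_data = rng.standard_normal((500, 2)) @ basis + 0.05 * rng.standard_normal((500, 20))
mean, components = fit_linear_autoencoder(normal_data, code_size=2)

# Threshold: the largest reconstruction error observed on normal data.
threshold = reconstruction_error(normal_data, mean, components).max()

anomaly = 3.0 * rng.standard_normal((1, 20))         # far from the learned subspace
is_anomaly = reconstruction_error(anomaly, mean, components)[0] > threshold
print(is_anomaly)
```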


QUESTIONS?