An introduction to Deep Learning


Page 1: An introduction to Deep Learning

An introduction to Deep Learning

Page 2: An introduction to Deep Learning

Who am I?

• David Rostcheck

• I am a data science consultant

• Follow my articles on LinkedIn

Page 3: An introduction to Deep Learning

DEEP LEARNING

Page 4: An introduction to Deep Learning

in some tests, Deep Learning has already shown abilities at the same level as humans

Page 5: An introduction to Deep Learning

These include:

• computers that understand natural language

• autonomous vehicles

• programs that can identify what is occurring in a video

Page 6: An introduction to Deep Learning

It’s notable that

these solutions to diverse problems

in very different fields

use the same powerful technology

Page 7: An introduction to Deep Learning

NEURAL NET

Page 8: An introduction to Deep Learning

a neural net is a

simulation

of the brain,

a mathematical abstraction

Page 9: An introduction to Deep Learning

in the real brain,

the neurons send signals with frequencies,

not discrete signals

Page 10: An introduction to Deep Learning

tools exist that try to simulate the brain in a way that’s

more accurate

to the real brain

Page 11: An introduction to Deep Learning

Example: Numenta NuPIC, a type of Hierarchical Temporal Memory (HTM)

Page 12: An introduction to Deep Learning

but the techniques of neural nets

are sufficient

to deliver results

similar or better than humans

in specific cognitive tests

Page 13: An introduction to Deep Learning

therefore:

Deep Learning

what is it?

Page 14: An introduction to Deep Learning

common point of view:

a neural net with distinct levels

is correct, but…

Page 15: An introduction to Deep Learning

there is another point of view,

maybe more useful,

that we are going to present here

Page 16: An introduction to Deep Learning

it comes from Vincent Vanhoucke, Principal Research Scientist at Google.

the following comes from

his course on Deep

Learning, on Udacity

Page 17: An introduction to Deep Learning

He thinks about Deep Learning as

a framework for calculating

linear and almost linear

equations in an efficient way

Page 18: An introduction to Deep Learning

to develop this framework,

we are going to construct a

classifier

the simplest (and worst)

possible

Page 19: An introduction to Deep Learning

but wait a minute…

why

a classifier?

Page 20: An introduction to Deep Learning

Because classification (or more generally prediction) is a central technique in Machine Learning

with this, we can achieve ranking, regression, detection, reinforcement learning, and more…

Page 21: An introduction to Deep Learning

we start with a linear equation, in vector form…

Page 22: An introduction to Deep Learning

Think about constructing a simple classifier to predict, for each occurrence of X, which is:

Page 23: An introduction to Deep Learning

to do this, we must learn the values of W and b
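As a sketch (in NumPy, with made-up numbers for W, b, and x; in practice W and b are learned from training data), the classifier's output is just a matrix multiply plus a bias:

```python
import numpy as np

# Hypothetical, made-up parameters: 3 classes, 4 input features.
W = np.array([[0.2, -0.5, 0.1,  2.0],
              [1.5,  1.3, 2.1,  0.0],
              [0.0,  0.3, 0.2, -0.3]])
b = np.array([1.1, 3.2, -1.2])

x = np.array([-15.0, 22.0, -44.0, 56.0])  # one input example

scores = W @ x + b  # y = Wx + b: one raw score per class
```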

Page 24: An introduction to Deep Learning

Does it work well?

Page 25: An introduction to Deep Learning

No.

It’s the worst.

Page 26: An introduction to Deep Learning

Why?

Page 27: An introduction to Deep Learning

there are two problems…

Page 28: An introduction to Deep Learning

No. 1:

it gives values,

and what we want

are probabilities

Page 29: An introduction to Deep Learning

we can fix it with the “softmax” function:
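A minimal softmax sketch in NumPy (subtracting the max is a standard numerical-stability trick, not something the slides mention):

```python
import numpy as np

def softmax(scores):
    # subtracting the max keeps exp() from overflowing; it does not
    # change the result because softmax is shift-invariant
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))  # scores in, probabilities out
```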

Page 30: An introduction to Deep Learning

we express the correct labels as a vector with 1 for the correct class and 0 for the others.

we call this “one-hot encoding”
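One-hot encoding is a one-liner; here is a sketch with a hypothetical 3-class problem:

```python
import numpy as np

num_classes = 3
label = 1  # hypothetical correct class index

one_hot = np.zeros(num_classes)
one_hot[label] = 1.0  # one_hot is now [0., 1., 0.]
```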

Page 31: An introduction to Deep Learning

to evaluate errors, we compare the probabilities with the correct values

Page 32: An introduction to Deep Learning

using what we call “cross-entropy”
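A sketch of the cross-entropy between the predicted probabilities S and the one-hot labels L (the example numbers are made up):

```python
import numpy as np

def cross_entropy(probs, one_hot):
    # D(S, L) = -sum_i L_i * log(S_i); with one-hot labels, only the
    # correct class's predicted probability contributes
    return -np.sum(one_hot * np.log(probs))

probs = np.array([0.7, 0.2, 0.1])    # predicted probabilities (made up)
one_hot = np.array([1.0, 0.0, 0.0])  # the correct class is class 0
loss = cross_entropy(probs, one_hot)
```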

Page 33: An introduction to Deep Learning

better, but…

there remains the second problem:

our equation is linear

and doesn’t represent non-linear equations well

Page 34: An introduction to Deep Learning
Page 35: An introduction to Deep Learning

this problem killed the perceptron (single level neural net)

Page 36: An introduction to Deep Learning

it doesn’t help to just add levels to the network

because any combination of linear operations can be represented as a single linear operation – we can reduce the new network to another WX + b, with the same problem
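This collapse is easy to verify numerically; the sketch below (with arbitrary made-up weights) shows two stacked linear layers reducing to a single W and b:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary made-up weights
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

# two stacked linear layers...
two_layer = W2 @ (W1 @ x + b1) + b2

# ...are exactly one linear layer with W = W2·W1 and b = W2·b1 + b2
W = W2 @ W1
b = W2 @ b1 + b2
one_layer = W @ x + b
```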

Page 37: An introduction to Deep Learning

What do we do?

Page 38: An introduction to Deep Learning

without another option,

we have to introduce non-linear

functions

for example, the logistic function

Page 39: An introduction to Deep Learning

but it’s expensive to calculate – we can use a simplified approximation called a “Rectified Linear Unit”, or ReLU
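A ReLU sketch: it is just an element-wise max(0, x), far cheaper than the exponential in the logistic function:

```python
import numpy as np

def relu(x):
    # ReLU(x) = max(0, x), element-wise: zero below 0, identity above
    return np.maximum(0.0, x)

out = relu(np.array([-2.0, -0.5, 0.0, 1.5]))  # → [0., 0., 0., 1.5]
```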

Page 40: An introduction to Deep Learning

now we can construct our neural net, in a way that’s efficient to calculate

Page 41: An introduction to Deep Learning

we can express this in a modular way, as a series of linear or almost-linear operations with matrices ... that allows us to use the power of a GPU
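As a sketch, a small two-layer net is exactly such a chain of matrix modules (the weights here are random and untrained, so the output probabilities are meaningless; the sizes are made up):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(scores):
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

rng = np.random.default_rng(1)  # untrained, made-up weights
W1, b1 = 0.1 * rng.normal(size=(16, 8)), np.zeros(16)
W2, b2 = 0.1 * rng.normal(size=(3, 16)), np.zeros(3)

x = rng.normal(size=8)             # one 8-feature input
hidden = relu(W1 @ x + b1)         # linear module -> non-linear module
probs = softmax(W2 @ hidden + b2)  # linear module -> softmax
```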

Page 42: An introduction to Deep Learning

this is good, but we are still lacking something…

to improve our estimation, we must minimize the error,

and this requires us to calculate the derivative of the function

Page 43: An introduction to Deep Learning

think about the chain rule of calculus:

d/dx f(u(x)) = df/du · du/dx

Page 44: An introduction to Deep Learning

that can convert a derivative into a product (of other derivatives):
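This product form can be checked numerically; the sketch below uses hypothetical modules g(x) = sin(x) feeding f(u) = u², comparing the chain-rule product against a central-difference estimate:

```python
import numpy as np

# hypothetical modules: g(x) = sin(x) feeding f(u) = u**2
g, dg = np.sin, np.cos
f = lambda u: u ** 2
df = lambda u: 2 * u

x = 0.7
analytic = df(g(x)) * dg(x)  # chain rule: product of per-module derivatives

h = 1e-6                     # central-difference check of the same derivative
numeric = (f(g(x + h)) - f(g(x - h))) / (2 * h)
```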

Page 45: An introduction to Deep Learning

that fits in our modular framework

Page 46: An introduction to Deep Learning

now we have it! a general, modular framework that incorporates everything we need!

Page 47: An introduction to Deep Learning

and we can construct deep neural nets, adding more levels as we need them

…but wait a minute:

why do we like deep networks?

Page 48: An introduction to Deep Learning

the most interesting problems,

like language and vision,

have very complex rules

we need a lot of parameters to represent them

Page 49: An introduction to Deep Learning

yes, but why don’t we use wider networks?

why is it better to have deep ones?

Page 50: An introduction to Deep Learning

deep networks are more efficient and better capture the structure inherent in many problems

Page 51: An introduction to Deep Learning

CONVNETS

Page 52: An introduction to Deep Learning

the convolutional network, or convnet,

transforms the input

so that the translation

of the input does not matter

we use it for visual recognition

Page 53: An introduction to Deep Learning

Let’s start with a photo:

Page 54: An introduction to Deep Learning

We use a region (kernel) of the photo as the input to another small neural net, with K outputs

Page 55: An introduction to Deep Learning

we slide the window across the photo

Page 56: An introduction to Deep Learning

this transforms the photo into a new one, with K channels and different dimensions
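A naive sliding-window sketch of this operation (the 5×5 "photo" and the two kernels are made up; real implementations use optimized libraries, not Python loops):

```python
import numpy as np

def conv2d_valid(image, kernels):
    # slide each kernel over the image (stride 1, no padding);
    # one output channel per kernel
    H, W = image.shape
    K, kh, kw = kernels.shape
    out = np.zeros((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(image[i:i + kh, j:j + kw] * kernels[k])
    return out

image = np.arange(25, dtype=float).reshape(5, 5)  # made-up 5x5 "photo"
kernels = np.stack([np.ones((3, 3)) / 9,          # 3x3 averaging filter
                    np.zeros((3, 3))])            # null filter
maps = conv2d_valid(image, kernels)               # K=2 channels, smaller dims
```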

Page 57: An introduction to Deep Learning

this operation is called

a convolution

Page 58: An introduction to Deep Learning

if the region (the “kernel”) has

the same size as the original,

what do we obtain?

Page 59: An introduction to Deep Learning

in this case,

we recover an ordinary, fully connected neural net layer

Page 60: An introduction to Deep Learning

Questions?

Contact: [email protected]
Twitter: @davidrostcheck
Articles: http://linkedin.com/in/davidrostcheck