Deep Learning in a Nutshell
Stefan Carlsson, Computer Vision Group, KTH
Machine Learning
x = data ---> y = label
y = f(x)
For example: y = ax + b
y = ax + b

[Figure: scatter plot of data points (x, y) with the fitted line y = ax + b]
min_{a,b} Σ_i (yi − a·xi − b)²

[Figure: scatter plot with fitted line; the residuals yi − a·xi − b are measured between each point and the line]
min_{a,b} Σ_{i=1..2} (yi − a·xi − b)²

Use training data only, to find a and b

[Figure: scatter plot with two labeled training points (x1, y1) and (x2, y2) marked "o"; the remaining points are unlabeled]
Supervised learning
1. Data set to be classified: x1 . . . xn
2. Select subset for training: x_i1 . . . x_ik
3. Label training subset: y_i1 . . . y_ik
4. Find model parameters: a, b

min_{a,b} Σ_{i=i1..ik} (yi − a·xi − b)²
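The four steps above can be sketched with the closed-form least-squares solution. The training numbers below are illustrative assumptions, not data from the talk.

```python
# Least-squares line fit: find a, b minimizing sum_i (y_i - a*x_i - b)^2.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed form: a = cov(x, y) / var(x), b = mean_y - a * mean_x
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

# Steps 2-3: a small labeled training subset (illustrative values)
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]   # generated from y = 2x + 1

# Step 4: find the model parameters
a, b = fit_line(train_x, train_y)
print(a, b)   # -> 2.0 1.0
```

The fitted line can then be applied to the rest of the data set, not just the labeled subset.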
Vectors, classification
Data sets are in general vectors: x = (x1 … xm)
Linear classifier: y = Σ_i wi·xi = wᵀx
Non-linearity: y* = f(y)

[Figure: graph of the non-linearity y* = f(y)]
y = w1·x1 + w2·x2 + w3

[Figure: scatter plot in the (x1, x2) plane; the line y = 0 separates the two classes, with y* = 1 on one side and y* = −1 on the other]
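A minimal sketch of the linear classifier plus non-linearity above; the weight values are illustrative assumptions, not from the talk.

```python
# Linear classifier with a sign non-linearity:
#   y  = w1*x1 + w2*x2 + w3   (w3 acts as the bias term)
#   y* = f(y) = +1 if y > 0 else -1
def classify(x1, x2, w=(1.0, -1.0, 0.5)):   # w chosen for illustration
    y = w[0] * x1 + w[1] * x2 + w[2]        # linear part: w^T x
    return 1 if y > 0 else -1               # non-linearity y* = f(y)

print(classify(2.0, 0.0))   # y =  2.5 -> 1
print(classify(0.0, 2.0))   # y = -1.5 -> -1
```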
[Figure: scatter plot in the (x1, x2) plane where no straight line separates the two classes]

Not linearly separable
[Figure: the same data, separated by a curved decision boundary]

Non-linear classifier
[Figure: the same data in the (x1, x2) plane, with axes of a transformed coordinate system overlaid]

or, a non-linear coordinate transformation
y1 = f1(x1, x2)
y2 = f2(x1, x2)

y1 = wa1·x1 + wa2·x2 + wa3
y2 = wb1·x1 + wb2·x2 + wb3

y = wc1·y1* + wc2·y2*

[Figure: scatter plot in the (x1, x2) plane, and the non-linearity y* = f(y) applied to each unit's output]
Hierarchies of linear transformations + non-linearity
y = wc1·y1* + wc2·y2*
y1 = wa1·x1 + wa2·x2 + wa3
y2 = wb1·x1 + wb2·x2 + wb3

[Figure: network diagram with inputs x1, x2, linear units y1, y2, non-linearities y*, and output y]
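The hierarchy above (two linear units, a non-linearity, then a linear combination) can be sketched directly. The weights below are hand-picked assumptions for illustration; with them the net computes XOR, a function no single linear classifier can represent.

```python
# Two-level net: y1, y2 are linear units passed through a step
# non-linearity, and y combines their outputs (y = wc1*y1* + wc2*y2*).
def step(y):                            # non-linearity y* = f(y)
    return 1 if y > 0 else 0

def net(x1, x2):
    y1 = step(1.0 * x1 + 1.0 * x2 - 0.5)   # fires if x1 OR x2
    y2 = step(1.0 * x1 + 1.0 * x2 - 1.5)   # fires if x1 AND x2
    y = 1.0 * y1 - 2.0 * y2                # linear combination
    return step(y)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, net(x1, x2))   # XOR truth table: 0, 1, 1, 0
```

In deep learning these weights are not hand-picked but learned from training data.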
Hierarchical nets
Deep learning
Traditional learning:
1. Feature engineering of raw data
2. Non-linear classifier design
Deep learning:
Learn the parameters of a hierarchical network with raw data as input.
Deep learning makes no distinction between feature selection and classifier design.
Hierarchical Convolutional Nets

[Figure: network diagram — input nodes x feed convolutional layer 1, then convolutional layer 2, then a fully connected layer of y* units]
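The key idea of a convolutional layer — the same small filter applied at every position, so weights are shared rather than learned per connection — can be sketched in one dimension (filter and signal values are illustrative assumptions):

```python
# 1-D convolution: slide one shared kernel across the signal.
# A fully connected layer would instead have a separate weight
# for every (input, output) pair.
def conv1d(signal, kernel):
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

edge = [1.0, -1.0]                # tiny edge-detecting filter
x = [0, 0, 1, 1, 1, 0, 0]
print(conv1d(x, edge))   # -> [0.0, -1.0, 0.0, 0.0, 1.0, 0.0]
```

The output peaks (±1) mark where the signal steps up or down, which is why learned first-level filters often look like edge detectors.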
Image Classification
Imagenet Dataset: 15 million images annotated with content

Training convolutional nets with imagenet data:
● 1.2 million images of 1000 classes
● Convolutional layers + fully connected layers, 1000-class output
● 60 million parameters for training
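The "60 million parameters" figure can be checked with a back-of-envelope count, assuming the published 2012 AlexNet layer shapes (an assumption — the slide does not name the architecture; biases omitted for simplicity):

```python
# Weight counts per layer as (fan-in, fan-out) pairs, taken from the
# published AlexNet architecture (assumed here; grouped conv layers
# have halved input-channel counts).
conv = [
    (11 * 11 * 3,  96),    # conv1
    (5 * 5 * 48,   256),   # conv2 (grouped)
    (3 * 3 * 256,  384),   # conv3
    (3 * 3 * 192,  384),   # conv4 (grouped)
    (3 * 3 * 192,  256),   # conv5 (grouped)
]
fc = [
    (6 * 6 * 256,  4096),  # fc6
    (4096,         4096),  # fc7
    (4096,         1000),  # fc8: 1000-class output
]
total = sum(i * o for i, o in conv + fc)
print(f"{total:,}")   # roughly 61 million
```

Note that almost all of the parameters sit in the fully connected layers; the convolutional layers are cheap because their weights are shared.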
Object detection until 2012

Previously: feature engineering + a non-linear classifier,
training a classifier for each class using negative examples
A revolution in computer vision
[Figure: learned filters — 1st-level filters respond to input-image features at early nodes; 2nd-level generic "filters" respond to input-image features at higher-level nodes]
Deep learning success
● image recognition
● speech recognition
● automatic translation
● natural language understanding
● bioinformatics
IBM acquires AlchemyAPI to power up Watson’s deep learning skills
Bioinformatics
Deep Learning as a generic tool

[Diagram: input → complex system → output]
[Diagram: input → human knowledge → output]
[Diagram: human knowledge accessed through natural language understanding]
Deep Learning as a generic tool
● Lab tests
● Wearable monitoring
● Dialogue
● Personal genome

[Diagram: these inputs are mapped, via medical knowledge, to a medical state]
Summary
● Deep learning in hierarchical nets is a powerful way of finding mappings between raw input data and labeled output data that generalize beyond the datasets used for training
● It is able to identify the structure and regularity of complex mappings in very different domains of application
● The datasets used for training are the crucial component; the software is generic