1. Learn to Build an App to Find Similar Images using Deep
Learning Piotr Teterwak Dato, Machine Learning Engineer
2. 2 First things first, installation http://bit.ly/dato_pydata
import graphlab sf = graphlab.Sframe() sa =
graphlab.Sarray(range(100)).apply(lambda x: x)
3. 3 Who is Dato?
4. Graphlab Create: Production ML Pipeline DATA
YourWebServiceor IntelligentApp ML Algorithm Data cleaning &
feature eng Offline eval & Parameter search Deploy model Data
engineering Data intelligence Deployment Goal: Platform to help
implement, manage, optimize entire pipeline
5. Deep Learning
6. Todays talk
7. Todays tutorial Deep Learning Overview Training a model on
MNIST Building a Similar Dress Recommender Deploying the Dress
Recommender
8. Features are key to machine learning
9. 9 Simple example: Spam filtering A user opens an email -
Will she thinks its spam? Whats the probability email is spam? Text
of email User info Source info Input: x MODEL Yes! No Output:
Probability of y
10. 10 Feature engineering: the painful black art of
transforming raw inputs into useful inputs for ML algorithm E.g.,
important words, complex transformation of input, MODEL Yes! No
Output: Probability of y Feature extraction Features: (x) Text of
email User info Source info Input: x
11. Deep Learning for Learning Features
12. 12 Linear classifiers Most common classifier - Logistic
regression - SVMs - Decision correspond to hyperplane: - Line in
high dimensional space w0 + w1 x1 + w2 x2 > 0 w0 + w1 x1 + w2 x2
< 0
13. 13 What can a simple linear classifier represent? AND 0 0 1
1
14. 14 What can a simple linear classifier represent? OR 0 0 1
1
15. 15 What cant a simple linear classifier represent? XOR 0 0
1 1 Need non-linear features
16. 16 Non-linear feature embedding 0 0 1 1
17. 17 Graph representation of classifier: Useful for defining
neural networks x 1 x 2 x d y 1 w2 w0 + w1 x1 + w2 x2 + + wd xd
> 0, output 1 < 0, output 0 Input Output
18. 18 What can a linear classifier represent? x1 OR x2 x1 AND
x2 x 1 x 2 1 y -0.5 1 1 x 1 x 2 1 y -1.5 1 1
19. Solving the XOR problem: Adding a layer XOR = x1 AND NOT x2
OR NOT x1 AND x2 z 1 -0.5 1 -1 z1 z2 z 2 -0.5 -1 1 x 1 x 2 1 y 1
-0.5 1 1 Thresholded to 0 or 1
20. 20
http://deeplearning.stanford.edu/wiki/images/4/40/Network3322.png
Deep Neural Networks P(cat|x) P(dog|x)
21. 21 Deep Neural Networks Can model any function with enough
hidden units. This is tremendously powerful: given enough units, it
is possible to train a neural network to solve arbitrarily
difficult problems. But also very difficult to train, too many
parameters means too much memory+computation time.
22. 22 Neural Nets and GPUs Many operations in Neural Net
training can happen in parallel Reduces to matrix operations, many
of which can be easily parallelized on a GPU.
34. 35 Image features Features = local detectors - Combined to
make prediction - (in reality, features are more low-level) Face!
Eye Eye Nose Mouth
35. 36 Standard image classification approach Input
Computer$vision$features$ SIFT$ Spin$image$ HoG$ RIFT$ Textons$
GLOH$ Slide$Credit:$Honglak$Lee$ Extract features Use simple
classifier e.g., logistic regression, SVMs Face
36. 37 Many hand crafted features exist
Computer$vision$features$ SIFT$ Spin$image$ HoG$ RIFT$ Textons$
GLOH$ Slide$Credit:$Honglak$Lee$ but very painful to design
37. 38 Change image classification approach? Input
Computer$vision$features$ SIFT$ Spin$image$ HoG$ RIFT$ Textons$
GLOH$ Slide$Credit:$Honglak$Lee$ Extract features Use simple
classifier e.g., logistic regression, SVMs FaceCan we learn
features from data?
38. 39 Use neural network to learn features Input Learned
hierarchy Output Lee et al. Convolutional Deep Belief Networks for
Scalable Unsupervised Learning of Hierarchical Representations ICML
2009
39. Sample results Traffic sign recognition (GTSRB) - 99.2%
accuracy House number recognition (Google) - 94.3% accuracy 40
40. Krizhevsky et al. 12: 60M parameters, won 2012 ImageNet
competition 41
42. 43 Application to scene parsing Carlos Guestrin 2005-2014 Y
LeCun MA Ranzato Semantic Labeling: Labeling every pixel with the
object it belongs to [ Farabet et al. ICML 2012, PAMI 2013] Would
help identify obstacles, targets, landing sites, dangerous areas
Would help line up depth map with edge maps
43. Challenges of deep learning
44. Deep learning score card Pros Enables learning of features
rather than hand tuning Impressive performance gains on - Computer
vision - Speech recognition - Some text analysis Potential for much
more impact Cons
45. Deep learning workflow Lots of labeled data Training set
Validation set 80% 20% Learn deep neural net model Validate
46. Deep learning score card Pros Enables learning of features
rather than hand tuning Impressive performance gains on - Computer
vision - Speech recognition - Some text analysis Potential for much
more impact Cons Computationally really expensive Requires a lot of
data for high accuracy Extremely hard to tune - Choice of
architecture - Parameter types - Hyperparameters - Learning
algorithm - Computational + so many choices = incredibly hard to
tune
47. 48 Can we do better? Input Learned hierarchy Output Lee et
al. Convolutional Deep Belief Networks for Scalable Unsupervised
Learning of Hierarchical Representations ICML 2009
48. Deep features: Deep learning + Transfer learning
49. 50 Transfer learning: Use data from one domain to help
learn on another Lots of data: Learn neural net Great accuracy Some
data: Neural net as feature extractor + Simple classifier Great
accuracy on new problem Old idea, explored for deep learning by
Donahue et al. 14
50. 51 Whats learned in a neural net Neural net trained for
Task 1 Very specific to Task 1More generic Can be used as feature
extractor vs.
51. 52 Transfer learning in more detail Neural net trained for
Task 1 Very specific to Task 1More generic Can be used as feature
extractor Keep weights fixed! For Task 2, learn only end part Use
simple classifier e.g., logistic regression, SVMs Class?
52. 53 Using ImageNet-trained network as extractor for general
features Using classic AlexNet architechture pioneered by Alex
Krizhevsky et. al in ImageNet Classification with Deep
Convolutional Neural Networks It turns out that a neural network
trained on ~1 million images of about 1000 classes makes a
surprisingly general feature extractor First illustrated by Donahue
et al in DeCAF: A Deep Convolutional Activation Feature for Generic
Visual Recognition 53
53. Transfer learning with deep features Training set
Validation set 80% 20% Learn simple model Some labeled data Extract
features with neural net trained on different task Validate Deploy
in production
54. In real life
55. What else can we do with Deep Features? 56
56. 57 Clustering images Goldberger et al. Set of Images
57. Finding similar images 58
58. 59 Finding similar images A B C A B C
59. Deploying ML in production
60. 67 Deployment system Model Historical Data Predictions
Batch training Real-time predictions REST Client Model Mgmt Dato
Predictive Services Robust, Elastic Direct Train deep model REST
API for predictions for new images, text,
61. Summary
62. Deep learning made easy with deep features Deep learning:
exciting ML development Slow, lots of tuning, needs lots of data
Can still achieve excellent performance Deep features: reuse deep
models for new domains Needs less data Faster training times Much
simpler tuning
63. 70 Next Build a Deep Learning model with GLC Creating an
Image Similarity Service Deploying with Predictive Services
64. 71 First things first, installation
http://bit.ly/dato_pydata import graphlab sf = graphlab.Sframe() sa
= graphlab.Sarray(range(100)).apply(lambda x: x)