Intro to Deep Learning w/ Clojure
It’s Difficult to Make Predictions. Especially About the Future.
@JulioBarros
Consultant
E-String.com
@JulioBarros http://E-String.com 1
Hypothesis
Given that Python dominates ML/DL, is there a valid use case for Clojure?
TL;DR: It depends.
I Hope So
... and I'm not the only one.
Hype or Reality?
"It is a renaissance, it is a golden age,"
"Machine learning and AI is a horizontal enabling layer. It will empower and improve every business, every government organization, every philanthropy — basically there's no institution in the world that cannot be improved with machine learning."— Bezos
Every industry can expect to be transformed by Artificial Intelligence
Image Classification
Justin Johnson, Andrej Karpathy, Li Fei-Fei - Stanford
My dog Rex
Object Detection
Image Captioning
Dense Captioning
Healthcare
Near or better than human-level performance.
Business
— Law & Finance
— Text, audio, image, video understanding
— Churn prediction, customer segmentation
— Product recommendations
— Manufacturing, maintenance and control
— Many many more
Artificial Intelligence
Artificial Intelligence (AI) - the study of "intelligent agents". Reasoning, knowledge representation, planning, robotics, etc.
— Artificial Narrow Intelligence (ANI)
— Artificial General Intelligence (AGI)
— Artificial Superintelligence (ASI)
Machine Learning
Machine Learning (ML) - Programs that learn from the data and make predictions.
Types of prediction
Regression - continuous values
Classification - discrete values/labels
Types of ML Algorithms
Supervised - trained on labeled data (regression or classification).
Unsupervised - trained on unlabeled data (clustering, segmentation).
Reinforcement - learns based on outcomes/results of actions.
Deep Learning
Deep Learning (DL) - ML/AI using artificial neural networks (ANNs)
You might be thinking ...
What? Isn't that just a neural net? Didn't we show those don't work? Twice.
Third time is a charm
Capabilities of neural nets have changed dramatically due to advancements in:
— Data
— Algorithms
— Hardware
Neurons: Biologically inspired
1943 McCulloch and Pitts
1957 Rosenblatt
(A (+ b (apply + (map * x w))))
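The neuron expression can be tried at a REPL. A minimal sketch in plain Clojure, assuming `A` is the activation function, `x` the inputs, `w` the weights and `b` the bias. The name `neuron` and the sample numbers are made up for illustration.

```clojure
;; A single artificial neuron: weighted sum of the inputs plus a bias,
;; passed through an activation function.
;; `neuron` and the sample values are illustrative, not from the talk.
(defn neuron
  [activation w b x]
  (activation (+ b (apply + (map * x w)))))

;; With the identity activation the neuron is just a linear model:
;; 1.0 + (2.0*0.5 + 3.0*-1.0 + 1.0*2.0) = 1.0
(neuron identity [0.5 -1.0 2.0] 1.0 [2.0 3.0 1.0])
;; => 1.0
```

Swapping `identity` for a non-linear function is what turns this from linear regression into a building block for a neural net.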
Activation Function
Introduces non-linearity
— Historically: Identity, Step, Tanh, Sigmoid
— Currently: Rectified Linear Unit (relu), Softmax
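The activation functions named above are each only a line or two of code. A sketch in plain Clojure, written by hand for illustration rather than taken from Cortex or any other library:

```clojure
;; Common activation functions (illustrative, hand-written versions).
(defn step    [z] (if (pos? z) 1.0 0.0))
(defn sigmoid [z] (/ 1.0 (+ 1.0 (Math/exp (- z)))))
(defn tanh    [z] (Math/tanh z))
(defn relu    [z] (max 0.0 z))

;; Softmax turns a vector of scores into probabilities that sum to 1.
;; Subtracting the largest score first avoids overflow in Math/exp.
(defn softmax [zs]
  (let [m     (apply max zs)
        exps  (map #(Math/exp (- % m)) zs)
        total (apply + exps)]
    (mapv #(/ % total) exps)))

(relu -2.0)         ; => 0.0
(relu 3.0)          ; => 3.0
(sigmoid 0.0)       ; => 0.5
(softmax [1.0 1.0]) ; => [0.5 0.5]
```

Relu is typically used between hidden layers, softmax on the output of a classifier.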
Universal Approximation Theorem (1989)
... a feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions ...
Deep Neural Nets
A net with more than one hidden layer.
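A rough sketch of what "more than one hidden layer" means in code: each layer's output becomes the next layer's input. This is plain Clojure, not Cortex. The layer map keys, the `dense` and `forward` names and all the weights are assumptions made up for the example.

```clojure
;; A dense layer maps an input vector to one output per neuron.
;; Each layer is a map with :weights (one row per neuron), :biases
;; and an :activation function. All names and numbers are illustrative.
(defn relu [z] (max 0.0 z))

(defn dense [{:keys [weights biases activation]} x]
  (mapv (fn [w b] (activation (+ b (apply + (map * x w)))))
        weights biases))

;; A "deep" net is just layers applied one after another.
(defn forward [layers x]
  (reduce (fn [activations layer] (dense layer activations)) x layers))

(def tiny-net
  [{:weights [[1.0 -1.0] [0.5 0.5]] :biases [0.0 0.0] :activation relu}      ; hidden 1
   {:weights [[1.0 1.0] [-1.0 1.0]] :biases [0.0 0.5] :activation relu}      ; hidden 2
   {:weights [[2.0 1.0]]            :biases [0.5]     :activation identity}]) ; output

;; [1.0 2.0] -> [0.0 1.5] -> [1.5 2.0] -> [5.5]
(forward tiny-net [1.0 2.0])
;; => [5.5]
```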
VGG16
GoogLeNet
Training
1) Initialize weights randomly
2) Make prediction
3) Measure error (loss)
4) Adjust weights in the right direction
5) GOTO 2
Repeat over and over and over again with your training data
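The loop above, sketched for the simplest possible model: one weight and one bias fitted with gradient descent. The data, learning rate and step count are made up, and the weights start at zero instead of random values so the result is repeatable.

```clojure
;; Training loop for a linear model y = w*x + b, using mean squared
;; error and plain gradient descent. All values are illustrative.
(def data [[1.0 3.0] [2.0 5.0] [3.0 7.0] [4.0 9.0]]) ; generated from y = 2x + 1

;; 2) Make prediction
(defn predict [{:keys [w b]} x] (+ (* w x) b))

;; 3) Measure error (loss)
(defn loss [params]
  (/ (apply + (map (fn [[x y]] (Math/pow (- (predict params x) y) 2)) data))
     (count data)))

;; 4) Adjust weights in the right direction
(defn gradient-step [{:keys [w b] :as params} learning-rate]
  (let [n      (count data)
        errors (map (fn [[x y]] (- (predict params x) y)) data)
        ;; gradients of the mean squared error w.r.t. w and b
        dw     (/ (* 2.0 (apply + (map (fn [e [x _]] (* e x)) errors data))) n)
        db     (/ (* 2.0 (apply + errors)) n)]
    {:w (- w (* learning-rate dw))
     :b (- b (* learning-rate db))}))

;; 5) GOTO 2, a fixed number of times
(defn train [params learning-rate steps]
  (nth (iterate #(gradient-step % learning-rate) params) steps))

;; 1) Initialize weights (zeros here, for a repeatable run)
(def trained (train {:w 0.0 :b 0.0} 0.05 2000))
;; (:w trained) ends up close to 2.0, (:b trained) close to 1.0
```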
Backpropagation
Finding the right direction.
Backpropagation
To find the right direction, calculate the gradient of the loss function with respect to each weight. Multiply the gradient by a learning rate and subtract the result (the delta) from the weight.
Don't worry. The libraries do it for you.
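For the curious, here is backpropagation on the smallest possible "network": one sigmoid neuron with one weight and a squared-error loss. It is a hand-rolled sketch with made-up names, checked against a numerical gradient. Real libraries do the same bookkeeping across every layer.

```clojure
;; One sigmoid neuron, loss = (prediction - target)^2.
;; Names and numbers are illustrative.
(defn sigmoid [z] (/ 1.0 (+ 1.0 (Math/exp (- z)))))

(defn loss [w b x target]
  (let [p (sigmoid (+ (* w x) b))]
    (* (- p target) (- p target))))

;; Chain rule: dL/dw = dL/dp * dp/dz * dz/dw
(defn gradient-w [w b x target]
  (let [p     (sigmoid (+ (* w x) b))
        dL-dp (* 2.0 (- p target))   ; derivative of the squared error
        dp-dz (* p (- 1.0 p))        ; derivative of the sigmoid
        dz-dw x]                     ; derivative of w*x + b w.r.t. w
    (* dL-dp dp-dz dz-dw)))

;; Sanity check: nudge the weight a little and watch the loss change.
(defn numeric-gradient-w [w b x target]
  (let [h 1e-6]
    (/ (- (loss (+ w h) b x target) (loss (- w h) b x target))
       (* 2.0 h))))

;; One update: move against the gradient, scaled by the learning rate.
(defn update-w [w b x target learning-rate]
  (- w (* learning-rate (gradient-w w b x target))))
```

Calling `(gradient-w 0.5 0.1 2.0 1.0)` and `(numeric-gradient-w 0.5 0.1 2.0 1.0)` should give nearly identical numbers, and the loss at `(update-w 0.5 0.1 2.0 1.0 0.1)` should be lower than at the starting weight.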
Millions of Knobs (Parameters)
A Simple Neural Net in Cortex
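;; 11 input features -> 64-unit hidden layer with relu -> 1 output
(require '[cortex.nn.layers :as layers])
(def network-architecture
  [(layers/input 11 1 1 :id :x)
   (layers/linear 64) (layers/relu)
   (layers/linear 1 :id :y)])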
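To get a feel for how many knobs even this small net has, the parameters can be counted by hand. A sketch in plain Clojure, assuming each linear layer has one weight per input/output pair plus one bias per output. `parameter-count` and `layer-sizes` are made-up names, not part of Cortex.

```clojure
;; Count the trainable parameters of a stack of fully connected layers.
;; Each layer has (inputs * outputs) weights plus one bias per output.
;; `layer-sizes` is a plain vector of layer widths, not a Cortex structure.
(defn parameter-count [layer-sizes]
  (apply + (map (fn [n-in n-out] (+ (* n-in n-out) n-out))
                layer-sizes
                (rest layer-sizes))))

;; 11 inputs -> 64 hidden units -> 1 output:
;; (11*64 + 64) + (64*1 + 1) = 768 + 65
(parameter-count [11 64 1])
;; => 833
```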
Wine Quality Data
11 features, 1 target column
UCI Machine Learning Repository
https://archive.ics.uci.edu/ml/datasets/Wine+Quality
Source: Paulo Cortez, University of Minho, Guimarães, Portugal, http://www3.dsi.uminho.pt/pcortez
A. Cerdeira, F. Almeida, T. Matos and J. Reis, Viticulture Commission of the Vinho Verde Region (CVRVV), Porto, Portugal @2009
Normalized / Scaled data (Standardized)
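Standardizing a feature column means subtracting its mean and dividing by its standard deviation. A minimal sketch in plain Clojure, using the population standard deviation and made-up sample values rather than actual rows from the wine data:

```clojure
;; Standardize one feature column so it ends up with mean 0 and
;; standard deviation 1. Helper names are illustrative.
(defn mean [xs] (/ (apply + xs) (double (count xs))))

(defn std-dev [xs]
  (let [m (mean xs)]
    (Math/sqrt (mean (map #(* (- % m) (- % m)) xs)))))

(defn standardize [xs]
  (let [m (mean xs)
        s (std-dev xs)]
    (mapv #(/ (- % m) s) xs)))

;; Made-up alcohol percentages, not rows from the wine data set.
(standardize [9.0 10.0 11.0])
;; the middle value maps to 0.0, the outer two to roughly -1.22 and 1.22
```

In practice the mean and standard deviation would be computed on the training data only and reused for validation and production inputs.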
Demo
https://github.com/thinktopic/cortex
https://github.com/JulioBarros/clj-dl-demo
How Do We Work With Images
Well, images are just numbers/data.
Though numbers close to each other are more related.
Convolutional Layers
Similar to correlations from signal processing or filters from Photoshop.
A small NxN filter is slid over the image and convolved/correlated with it. The filter values are learned, so each filter learns to find a feature.
Then lower level features are combined into higher level features.
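Sliding a filter over an image fits in a few lines. A sketch in plain Clojure of "valid" cross-correlation (no padding, stride 1) on a vector of row vectors. The filter here is hand-picked for illustration. In a convolutional layer its values would be learned.

```clojure
;; Slide a small filter over a 2D image and take the sum of element-wise
;; products at each position. Names and values are illustrative.
(defn correlate2d [image kernel]
  (let [kh (count kernel)
        kw (count (first kernel))
        oh (inc (- (count image) kh))
        ow (inc (- (count (first image)) kw))]
    (vec (for [r (range oh)]
           (vec (for [c (range ow)]
                  (apply + (for [i (range kh)
                                 j (range kw)]
                             (* (get-in image [(+ r i) (+ c j)])
                                (get-in kernel [i j]))))))))))

;; A vertical-edge filter: responds where left and right neighbours differ.
(def edge-filter [[-1 1]
                  [-1 1]])

(def image [[0 0 1 1]
            [0 0 1 1]
            [0 0 1 1]])

(correlate2d image edge-filter)
;; => [[0 2 0] [0 2 0]]
```

The output is largest exactly where the dark-to-light edge sits, which is the sense in which a filter "finds a feature".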
Common Types of Layers
— input / output
— fully connected (dense)
— activation - relu, softmax
— convolutional
— maxpool
— flatten
— dropout
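Two of these layers have no weights at all and are easy to write by hand. A sketch of 2x2 max pooling and flatten in plain Clojure, assuming the image is a vector of row vectors with even width and height. The function names and the sample values are made up.

```clojure
;; 2x2 max pooling with stride 2: keep the largest value in each 2x2
;; block, halving width and height. Assumes even image dimensions.
(defn maxpool-2x2 [image]
  (vec (for [r (range 0 (count image) 2)]
         (vec (for [c (range 0 (count (first image)) 2)]
                (max (get-in image [r c])
                     (get-in image [r (inc c)])
                     (get-in image [(inc r) c])
                     (get-in image [(inc r) (inc c)])))))))

;; Flatten turns the 2D grid into one long vector for a dense layer.
(defn flatten-image [image]
  (vec (apply concat image)))

(def feature-map [[1 3 2 0]
                  [4 2 1 1]
                  [0 0 5 6]
                  [1 2 7 8]])

(maxpool-2x2 feature-map)                 ; => [[4 2] [2 8]]
(flatten-image (maxpool-2x2 feature-map)) ; => [4 2 2 8]
```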
Types of ANN
1. Dense Neural Net (DNN)
2. Convolutional Neural Net (CNN)
3. Recurrent Neural Net (RNN)
4. Everything else
Programming Abstractions
Level | Python | CLJ/JVM
DSL/API | Keras, Lasagne, TF-Slim | Cortex, ???
Computation Graph, Backprop, Autograd | Tensorflow, Theano, Torch, Pytorch, Caffe, MXNet, CNTK | Cortex, dl4j, Java bindings
Matrix Math | CUDA (cuDNN), Eigen3 | Neanderthal, core.matrix, vectorz-clj
Production Considerations
— Target hardware environment
  — GPU(s)
  — Powerful multicore server
  — Mobile device
  — Embedded
— Desired latency / scalability
— Data pipeline
— Software engineering practices
Don't Underestimate the Last Two
We need to consider running and maintaining the models in production.
Approaches
— Python based environment
— Clojure based environment
— Generate C++ binaries
— API calls to third party
— API calls to microservices
Challenges with DL
— Needs lots of data. Labeled data is expensive.
— Lacks explainability
— Performance requirements - training and inference
— Max performance unclear
— Best architecture unclear
Benefits of DL
— Handles much of the feature engineering
— Handles complex (non-linear) problems
— Advancements coming quickly
Recommendations
Do not be intimidated by ANNs or the math.
Start with Keras (and TensorFlow or Theano) tutorials (or maybe PyTorch).
Later choose language/framework as needs dictate.
Resources
Andrew Ng's Coursera Course and Fast.AI MOOC
Deep Learning Book - Goodfellow, Bengio and Courville
Meetups
— Portland-Data-Science-Group
— Portland-Machine-Learning-Meetup
— Portland-Deep-Learning
Thank you! Questions?
[email protected] @JulioBarros