
Introduction to Deep Learning

Quan Geng

Columbia University
October 25, 2019


Outline

● My background
● Motivation: Success of Deep Learning
● Basics of Deep Learning

○ Neural networks: Neuron, activation function
○ Optimizers: (Stochastic) Gradient Descent
○ Backpropagation
○ Convolutional Neural Network

● Applications of Deep Learning
○ Personal Photo Search, Search Ranking, Smart Reply, ...

● Summary

Reference

● Geoffrey Hinton, Yoshua Bengio, and Yann LeCun, Tutorial on Deep Learning

● Jeff Dean, Trends and Developments in Deep Learning Research
● Jeff Dean, Large-Scale Deep Learning with TensorFlow
● Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning (textbook)

● Google, Machine Learning Crash Course


My background

● 2013: PhD, ECE Dept., University of Illinois Urbana-Champaign
● 2014 - 2015: Quantitative Analyst, Tower Research, New York
● 2015 - now: Senior Software Engineer, Google Research, New York

● Homepage: https://dreaven.github.io/


Machine Learning Jobs


https://www.quanwei.tech/?job=machine+learning

Motivation: Success of Deep Learning


2018 ACM Turing Award

https://awards.acm.org/about/2018-turing

ACM named Yoshua Bengio, Geoffrey Hinton, and Yann LeCun recipients of the 2018 ACM A.M. Turing Award for conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.

In recent years, deep learning methods have been responsible for astonishing breakthroughs in computer vision, speech recognition, natural language processing, and robotics.

First major success of Neural Networks: AlexNet


ImageNet
● 15M labeled high-resolution images
● 22,000 categories

ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
● A subset of ImageNet
● 1,000 images in each of 1,000 categories
● 1.2M training images
● 50K validation images
● 150K testing images

Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, 2012

Neural networks dominate ImageNet competitions


Trend of Deep Learning papers

Evolution of the number of papers published on Deep Learning topics relative to those on Deep Learning in Bioinformatics (source link).

Deep Learning for High Frequency Trading

http://www.hudson-trading.com/careers/job/?gh_jid=940856

Basics of Deep Learning


https://cs.nyu.edu/~yann/talks/lecun-ranzato-icml2013.pdf


Neuron in the Human Brain

https://training.seer.cancer.gov/anatomy/nervous/tissue.html

Artificial Neuron in Deep Learning


Activation functions introduce non-linearity into the model. Commonly used:

● Sigmoid functions
● Rectified linear unit (ReLU)
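To make this concrete: a single artificial neuron computes a weighted sum of its inputs plus a bias, then applies an activation function. A minimal NumPy sketch (not from the slides; the inputs, weights, and bias are made-up values for illustration):

import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Keeps positive values, zeroes out negative ones.
    return np.maximum(0.0, z)

def neuron(x, w, b, activation):
    # One artificial neuron: activation(w . x + b).
    return activation(np.dot(w, x) + b)

# Made-up example values for illustration.
x = np.array([0.5, -1.2, 3.0])   # inputs
w = np.array([0.4, 0.1, -0.6])   # weights
b = 0.2                          # bias
print(neuron(x, w, b, sigmoid))  # a value in (0, 1)
print(neuron(x, w, b, relu))     # a value >= 0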

Feed-Forward Neural Networks (FFNN)

[Figure: input layer → hidden layers → output layer]
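A feed-forward network stacks layers of such neurons, with each layer feeding the next. As an illustrative sketch in Keras (the layer sizes below are arbitrary choices, not from the slides):

import tensorflow as tf

# A small FFNN: 4 inputs -> two hidden layers -> 1 output.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),  # hidden layer 1
    tf.keras.layers.Dense(8, activation="relu"),                     # hidden layer 2
    tf.keras.layers.Dense(1, activation="sigmoid"),                  # output layer
])
model.summary()  # prints layer shapes and parameter counts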

Neural network training: (Stochastic) Gradient Descent


● 1. Randomly initialize the weights in the neural network.
● 2. Given a batch of (or all) the input data, compute the predicted output.
● 3. Compute the loss between the actual output and the predicted output.
● 4. Compute the gradient of the loss with respect to each weight in the neural network.
● 5. Update the weights based on the gradients.

Repeat from step 2 until convergence.
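The five steps map directly onto code. A minimal sketch (a toy example, not from the slides) of full-batch gradient descent fitting a one-weight linear model with squared-error loss:

import numpy as np

# Toy data: y is roughly 3*x plus noise (made up for illustration).
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x + 0.1 * rng.normal(size=100)

w = rng.normal()                            # Step 1: random initialization
learning_rate = 0.1
for step in range(100):
    y_pred = w * x                          # Step 2: predicted output
    loss = np.mean((y_pred - y) ** 2)       # Step 3: mean squared error loss
    grad = np.mean(2.0 * (y_pred - y) * x)  # Step 4: gradient dLoss/dw
    w -= learning_rate * grad               # Step 5: gradient step
print(w)  # converges to roughly 3.0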

Neural network training: Backpropagation


Step 4: Compute the gradient of the loss with respect to each weight in the neural network.

video link
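In TensorFlow 2, this step is handled by automatic differentiation (reverse-mode autodiff, i.e., backpropagation). A minimal sketch, using a single made-up weight so the gradient can be checked by hand:

import tensorflow as tf

w = tf.Variable(2.0)     # one trainable weight
x, y_true = 3.0, 7.0     # made-up input and target

with tf.GradientTape() as tape:
    y_pred = w * x                   # forward pass: 2.0 * 3.0 = 6.0
    loss = (y_pred - y_true) ** 2    # squared error: (6 - 7)^2 = 1.0

# Backpropagation: dLoss/dw = 2 * (w*x - y) * x = 2 * (-1) * 3 = -6
print(tape.gradient(loss, w).numpy())  # -6.0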

Neural network training: Optimizers for gradient descent

https://towardsdatascience.com/types-of-optimization-algorithms-used-in-neural-networks-and-ways-to-optimize-gradient-95ae5d39529f
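The linked article surveys variants of plain gradient descent; only the weight-update rule differs between them. In Keras they are interchangeable one-liners (the learning rates below are common defaults, shown for illustration only):

import tensorflow as tf

sgd = tf.keras.optimizers.SGD(learning_rate=0.01)
momentum = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
adam = tf.keras.optimizers.Adam(learning_rate=0.001)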

Properties of neural network training


Properties of gradient descent algorithms:
● Convergence to the global minimum is not guaranteed
● Different random initializations converge to different local minima
● But it performs very well in practice

Techniques to avoid overfitting (see the sketch below):
● Weight regularization
● Dropout
● Early stopping

video link
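Each of the three techniques above has a direct Keras counterpart. A sketch under assumed, illustrative hyperparameters (the layer sizes, L2 strength, and dropout rate are not from the slides):

import tensorflow as tf

model = tf.keras.Sequential([
    # Weight regularization: add an L2 penalty on this layer's weights.
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,),
                          kernel_regularizer=tf.keras.regularizers.l2(1e-4)),
    # Dropout: randomly zero 50% of activations during training.
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Early stopping: halt training once validation loss stops improving.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3)
# model.fit(x_train, y_train, validation_split=0.2, callbacks=[early_stop])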

Convolutional Neural Network


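As an illustrative sketch (the filter counts and kernel sizes are arbitrary choices, not from the slides), a minimal convolutional network in Keras:

import tensorflow as tf

# A small CNN for 28x28 grayscale images.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu",
                           input_shape=(28, 28, 1)),    # learn 32 local filters
    tf.keras.layers.MaxPooling2D((2, 2)),               # downsample feature maps
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),                          # feature maps -> vector
    tf.keras.layers.Dense(10, activation="softmax"),    # 10-class output
])

Convolutional layers exploit the spatial structure of images: each filter is applied across the whole image, so weights are shared and the number of parameters stays small compared to a fully connected layer.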

Applications of Deep Learning


Deep Learning Frameworks

https://medium.com/@NirantK/the-silent-rise-of-pytorch-ecosystem-693e74b33f1e

TensorFlow


TensorFlow is an end-to-end open-source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and developers easily build and deploy ML-powered applications. Released by Google Brain in 2015.

TensorFlow Example: MNIST


Link
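A representative TensorFlow/Keras MNIST classifier along these lines (the hyperparameters are typical defaults, not necessarily those shown in the original slides):

import tensorflow as tf

# Load and normalize MNIST: 60K training and 10K test images of digits 0-9.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),    # 28x28 image -> 784 vector
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),  # one probability per digit
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)  # roughly 97-98% test accuracy is typical here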


Summary


● Motivation: Success of Deep Learning
● Basics of Deep Learning (DL)

○ Neural networks (NN): Neuron, activation function
○ Optimizers: (Stochastic) Gradient Descent
○ Backpropagation
○ Convolutional Neural Network

● Applications of Deep Learning
○ TensorFlow
○ Personal Photo Search, Search Ranking, Smart Reply, and more

● Advanced topics (not covered in this lecture)
○ Recurrent neural networks
○ Sequence models
○ word2vec (text embeddings)
○ Advanced optimizers
○ Autoencoders
○ Generative adversarial networks

Thank you!
