
Introduction to Deep Learning

Quan Geng

Columbia University
October 25, 2019


Outline

● My background
● Motivation: Success of Deep Learning
● Basics of Deep Learning

○ Neural networks: Neuron, activation function
○ Optimizers: (Stochastic) Gradient Descent
○ Backpropagation
○ Convolutional Neural Network

● Applications of Deep Learning
○ Personal Photo Search, Search Ranking, Smart Reply, ...

● Summary

Reference

● Geoffrey Hinton, Yoshua Bengio, and Yann LeCun, Tutorial on Deep Learning

● Jeff Dean, Trends and Developments in Deep Learning Research
● Jeff Dean, Large-Scale Deep Learning with TensorFlow
● Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning (textbook)

● Google, Machine Learning Crash Course


My background

● 2013: PhD, ECE Dept., University of Illinois Urbana-Champaign
● 2014 - 2015: Quantitative Analyst, Tower Research, New York
● 2015 - now: Senior Software Engineer, Google Research, New York

● Homepage: https://dreaven.github.io/


Machine Learning Jobs


https://www.quanwei.tech/?job=machine+learning

Motivation: Success of Deep Learning


2018 ACM Turing Award

https://awards.acm.org/about/2018-turing

ACM named Yoshua Bengio, Geoffrey Hinton, and Yann LeCun recipients of the 2018 ACM A.M. Turing Award for conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.

In recent years, deep learning methods have been responsible for astonishing breakthroughs in computer vision, speech recognition, natural language processing, and robotics.

First major success of Neural Networks: AlexNet


ImageNet
● 15M labeled high-resolution images
● 22,000 categories

ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
● A subset of ImageNet
● 1,000 images in each of 1,000 categories
● 1.2M training images
● 50K validation images
● 150K testing images

Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, 2012

Neural networks dominate ImageNet competitions


Trend of Deep Learning papers

Evolution of the number of papers published on Deep Learning topics relative to those on Deep Learning in Bioinformatics (source link).

Deep Learning for High Frequency Trading

http://www.hudson-trading.com/careers/job/?gh_jid=940856

Basics of Deep Learning


https://cs.nyu.edu/~yann/talks/lecun-ranzato-icml2013.pdf


Neuron in the Human Brain

https://training.seer.cancer.gov/anatomy/nervous/tissue.html

Artificial Neuron in Deep Learning


Activation functions introduce non-linearity into the model. Commonly used:

● Sigmoid functions
● Rectified linear unit (ReLU)
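To make this concrete: a single artificial neuron computes a weighted sum of its inputs plus a bias, then applies an activation function. A minimal NumPy sketch (not from the slides; the inputs, weights, and bias are made-up values for illustration):

import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Keeps positive values, zeroes out negative ones.
    return np.maximum(0.0, z)

def neuron(x, w, b, activation):
    # One artificial neuron: activation(w . x + b).
    return activation(np.dot(w, x) + b)

# Made-up example values for illustration.
x = np.array([0.5, -1.2, 3.0])   # inputs
w = np.array([0.4, 0.1, -0.6])   # weights
b = 0.2                          # bias
print(neuron(x, w, b, sigmoid))  # a value in (0, 1)
print(neuron(x, w, b, relu))     # a value >= 0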

Feed-Forward Neural Networks (FFNN)

[Figure: input layer → hidden layers → output layer]
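A feed-forward network stacks layers of such neurons, with each layer feeding the next. As an illustrative sketch in Keras (the layer sizes below are arbitrary choices, not from the slides):

import tensorflow as tf

# A small FFNN: 4 inputs -> two hidden layers -> 1 output.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),  # hidden layer 1
    tf.keras.layers.Dense(8, activation="relu"),                     # hidden layer 2
    tf.keras.layers.Dense(1, activation="sigmoid"),                  # output layer
])
model.summary()  # prints layer shapes and parameter counts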

Neural network training: (Stochastic) Gradient Descent


● 1. Randomly initialize the weights in the neural network.
● 2. Given a batch of (or all) the input data, compute the predicted output.
● 3. Compute the loss between the actual output and the predicted output.
● 4. Compute the gradient of the loss with respect to each weight in the neural network.
● 5. Update the weights based on the gradients.

Repeat from step 2 until convergence.
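The five steps map directly onto code. A minimal sketch (a toy example, not from the slides) of full-batch gradient descent fitting a one-weight linear model with squared-error loss:

import numpy as np

# Toy data: y is roughly 3*x plus noise (made up for illustration).
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x + 0.1 * rng.normal(size=100)

w = rng.normal()                            # Step 1: random initialization
learning_rate = 0.1
for step in range(100):
    y_pred = w * x                          # Step 2: predicted output
    loss = np.mean((y_pred - y) ** 2)       # Step 3: mean squared error loss
    grad = np.mean(2.0 * (y_pred - y) * x)  # Step 4: gradient dLoss/dw
    w -= learning_rate * grad               # Step 5: gradient step
print(w)  # converges to roughly 3.0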

Neural network training: Backpropagation


Step 4: Compute the gradient of the loss with respect to each weight in the neural network.

video link
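In TensorFlow 2, this step is handled by automatic differentiation (reverse-mode autodiff, i.e., backpropagation). A minimal sketch, using a single made-up weight so the gradient can be checked by hand:

import tensorflow as tf

w = tf.Variable(2.0)     # one trainable weight
x, y_true = 3.0, 7.0     # made-up input and target

with tf.GradientTape() as tape:
    y_pred = w * x                   # forward pass: 2.0 * 3.0 = 6.0
    loss = (y_pred - y_true) ** 2    # squared error: (6 - 7)^2 = 1.0

# Backpropagation: dLoss/dw = 2 * (w*x - y) * x = 2 * (-1) * 3 = -6
print(tape.gradient(loss, w).numpy())  # -6.0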

Neural network training: Optimizers for gradient descent

https://towardsdatascience.com/types-of-optimization-algorithms-used-in-neural-networks-and-ways-to-optimize-gradient-95ae5d39529f
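The linked article surveys variants of plain gradient descent; only the weight-update rule differs between them. In Keras they are interchangeable one-liners (the learning rates below are common defaults, shown for illustration only):

import tensorflow as tf

sgd = tf.keras.optimizers.SGD(learning_rate=0.01)
momentum = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
adam = tf.keras.optimizers.Adam(learning_rate=0.001)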

Properties of neural network training


Properties of gradient descent algorithms:
● Convergence to the global minimum is not guaranteed
● Different random initializations converge to different local minima
● But it performs very well in practice

Techniques to avoid overfitting (see the sketch below):
● Weight regularization
● Dropout
● Early stopping

video link
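Each of the three techniques above has a direct Keras counterpart. A sketch under assumed, illustrative hyperparameters (the layer sizes, L2 strength, and dropout rate are not from the slides):

import tensorflow as tf

model = tf.keras.Sequential([
    # Weight regularization: add an L2 penalty on this layer's weights.
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,),
                          kernel_regularizer=tf.keras.regularizers.l2(1e-4)),
    # Dropout: randomly zero 50% of activations during training.
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Early stopping: halt training once validation loss stops improving.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3)
# model.fit(x_train, y_train, validation_split=0.2, callbacks=[early_stop])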

Convolutional Neural Network


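As an illustrative sketch (the filter counts and kernel sizes are arbitrary choices, not from the slides), a minimal convolutional network in Keras:

import tensorflow as tf

# A small CNN for 28x28 grayscale images.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu",
                           input_shape=(28, 28, 1)),    # learn 32 local filters
    tf.keras.layers.MaxPooling2D((2, 2)),               # downsample feature maps
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),                          # feature maps -> vector
    tf.keras.layers.Dense(10, activation="softmax"),    # 10-class output
])

Convolutional layers exploit the spatial structure of images: each filter is applied across the whole image, so weights are shared and the number of parameters stays small compared to a fully connected layer.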

Applications of Deep Learning


Deep Learning Frameworks

https://medium.com/@NirantK/the-silent-rise-of-pytorch-ecosystem-693e74b33f1e

TensorFlow


TensorFlow is an end-to-end open-source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and developers easily build and deploy ML-powered applications. Released by Google Brain in 2015.

TensorFlow Example: MNIST


Link
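A representative TensorFlow/Keras MNIST classifier along these lines (the hyperparameters are typical defaults, not necessarily those shown in the original slides):

import tensorflow as tf

# Load and normalize MNIST: 60K training and 10K test images of digits 0-9.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),    # 28x28 image -> 784 vector
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),  # one probability per digit
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)  # roughly 97-98% test accuracy is typical here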


Summary


● Motivation: Success of Deep Learning
● Basics of Deep Learning (DL)

○ Neural networks (NN): Neuron, activation function
○ Optimizers: (Stochastic) Gradient Descent
○ Backpropagation
○ Convolutional Neural Network

● Applications of Deep Learning
○ TensorFlow
○ Personal Photo Search, Search Ranking, Smart Reply, and more

● Advanced topics (not covered in this lecture)
○ Recurrent neural networks
○ Sequence models
○ word2vec (text embeddings)
○ Advanced optimizers
○ Autoencoders
○ Generative adversarial networks

Thank you!
