Convolutional and Recurrent Neural Networks

Part 2

Morteza Chehreghani
Chalmers University of Technology

References

• The slides have been prepared based on:
  • Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, Joel S. Emer, Efficient Processing of Deep Neural Networks: A Tutorial and Survey. Proceedings of the IEEE 105(12): 2295-2329 (2017)
  • and the slides at: http://www.rle.mit.edu/eems/wp-content/uploads/2017/06/ISCA-2017-Hardware-Architectures-for-DNN-Tutorial.pdf

• https://github.com/hunkim/PyTorchZeroToAll [for PyTorch]

The first DNN application

• Image Classification Task:
  • 1.2M training images
  • 1000 classes

• Object Detection Task:
  • 456k training images
  • 200 classes

LeNet-5

AlexNet

AlexNet Convolutional Layer Configurations

VGG-16

CONV Layers: 13
Fully Connected Layers: 3
Weights: 138M
MACs: 15.5G

Also, 19 layer version

Image Source: http://www.cs.toronto.edu/~frossard/post/vgg16/
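As a rough check on the VGG-16 figures above, the sketch below recounts weights and MACs layer by layer. It assumes the standard VGG-16 configuration (224x224x3 input, 3x3 CONV kernels with stride 1 and padding 1, a 2x2 max-pool after each block, FC sizes 4096-4096-1000) and ignores biases. None of these details are stated on the slide, and the names VGG16_CFG, FC_SIZES and count_vgg16 are illustrative.

```python
# Assumed VGG-16 layout: output channels of the 13 CONV layers;
# "M" marks a 2x2 max-pool that halves the feature-map height and width.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]  # assumed sizes of the 3 fully connected layers


def count_vgg16(input_size=224, in_channels=3, kernel=3):
    """Return (weights, macs) for the assumed VGG-16 layout, biases ignored."""
    weights = 0
    macs = 0
    size, channels = input_size, in_channels
    for item in VGG16_CFG:
        if item == "M":
            size //= 2  # pooling has no weights and no MACs
            continue
        layer_weights = kernel * kernel * channels * item
        weights += layer_weights
        # With padding 1 and stride 1 the output map is size x size,
        # and every output position reuses all of the layer's weights once.
        macs += layer_weights * size * size
        channels = item
    features = channels * size * size  # flattened input to the first FC layer
    for out_features in FC_SIZES:
        weights += features * out_features
        macs += features * out_features  # one MAC per FC weight
        features = out_features
    return weights, macs


total_weights, total_macs = count_vgg16()
print(f"weights: {total_weights / 1e6:.1f}M, MACs: {total_macs / 1e9:.2f}G")
```

Under these assumptions the totals come to roughly 138.3M weights and 15.47G MACs, consistent with the 138M and 15.5G quoted above. Most of the weights sit in the fully connected layers, while most of the MACs sit in the CONV layers.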

GoogLeNet (v1)

[Szegedy et al., arXiv 2014, CVPR 2015]

CONV Layers: 21 (depth), 57 (total)
Fully Connected Layers: 1
Weights: 7.0M
MACs: 1.43G

Also, v2, v3 and v4
ILSVRC14 Winner

ResNet-50

CONV Layers: 49
Fully Connected Layers: 1
Weights: 25.5M
MACs: 3.9G

Also, 34, 152 and 1202 layer versions
ILSVRC15 Winner

[He et al., arXiv 2015, CVPR 2016]
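The ResNet-50 layer counts can be reproduced with a little arithmetic. This sketch assumes the usual bottleneck layout (one initial CONV layer, then four stages of 3, 4, 6 and 3 bottleneck blocks with three CONV layers each) and does not count the projection shortcuts. These details come from the ResNet paper rather than from the slide, and the constant names are illustrative.

```python
# Assumed ResNet-50 layout (not stated on the slide).
STEM_CONV_LAYERS = 1              # initial CONV layer
BLOCKS_PER_STAGE = [3, 4, 6, 3]   # bottleneck blocks in each of the four stages
CONVS_PER_BOTTLENECK = 3          # CONV layers inside one bottleneck block
FC_LAYERS = 1

conv_layers = STEM_CONV_LAYERS + CONVS_PER_BOTTLENECK * sum(BLOCKS_PER_STAGE)
depth = conv_layers + FC_LAYERS
print(f"CONV layers: {conv_layers}, total depth: {depth}")
```

The count of weighted layers (CONV plus FC) is where the "50" in the name comes from.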

Revolution of Depth

http://icml.cc/2016/tutorials/icml2016_tutorial_deep_residual_networks_kaiminghe.pdf

Summary of Popular DNNs

Image Classification Datasets

Image Classification/Recognition
– Given an entire image -> Select 1 of N classes
– No localization (detection)

Image Source: Stanford cs231n

MNIST

Digit Classification
28x28 pixels (B&W)
10 Classes
60,000 Training
10,000 Testing

LeNet in 1998 (0.95% error)

ICML 2013 (0.21% error)

http://yann.lecun.com/exdb/mnist/

CIFAR-10/CIFAR-100

Image Source: http://karpathy.github.io/

Object Classification

32x32 pixels (color)
10 or 100 Classes
50,000 Training
10,000 Testing

ImageNet

Object Classification

~256x256 pixels (color)
1000 Classes
1.3M Training
100,000 Testing (50,000 Validation)

http://www.image-net.org/challenges/LSVRC/

ImageNet

Fine-grained Classes (120 breeds)

Winner 2012: Top-5 Error (16.42% error)

Winner 2016: Top-5 Error (2.99% error)
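To make the top-5 error metric concrete: a prediction counts as correct if the true class is among the k highest-scoring classes, and the error is the fraction of images for which it is not. The following is a minimal pure-Python sketch; the function name top_k_error and the toy scores are illustrative, not taken from the slides.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is NOT among the k top-scoring classes.

    scores: list of per-sample lists of class scores
    labels: list of true class indices, one per sample
    """
    wrong = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            wrong += 1
    return wrong / len(labels)


# Toy example: 3 samples, 4 classes (k=2 here because there are only 4 classes).
toy_scores = [
    [0.10, 0.50, 0.30, 0.05],
    [0.70, 0.15, 0.10, 0.05],
    [0.20, 0.25, 0.50, 0.05],
]
toy_labels = [2, 2, 3]
print(top_k_error(toy_scores, toy_labels, k=1))
print(top_k_error(toy_scores, toy_labels, k=2))
```

Increasing k can only lower the error, which is why top-5 error is always at most the top-1 error for the same model.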

Image Classification Datasets - Summary

http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html

Next Tasks: Localization and Detection

[Russakovsky et al., IJCV, 2015]

Other Popular Datasets

• Pascal VOC
  – 11k images
  – Object Detection
  – 20 classes
  http://host.robots.ox.ac.uk/pascal/VOC/

• MS COCO
  – 300k images
  – Detection, Segmentation
  – Recognition in context
  http://mscoco.org/

Recently Introduced Datasets

• Google Open Images (~9M images)
  • https://github.com/openimages/dataset

• Youtube-8M (8M videos)
  • https://research.google.com/youtube8m/

• AudioSet (2M sound clips)
  • https://research.google.com/audioset/index.html

Summary of Deep Learning for Images

• Image Classification
• Object Localization
• Object Detection
• Image Segmentation
• Action Recognition
• Image Generation

Deep Learning for Speech

• Speech Recognition
• Natural Language Processing
• Speech Translation
• Audio Generation

Deep Learning on Games

Google DeepMind AlphaGo

Medical Applications of Deep Learning

Brain Cancer Detection

Image Source: [Jermyn et al., JBO 2016]

Deep Learning for Self-driving Cars

Mature Applications

• Image
  • Classification: image to object class
  • Recognition: same as classification (except for faces)
  • Detection: assigning bounding boxes to objects
  • Segmentation: assigning object class to every pixel

• Speech & Language
  • Speech Recognition: audio to text
  • Translation
  • Natural Language Processing: text to meaning
  • Audio Generation: text to audio

• Games

Emerging Applications

• Medical (Cancer Detection, Pre-Natal)
• Finance (Trading, Energy Forecasting, Risk)
• Infrastructure (Structure Safety and Traffic)
• Weather Forecasting and Event Detection

http://www.nextplatform.com/2016/09/14/next-wave-deep-learning-applications/

Opportunities

• $500B Market over 10 Years!

Frameworks

Benefits of Frameworks

• Rapid development
• Sharing models
• Workload profiling
• Network hardware co-design

• PyTorch is a Python package that provides two high-level features:
  • Tensor computation (like NumPy) with strong GPU acceleration
  • Deep Neural Networks built on a tape-based autograd system

• http://pytorch.org/about/

• To learn basics: https://github.com/hunkim/PyTorchZeroToAll
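The two PyTorch features listed above can be illustrated with a minimal sketch: tensors behave much like NumPy arrays but can live on a GPU, and autograd records the operations applied to them so that gradients can be computed with backward(). The data values and variable names below are made up for illustration.

```python
import torch

# Use the GPU if one is available; the same code runs unchanged on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensor computation: element-wise arithmetic, as with NumPy arrays.
x = torch.tensor([1.0, 2.0, 3.0], device=device)
y = torch.tensor([2.0, 4.0, 6.0], device=device)

# A single trainable weight; requires_grad=True asks autograd to track it.
w = torch.tensor(1.0, device=device, requires_grad=True)

# Forward pass: squared-error loss of the linear model y_pred = x * w.
loss = ((x * w - y) ** 2).sum()

# Backward pass: autograd replays the recorded operations in reverse
# and stores d(loss)/d(w) in w.grad.
loss.backward()

print(loss.item(), w.grad.item())
```

Here the gradient is the sum of 2 * (x * w - y) * x over the three samples, which can be checked by hand against the value stored in w.grad.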

Why

• More Pythonic (imperative)
  • Flexible
  • Intuitive and cleaner code
  • Easy to debug

• More Neural Networkic
  • Write code as the network works
  • forward/backward
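A small sketch of what "write code as the network works" can look like in practice: the forward pass is an ordinary Python method, so it can be stepped through or printed from like any other code, and the backward pass is a single call. The class name TinyNet, the layer sizes and the random data are all hypothetical choices for illustration.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical two-layer network: 4 inputs -> 8 hidden units -> 2 classes."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        # Plain Python: a print() or a debugger breakpoint works here.
        h = torch.relu(self.fc1(x))
        return self.fc2(h)


torch.manual_seed(0)                 # reproducible random weights and inputs
model = TinyNet()
inputs = torch.randn(3, 4)           # batch of 3 samples with 4 features each
targets = torch.tensor([0, 1, 0])    # made-up class labels

logits = model(inputs)                                  # forward
loss = nn.functional.cross_entropy(logits, targets)
loss.backward()                                         # backward

print(logits.shape, model.fc1.weight.grad.shape)
```

After backward(), every parameter of the model holds its gradient in .grad, ready for an optimizer step.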

STEPS
