GPU Computing
NVIDIA: Computing for the Most Demanding Users
Computing Human Imagination
Computing Human Intelligence
3
DEEP LEARNING: A NEW COMPUTING MODEL
“Software that writes software”
“little girl is eating piece of cake”
LEARNING ALGORITHM
“millions of trillions of FLOPS”
4
“SUPERHUMAN” RESULTS SPARK HYPERSCALE ADOPTION
Chart: ImageNet accuracy (%) by year; series labeled Hand-coded CV, Deep Learning, and Human
2010: 72%
2011: 74%
2012: 84%
2013: 88%
2014: 93%
2015: 96%
Additional chart labels: 74%, 76%
Cloud Services with AI Powered by NVIDIA
Alibaba/Aliyun Amazon Baidu eBay Facebook
Flickr Google iFLYTEK iQIYI JD.com
Orange Periscope Pinterest Qihoo 360 Shazam
Skype Sogou Twitter Yahoo Supermarket Yandex Yelp
5
NVIDIA’S GPU EDUCATORS PROGRAM
The Flagship Offering: GPU Teaching Kits - Breaking the barriers of GPU education in academia:
• Lecture slides
• Lecture videos
• Hands-on labs/solutions
• Larger coding projects/solutions
• Quiz/exam questions/solutions
• Text and e-books
Different kits for different courses:
• Accelerated/Parallel Computing (available now!)
• Robotics (available now!)
• Machine/Deep Learning (coming soon!)
• Computer Vision, Computer Architecture, Computational Domain Sciences, Mathematics, etc. (future)
Get started today! developer.nvidia.com/educators
Advancing STEM Education with Accelerated Computing
6
TESLA ACCELERATED COMPUTING PLATFORM
Focused on Co-Design for Accelerated Data Center
Productive Programming Model & Tools
Expert Co-Design
Accessibility
Stack layers: APPLICATION, MIDDLEWARE, SYS SW, LARGE SYSTEMS, PROCESSOR
Fast GPU: Engineered for High Throughput
Fast GPU + Strong CPU
Chart: TFLOPS by year (2008 to 2016, y-axis 0.0 to 5.5), NVIDIA GPU versus x86 CPU; GPUs labeled M1060, M2090, K20, K80, P100
7
NVIDIA DEEP LEARNING SDK
High Performance GPU-Acceleration for Deep Learning
APPLICATIONS
COMPUTER VISION: Object Detection, Image Classification
SPEECH AND AUDIO: Voice Recognition, Translation
BEHAVIOR: Recommendation Engines, Sentiment Analysis
FRAMEWORKS
Mocha.jl
DEEP LEARNING SDK
DEEP LEARNING: cuDNN
MATH LIBRARIES: cuBLAS, cuSPARSE, cuFFT
MULTI-GPU: NCCL
8
“Horus can process and identify obstacles 48 times faster than would be possible with CPUs.”
-Saverio Murgia, Horus CEO and co-founder
9
NVIDIA DGX-1
AI Supercomputer-in-a-Box
170 TFLOPS | 8x Tesla P100 16GB | NVLink Hybrid Cube Mesh
2x Xeon | 8 TB RAID 0 | Quad IB 100Gbps, Dual 10GbE | 3U | 3200W
10
NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.
INTRODUCING TESLA P100
New GPU Architecture to Enable the World’s Fastest Compute Node
Pascal Architecture: Highest Compute Performance
NVLink: GPU Interconnect for Maximum Scalability
CoWoS HBM2: Unifying Compute & Memory in Single Package
Page Migration Engine: Simple Parallel Programming with Virtually Unlimited Memory Space
Diagram: Unified Memory shared between CPU and Tesla P100
11
TESLA P100 ACCELERATOR
Compute: 5.3 TF DP ∙ 10.6 TF SP ∙ 21.2 TF HP
Memory: HBM2, 720 GB/s ∙ 16 GB
Interconnect: NVLink (up to 8 way) + PCIe Gen3
Programmability: Page Migration Engine, Unified Memory
Availability: DGX-1, Order Now
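The P100 figures relate to each other, and to the 170 TFLOPS quoted for the DGX-1, by simple arithmetic. The sketch below uses only numbers from the slides. It assumes that the DGX-1 headline figure is the half-precision peak summed over its eight GPUs, and that 3200W is the power draw of the whole system; the slides state neither explicitly. The function names are illustrative.

```python
# Figures copied from the Tesla P100 and DGX-1 slides.
P100_TFLOPS = {"DP": 5.3, "SP": 10.6, "HP": 21.2}  # double, single, half precision
HBM2_BANDWIDTH_GB_S = 720.0
HBM2_CAPACITY_GB = 16.0
DGX1_GPU_COUNT = 8
DGX1_QUOTED_TFLOPS = 170.0
DGX1_POWER_W = 3200.0  # assumed to be whole-system power


def precision_ratios(tflops):
    """Return each precision's peak rate relative to double precision."""
    base = tflops["DP"]
    return {name: value / base for name, value in tflops.items()}


def aggregate_tflops(per_gpu_tflops, gpu_count):
    """Peak rate of gpu_count identical GPUs, ignoring any scaling losses."""
    return per_gpu_tflops * gpu_count


def memory_sweep_ms(capacity_gb, bandwidth_gb_s):
    """Milliseconds needed to read the whole memory once at peak bandwidth."""
    return capacity_gb / bandwidth_gb_s * 1000.0


def gflops_per_watt(tflops, watts):
    """Peak GFLOPS delivered per watt."""
    return tflops * 1000.0 / watts


if __name__ == "__main__":
    print("Ratios vs DP:", precision_ratios(P100_TFLOPS))
    total = aggregate_tflops(P100_TFLOPS["HP"], DGX1_GPU_COUNT)
    print(f"8 x P100 half precision: {total:.1f} TFLOPS "
          f"(slide quotes {DGX1_QUOTED_TFLOPS:.0f})")
    print(f"Full HBM2 sweep: {memory_sweep_ms(HBM2_CAPACITY_GB, HBM2_BANDWIDTH_GB_S):.1f} ms")
    print(f"DGX-1 peak: {gflops_per_watt(DGX1_QUOTED_TFLOPS, DGX1_POWER_W):.1f} GFLOPS/W")
```

These are peak figures only; sustained throughput on real training workloads depends on the model and software stack.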
13
NVLINK ENABLES LINEAR MULTI-GPU SCALING
Figure: speedup (1.0x to 8.0x) versus GPU count (1, 2, 4, 8 GPUs), DGX-1 versus P100 PCIE, in three panels: AlexnetOWT, Incep-v3, ResNet-50
Speedup callouts: 2.3x, 1.3x, 1.5x
Deepmark test with NVCaffe. AlexnetOWT use batch 128, Incep-v3/ResNet-50 use batch 32, weak scaling, P100 and DGX-1 are measured, FP32 training, software optimization in progress, CUDA8/cuDNN5.1, Ubuntu 14.04
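The scaling charts plot speedup against GPU count, where "linear" scaling means a speedup equal to the number of GPUs. A minimal sketch of how such a curve can be computed from measured training throughput follows. Under weak scaling the per-GPU batch size stays fixed, so total throughput should ideally grow in proportion to the GPU count. The throughput values are hypothetical placeholders, not measurements from the slides, and the function names are illustrative.

```python
def speedup(throughput_n, throughput_1):
    """Speedup of an N-GPU run relative to the single-GPU run."""
    if throughput_1 <= 0:
        raise ValueError("single-GPU throughput must be positive")
    return throughput_n / throughput_1


def scaling_efficiency(throughput_n, throughput_1, gpu_count):
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return speedup(throughput_n, throughput_1) / gpu_count


def scaling_table(measurements):
    """Return (gpu_count, speedup, efficiency) rows from {gpu_count: throughput}."""
    base = measurements[1]
    return [
        (n, speedup(t, base), scaling_efficiency(t, base, n))
        for n, t in sorted(measurements.items())
    ]


# Hypothetical images/second values, for illustration only.
HYPOTHETICAL_IMAGES_PER_SEC = {1: 100.0, 2: 195.0, 4: 380.0, 8: 720.0}

if __name__ == "__main__":
    for gpus, s, eff in scaling_table(HYPOTHETICAL_IMAGES_PER_SEC):
        print(f"{gpus} GPU: {s:.2f}x speedup, {eff:.0%} of linear")
```

Comparing two systems at the same GPU count, as the DGX-1 versus P100 PCIE callouts do, amounts to dividing one system's throughput by the other's.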
14
DGX-1 STACK
Fully integrated Deep Learning platform
• Instant productivity: plug-and-play, supporting every AI framework
• Performance optimized across the entire stack
• Always up-to-date via the cloud
• Mixed framework environments, virtualized and containerized
• Direct access to NVIDIA experts
15
NVIDIA DGX-1 SOFTWARE
Optimized for Deep Learning Performance
Research & Develop | Package & Test | Deploy & Manage
Digits | DL Frameworks | GPU Apps
Container Based Applications
NVIDIA Cloud Management
Accelerated Deep Learning: cuDNN, NCCL, cuSPARSE, cuBLAS, cuFFT
16
DGX-1 IN THE WORKFLOW
A complete GPU-accelerated deep learning workflow
MANAGE | TRAIN | DEPLOY
DIGITS
MANAGE / AUGMENT
TRAIN
TEST
MODEL ZOO
TENSOR RT (GIE)
DATA CENTER | AUTOMOTIVE | EMBEDDED
17
DGX-1: 6 STEPS TO DEEP LEARNING
1. Login
2. Monitoring Portal
3. Launch Container
4. Interacting with Jobs
5. Create Training
6. Training Run
23
NVIDIA EXPERTISE AT EVERY STEP
Solution Architects
• 1:1 support
• Network training setup
• Network optimization
Deep Learning Institute
• Certified expert instructors
• Worldwide workshops
• Online courses
GTC Conferences
• Epicenter of industry leaders
• Onsite training
• Global reach
Global Network of Partners
• NVIDIA Partner Network
• OEMs
• Startups
NVIDIA DEEP LEARNING PLATFORM
DEEP LEARNING SDK
NVIDIA SOFTWARE
GTX - DEVELOPMENT (Workstation)
Purpose: Deep learning test, development, benchmarks, small neural networks
TESLA - DEPLOYMENT (Mid-Range Server)
Purpose: Deep learning applications for medium data set analysis and medium neural networks
DGX1 - ENTERPRISE (High-End Server)
Purpose: NVIDIA DGX1 appliance with artificial intelligence software, large neural networks
27
Third-Party Development (VISION / SPEECH / BEHAVIOUR / FINANCE / IoT)
Object Detection, Image Classification, Language Translation, Recommendation Engines, Sentiment Analysis, Voice Recognition
DEEP LEARNING FRAMEWORKS
Mocha.jl
NVIDIA SOFTWARE
Workstation BOXX | Mid-Range Server | High-End Server
Workstation
• Purpose: Deep learning test, development, benchmarks, small neural networks
• Price 1 GPU: < 10K€
Mid-Range Server
• Purpose: Deep learning applications for medium data set analysis and medium neural networks
• Price 2 GPU: < 40K€
• Price 4 GPU: < 60K€
High-End Server
• Purpose: NVIDIA DGX1 appliance with artificial intelligence software, large neural networks
• Price 8 GPU: < 150K€
Until 31/01/17
Until 31/01/17, an EDU promotion is available on the following NVIDIA products:
• K80
• P100 (12GB)
• P100 (16GB)
• DGX1
The products will also be available on MEPA. For information, write to [email protected]