Tesla: Fastest Processor Adoption in HPC History
Jen-Hsun Huang
Co-founder, President and CEO of NVIDIA
June 30th 2009
1995: 5,000 triangles/second, 800,000-transistor GPU
2008: 350 million triangles/second, 1.4-billion-transistor GPU
GPU for Computing
Massively parallel, throughput architecture
Science is Desperate for Computing Throughput
[Chart: Gigaflops required (log scale from 1 to 1,000,000,000, with 1 Petaflop and 1 Exaflop marked) by year: 1982, 1997, 2003, 2006, 2010, 2012]
Molecular systems plotted, by size:
BPTI: 3K atoms
Estrogen Receptor: 36K atoms
F1-ATPase: 327K atoms
Ribosome: 2.7M atoms
Chromatophore: 50M atoms
Bacteria: 100s of Chromatophores
Annotation: Ran for 8 months to simulate 2 nanoseconds
Power Crisis in Supercomputing
[Chart: supercomputer power draw by performance level, with household power equivalent]
1982: Gigaflop, 60,000 Watts (Block)
1996: Teraflop, 850,000 Watts (Neighborhood)
2008: Petaflop, 7,000,000 Watts (Town)
2020: Exaflop, 25,000,000 Watts (City)
Systems labeled: Jaguar, Los Alamos
The GPU Computing Discontinuity
[Chart: Gflops (log scale), axis ticks 0 to 1200, from 9/22/2002 to 3/14/2008; series: NVIDIA GPU and Intel CPU]
NVIDIA GPU: Tesla 8-series, Tesla 10-series
Intel CPU: Pentium 4 3.2 GHz, Pentium 4 Dual-core 3.0 GHz, Core2 Dual-core 3.0 GHz, Xeon Quad-core 3 GHz
Annotations: Double Precision debut; 4 cores
Advent of GPU Computing
CPU + GPU Co-Processing
GPU Computing Applications
C, C++, Java, Fortran, OpenCL(tm), DirectX Compute
NVIDIA GPU
CUDA Parallel Computing Architecture
OpenCL is a trademark of Apple Inc. used under license to the Khronos Group Inc.
CUDA GPU Computing Architecture
CUDA Ecosystem
Applications: Oil & Gas, Finance, Medical, Biophysics, Numerics, Imaging, CFD, DSP, EDA
Libraries: FFT, BLAS, LAPACK, Image processing, Video processing, Signal processing, Vision
Consultants
OEMs
Languages: C, C++, DirectX, Fortran, Java, OpenCL, Python
Compilers: PGI Fortran, CAPS HMPP, MCUDA, MPI, NOAA Fortran2C, OpenMP
Debuggers: Allinea, TotalView
Universities: UIUC, MIT, Harvard, Berkeley, Cambridge, Oxford, …, IIT Delhi, Dortmund, ETH Zurich, Uni. Perpignan, Ecole Centrale, Paris 6 Jussieu, …
Over 200 Universities Teaching CUDA
Tesla GPU Computing Products
Tesla S1070 (1U System)
GPUs: 4 Tesla GPUs
Single Precision Performance: 4.14 Teraflops
Double Precision Performance: 346 Gigaflops
Memory: 16 GB (4 GB / GPU)
Tesla C1060 (Computing Board)
GPUs: 1 Tesla GPU
Single Precision Performance: 933 Gigaflops
Double Precision Performance: 78 Gigaflops
Memory: 4 GB
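As a quick sanity check on the product figures above, the following minimal Python sketch derives per-GPU numbers and the single-to-double precision ratio. The dictionary layout and the names `products`, `per_gpu`, and `summary` are illustrative assumptions, not something taken from the slides; only the numeric values come from the comparison.

```python
# Figures copied from the product comparison above.
# The structure and names here are illustrative assumptions.
products = {
    "Tesla S1070": {"gpus": 4, "sp_gflops": 4140.0, "dp_gflops": 346.0, "memory_gb": 16},
    "Tesla C1060": {"gpus": 1, "sp_gflops": 933.0, "dp_gflops": 78.0, "memory_gb": 4},
}

def per_gpu(spec):
    """Divide the system-level figures by the GPU count."""
    n = spec["gpus"]
    return {
        "sp_gflops": spec["sp_gflops"] / n,
        "dp_gflops": spec["dp_gflops"] / n,
        "memory_gb": spec["memory_gb"] / n,
        # Ratio of single- to double-precision peak throughput.
        "sp_to_dp": spec["sp_gflops"] / spec["dp_gflops"],
    }

summary = {name: per_gpu(spec) for name, spec in products.items()}

for name, row in summary.items():
    print(f"{name}: {row['sp_gflops']:.0f} SP Gflops/GPU, "
          f"{row['dp_gflops']:.1f} DP Gflops/GPU, "
          f"{row['memory_gb']:.0f} GB/GPU, "
          f"SP:DP about {row['sp_to_dp']:.0f}:1")
```

On these numbers the S1070 works out to roughly 1,035 single-precision Gflops per GPU, somewhat above the C1060's 933 Gflops. That may indicate the two products run their GPUs at different clocks, though the slides do not say so.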
[Chart: Performance (1x, 100x, 10,000x) versus cost (K$ to M$)]
Systems plotted: CPU Workstation, Traditional CPU Cluster, Tesla Personal Supercomputer, Tesla Co-processing Cluster
New Class of Co-Processing Supercomputers
SuperMicro 1U GPU Server: 2 Tesla M1060 GPUs
Bull Bullx Blade Enclosure: Up to 18 Tesla M1060 GPUs
Finance: Equity Pricing
2 Tesla S1070 vs. 500 CPU Servers
Performance: Equal (1 : 1)
Power: 2.8 kWatts vs. 37.5 kWatts (13x Lower Power)
Cost: $24 K vs. $250 K (10x Lower Cost)
Space: 16x Less Space
Oil & Gas: Seismic Processing
32 Tesla S1070 vs. 2,000 CPU Servers
Performance: Equal (1 : 1)
Power: 45 kWatts vs. 1,200 kWatts (27x Lower Power)
Cost: ~$400 K vs. ~$8 M (20x Lower Cost)
Space: 31x Less Space
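The power and cost multipliers quoted in the equity pricing and seismic processing comparisons can be reproduced from the raw figures. The sketch below is a minimal Python check; the names `cases`, `ratios`, and `results` are illustrative assumptions, and the space multipliers are left out because the slides give no rack or floor-space figures to compute them from.

```python
# Figures copied from the two comparisons above, as (Tesla, CPU servers) pairs.
# The dictionary keys and names are illustrative assumptions.
cases = {
    "Finance: Equity Pricing": {
        "power_kw": (2.8, 37.5),
        "cost_k_usd": (24, 250),
    },
    "Oil & Gas: Seismic Processing": {
        "power_kw": (45, 1200),
        "cost_k_usd": (400, 8000),  # the slide marks both costs as approximate
    },
}

def ratios(case):
    """Return the CPU-to-Tesla ratio for each metric."""
    return {metric: cpu / tesla for metric, (tesla, cpu) in case.items()}

results = {name: ratios(case) for name, case in cases.items()}

for name, r in results.items():
    print(f"{name}: {r['power_kw']:.1f}x lower power, "
          f"{r['cost_k_usd']:.1f}x lower cost")
```

The computed ratios (about 13.4x and 10.4x for finance, 26.7x and 20.0x for seismic) round to the multipliers shown on the slides.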
CEA-DAM CCRT: 192 TFlops GPU
TOTAL Seismic Processing: 256 TFlops GPU
Tesla: Helping solve the critical HPC challenges
GPU: T8 (128 core), T10 (240 core)
A 2015 GPU *
~20x the performance of today’s GPU
~5,000 cores at ~3GHz (50mW each)
~20 TFLOPS
~1.2TB/s of memory bandwidth
* This is a sketch of what a GPU in 2015 might look like; it does not reflect any actual product plans
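The figures in the 2015 sketch can be cross-checked against each other with a little arithmetic. The Python sketch below is only illustrative: the variable names are assumptions, and so is the reading of "today's GPU" as the 933 Gigaflops single-precision figure quoted for the Tesla C1060.

```python
# Values taken from the 2015 GPU sketch above; names are illustrative.
cores = 5000
milliwatts_per_core = 50
clock_hz = 3e9
peak_flops = 20e12            # ~20 TFLOPS
mem_bandwidth_bytes = 1.2e12  # ~1.2 TB/s

# Assumption: "today's GPU" is read as the 933 Gigaflops Tesla C1060 figure.
todays_gpu_flops = 933e9

core_power_watts = cores * milliwatts_per_core / 1000
flops_per_core_per_cycle = peak_flops / (cores * clock_hz)
bytes_per_flop = mem_bandwidth_bytes / peak_flops
speedup_vs_today = peak_flops / todays_gpu_flops

print(f"Total core power: {core_power_watts:.0f} W")
print(f"Flops per core per cycle: {flops_per_core_per_cycle:.2f}")
print(f"Memory bytes per flop: {bytes_per_flop:.2f}")
print(f"Speedup over a 933 Gflops GPU: {speedup_vs_today:.1f}x")
```

Under these assumptions the cores alone would draw about 250 W, each core would need to retire a little over one floating-point operation per cycle, and the 20 TFLOPS target is roughly 21x the 933 Gigaflops figure, in line with the "~20x" on the slide.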
GPU Revolutionizing Computing
GFlops
GPU Technology Conference
Sept 30 – Oct 2, 2009
San Jose, CA
www.nvidia.com/gtc
We bring Solutions to your Questions
Thank You!
NVIDIA GPU Computing Links
NVIDIA CUDA Zone
NVIDIA High Performance Computing Solutions
NVIDIA Tesla S1070 – Product Description
NVIDIA Tesla C1060 – Product Description
Tesla Personal Supercomputer
Tesla Personal Supercomputer – Where to Buy?
YouTube – Tesla videos
Jean-Christophe Baratault
EMEA GPU Computing Sales
Cell +33 6 8036 8483