Shanker Trivedi, Vice-President Worldwide PSG Sales
NVIDIA Corporation
What is NVIDIA's Hybrid Computing Value Proposition?
Do GPUs apply to my work?
Why is HPC important to NVIDIA?
How does NVIDIA help ISVs port their code?
World leader in programmable graphics processor technologies
One of the world’s largest semiconductor companies
6,000 Employees Worldwide
$1B Annual R&D Investment
Leading the Visual Computing Revolution
The Best Graphics Processors in the World
The Standard in Professional Graphics
Revolutionary GPU for High Performance Computing
The World’s Lowest Power HD Computer
GPUs for Small and Low Power PCs
NVIDIA Product Lines
Professional Solutions Group
Hybrid Computing Applications
CPU + GPU
CUDA: NVIDIA’s Architecture for Hybrid Computing
NVIDIA GPU with the CUDA Parallel Computing Architecture
CUDA C/C++
OpenCL
DirectCompute
Fortran
Python, Java, .NET, …
Over 60,000 developers
Running in Production since 2008
SDK + Libs + Visual Profiler and Debugger
1st GPU demo
Shipped 1st OpenCL Conformant Driver
Public Availability
SDK + Visual Profiler
Microsoft API for GPU Computing
Supports all CUDA-Architecture GPUs (DX10 and DX11)
PyCUDA
jCUDA
CUDA.NET
OpenCL.NET
PGI Accelerator
PGI CUDA Fortran
CAPS HMPP
HPC-Project Par4All
Quadro Visual Computing Platform
NVIDIA Quadro Plex
NVIDIA SLI
NVIDIA G-Sync
NVIDIA HD SDI
NVIDIA CUDA Architecture
C/CUDA
OpenCL
DX Compute
CgFX
SLI Mosaic
SLI Multi-OS
30-bit Color
Q Buffered Stereo
PhysX (Physics)
SceniX (Scene Management)
CompleX (Scene Scaling)
OptiX (Ray Tracing)
mental ray (Rendering)
RealityServer (3D Web Services)
Performance Drivers (AutoCAD, 3DS Max)
AXE (Application Acceleration Engines)
Display
App
Hybrid Computing
3D Vision Pro
Hardware Solutions
NVIDIA Tesla: Infinite Possibilities in High Performance Computing
Data Center Products: Passive heatsink
Single-user Workstation: Active heatsink
Wide Market Adoption
Do GPUs apply to my work?
Successfully used in supercomputing
Research quality of software
Remarkable results from students
Key Research Applications Available on GPUs
Computational Fluid Dynamics: OpenCurrent, BAE Systems, Acusim, Euler Solvers, Lattice Boltzmann, Navier Stokes
Molecular Dynamics / Quantum Chemistry: AMBER, ABINIT, DL_POLY, GROMACS, LAMMPS, NAMD, TeraChem
Astrophysics: N-body, Chimera, GADGET2, Many published papers
Weather & Climate Modeling: ASUCA (Japan), CO2 Modeling (Japan), HOMME, Tsunami modeling, NOAA NIM, WRF
Many More:
• Materials Science: DCA++, gWL-LSMS
• Combustion: S3D
• Lattice QCD: Chroma (QUDA)
Introducing Tesla Bio WorkBench
Applications (Molecular Dynamics & Quantum Chemistry, Bio-Informatics): MUMmerGPU, LAMMPS, CUDA-BLASTP, TeraChem, CUDA-EC, CUDASW++, Hex (Docking)
Community: Technical papers, Discussion Forums, Benchmarks & Configurations, Developer Tools, Website (SW, Docs)
Platforms: Tesla Personal Supercomputer, Tesla GPU Clusters
NVIDIA in Oil & Gas Workflow: Accelerating Time to Discovery
Seismic Interpretation
Quadro Value
• Large Scale Visualization
• Transparent Scalability
• Virtualization with Full Acceleration
• Secure Collaboration
Public References: Schlumberger, Halliburton, Paradigm, Seismic Micro Technologies, Global Exploration Companies
Quadro Additives
• SLI Mosaic Mode
• SLI MultiOS
• NVScale Multi-GPU
• 3D Immersive support
• QuadroPlex
Reservoir Simulation
Tesla/Quadro Value
• Reduce Cycle Time
• More Iterations
• Improved Scalability
• Streamlined Operations
• Better Oil/Gas Recovery
Public References: ConocoPhillips, Polyhedron, French Institute for Petroleum, Elegant Mathematics
Tesla/CUDA Additives
• Scalable Iterative Solvers
• Enhanced Pre-conditioning
• Double Precision Support
• Sparse Matrix Vector Multiply Support
Seismic Processing
Tesla/CUDA Value
• Improve Throughput
• Reduce Operating Costs
• Enhance Subsurface Image
• Optimize Acquisition Parameters
Public References: Hess, Chevron, TOTAL, CGGVeritas, Petrobras, Seismic City, Acceleware, OpenGeoSolutions
Tesla/CUDA Additives
• Kirchhoff Migration
• Wave Equation Migration
• Reverse Time Migration
• Spectral Decomposition
Well Planning, Drilling
Tesla/Quadro Value
• Improve Simulation
• Add Gravity Calculations
• Reduce Non-Productive Time, Increase Revenue
• Reduce Operating Costs
Public References: Ansys, Acceleware, Accelereyes (MATLAB), UCLA Institute of Geophysics, Rice/Brown Collaboration
Tesla/CUDA Additives
• Enhanced Computational Fluid Dynamics Simulation
• Dense Matrix Acceleration
• Multi-GPU Scalability
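The reservoir simulation additives above list sparse matrix-vector multiply support, which is the core operation inside iterative solvers. The slides do not show an implementation, so the following is only a minimal serial sketch in plain Python, assuming the matrix is stored in compressed sparse row (CSR) form. The function name `csr_spmv` and the 3x3 example matrix are hypothetical and chosen for illustration.

```python
def csr_spmv(values, col_indices, row_ptr, x):
    """Compute y = A * x for a sparse matrix A stored in CSR form.

    values      : nonzero entries, row by row
    col_indices : column index of each nonzero
    row_ptr     : row_ptr[i] is the offset of row i's first nonzero;
                  the final entry equals len(values)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # The nonzeros of this row are values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_indices[k]]
        y[row] = acc
    return y


# Hypothetical example matrix (not from the slides):
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_indices = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_spmv(values, col_indices, row_ptr, x)
print(y)  # [7.0, 6.0, 17.0]
```

Each row's dot product is independent of the others, which is why this operation maps naturally onto many parallel threads. A GPU version would typically assign rows to threads rather than loop over them serially.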
GPUs Enable Faster Deployment of New Seismic Algorithms
[Figure: relative compute required (log scale, 1 to 1,000,000) versus year (1995 to 2020) for seismic algorithms: CAZ WEM (VTI), KPrSTM (TTI), KPrSDM (TTI), Gaussian Beam, Shot WEM (TTI), Wave Equation, Reverse Time Migration, Elastic Wave Propagation. GPU and CPU curves are compared, with a schedule pull-in due to GPUs.]
The GPU helps the entire Oil & Gas Industry
Workflow: Seismic Acquisition, Seismic Processing, Seismic Imaging, Interpretation, Reservoir Characterization, Petrophysics, Well Planning, Drilling, Reservoir Engineering, Economics
NVIDIA Solutions: Visualization, Hybrid Computing, Data Center, Power Wall, Workstation
Why is HPC important to NVIDIA?
Drives innovation in the GPU
Incredible growth opportunity
Solving really important problems
Commodity CPUs Dominate HPC
2X every 18 months AND cheap!
[Figure: cluster revenue share by processor type (RISC, EPIC, x86), 0% to 100%, 2001 to 2007]
Source: IDC 2008
The Performance Gap Widens Further
[Figure: Peak Single Precision Performance (GFlops/sec), 2003 to 2010: Tesla 8-series, Tesla 10-series, Tesla 20-series, Westmere 3 GHz]
[Figure: Peak Memory Bandwidth (GB/sec), 2003 to 2010: Tesla 8-series, Tesla 10-series, Tesla 20-series, Westmere 3 GHz]
6x double precision
ECC
L1, L2 Caches
1 TF Single Precision
4GB Memory
NVIDIA GPU
X86 CPU
GPU Revolutionizing Computing
[Figure: GFlops by GPU generation: T8 (128 core), T10 (240 core), T20 (512 core)]
A 2015 GPU *
~20x the performance of today’s GPU
~5,000 cores at ~3GHz (50mW each)
~20 TFLOPS
~1.2TB/s of memory bandwidth
* This is a sketch of what a GPU in 2015 might look like; it does not reflect any actual product plans
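The figures in the 2015 GPU sketch can be cross-checked with simple arithmetic. The snippet below is only an illustrative calculation from the approximate numbers on the slide. The variable names are made up for this example, and the derived quantities (total core power, flops per core per cycle, bytes per flop) are not stated in the presentation.

```python
# Approximate figures quoted in the "A 2015 GPU" sketch.
cores = 5_000
clock_hz = 3e9             # ~3 GHz
milliwatts_per_core = 50   # 50 mW each
peak_flops = 20e12         # ~20 TFLOPS
bandwidth_bytes = 1.2e12   # ~1.2 TB/s
speedup_vs_today = 20      # ~20x today's GPU

# Derived quantities (assumptions of this example, not slide content).
core_power_watts = cores * milliwatts_per_core / 1000
flops_per_core_cycle = peak_flops / (cores * clock_hz)
bytes_per_flop = bandwidth_bytes / peak_flops
implied_today_tflops = peak_flops / speedup_vs_today / 1e12

print(f"Power drawn by the cores alone: {core_power_watts:.0f} W")
print(f"Flops per core per cycle: {flops_per_core_cycle:.2f}")
print(f"Memory bytes available per flop: {bytes_per_flop:.3f}")
print(f"Implied performance of today's GPU: {implied_today_tflops:.1f} TFLOPS")
```

The implied present-day figure of about 1 TFLOPS is consistent with the "1 TF Single Precision" callout elsewhere in the deck.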
Top 500 with Hybrid Systems: 4x Cheaper, 4x Less Space and 4x Less Power Consumption
Top 150: 130 GPUs, 41 TFlops, $600K, 2x 42U
Top 100: 170 GPUs, 53 TFlops, <$1M, 3x 42U
Top 50: 330 GPUs, 103 TFlops, <$2M, 6x 42U
#2: Dawning Nebulae
1.27 Petaflops Linpack
4,640 Tesla GPUs
2.55 MWatts
2x Performance / Watt
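The efficiency behind the Nebulae figures above can be worked out directly from the quoted Linpack result and power draw. This is a small illustrative calculation. The helper name `mflops_per_watt` is hypothetical, and the slide does not state the baseline for its "2x" comparison, so none is computed here.

```python
def mflops_per_watt(linpack_flops, power_watts):
    """Return sustained Linpack efficiency in MFLOPS per watt."""
    return linpack_flops / power_watts / 1e6


# Figures quoted on the slide for Dawning Nebulae.
nebulae_linpack_flops = 1.27e15  # 1.27 Petaflops Linpack
nebulae_power_watts = 2.55e6     # 2.55 MWatts

nebulae_efficiency = mflops_per_watt(nebulae_linpack_flops, nebulae_power_watts)
print(f"Nebulae: {nebulae_efficiency:.0f} MFLOPS/W")  # about 498 MFLOPS/W
```

The same function can be applied to any other system once its Linpack result and power consumption are known.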
[Figure: power (megawatts, 0 to 8) versus Linpack performance (teraflops, 0 to 2000) for Jaguar (x86 CPU), Roadrunner (Cell), JUGENE (BlueGene), Nebulae (Tesla GPU), and IPE, CAS (Tesla GPU)]
Scaling to 5 PetaFlop Cluster
[Figure: power (megawatts, 0 to 25) versus Linpack performance (teraflops, 0 to 6000) for Jaguar (x86 CPU), Roadrunner (Cell), JUGENE (BlueGene), and Nebulae (Tesla GPU), with projections labeled 20 MWatt (x86 CPU) and 10 MWatt (Tesla GPU)]
How does NVIDIA help ISVs port their code?
Multi-API strategy
Rich ecosystem
Professional development tools
Hybrid Computing Developer Ecosystem
Debuggers & Profilers: cuda-gdb, NV Visual Profiler, Nsight for MS VS, Allinea, TotalView
Numerical Packages: MATLAB, Scilab (end ‘10), Mathematica, NI LabView, pyCUDA
GPU Compilers: C, C++, Fortran, OpenCL, DirectCompute, Java, Python
Parallelizing Compilers: PGI Accelerator, CAPS HMPP, Par4All, mCUDA
Libraries: BLAS, FFT, LAPACK, NPP, Video, Imaging
CUDA Consultants & Training
Solution Providers
Cluster Management: nv-smi, Bright Cluster, Rocks Roll, Scyld, PBS Pro, Cluster Manager
Hybrid Computing Applications
CPU + GPU
Broad Adoption
CUDA: NVIDIA’s Architecture for Hybrid Computing
Over 180,000,000 installed CUDA-Architecture GPUs
Over 200k Toolkit downloads (v3.0)
Windows, Linux and MacOS platforms supported
Hybrid Computing spans HPC to Consumer
300+ Universities teaching Hybrid Computing on the CUDA Architecture
NVIDIA GPU with the CUDA Parallel Computing Architecture
CUDA C/C++ Leadership: 7 Releases in 3 Years
Timeline (2007 to 2010): July 07, Nov 07, April 08, Aug 08, July 09, Nov 09, Mar 10
CUDA Toolkit 1.0
• C Compiler
• C Extensions
• Single Precision
• BLAS
• FFT
• SDK: 40 examples
CUDA Toolkit 1.1
• Win XP 64
• Atomics support
• Multi-GPU support
CUDA Toolkit 2.0
• Double Precision
• Compiler Optimizations
• Vista 32/64
• Mac OSX
• 3D Textures
• HW Interpolation
CUDA Toolkit 2.3
• DP FFT
• 16-32 Conversion intrinsics
• Performance enhancements
CUDA Visual Profiler 2.2
cuda-gdb HW Debugger
Parallel Nsight Beta
CUDA Toolkit 3.0
• C++ inheritance
• Fermi arch support
• Tools updates
• Driver / RT interop
NVIDIA OpenCL Execution
Timeline (2009 to 2010): April, June, Aug, Sept, Nov, March
OpenCL Prerelease Driver, OpenCL SDK
OpenCL Conformant Driver, OpenCL SDK
OpenCL Visual Profiler
OpenCL 1.0 R190 UDA, Conformant release
• 2D Imaging
• Global atomics
• Compiler flags
• Compute Query
• Byte Addr. Stores
OpenCL SDK & CUDA Toolkit 2.3
OpenCL 1.0 R195 UDA
• Double Precision
• OpenGL Interop
• Performance Enhancements
OpenCL SDK & CUDA Toolkit 2.3
OpenCL 1.0 R195 UDA #2
• ICD
• Direct3D9 sharing
• Direct3D10 sharing
• Direct3D11 sharing
• Pragma unroll
• Local atomics
OpenCL SDK & CUDA Toolkit 2.3
Targeting Multiple Platforms with CUDA
CUDA C / C++
NVCC (NVIDIA CUDA Toolkit)
PTX
MCUDA (CUDA to Multi-core)
Ocelot (PTX to Multi-core)
Swan (CUDA to OpenCL)
Targets: NVIDIA GPUs, Multi-Core CPUs, AMD GPU
MCUDA: http://impact.crhc.illinois.edu/mcuda.php
Ocelot: http://code.google.com/p/gpuocelot/
Swan: http://www.multiscalelab.org/swan
Parallel Nsight (Visual Studio)
Visual Profiler (for Linux)
cuda-gdb (for Linux)
NVIDIA SDKs
Finance
Oil & Gas
Video/Image Processing
3D Volume Rendering
Particle Simulations
Fluid Simulations
Math Functions
Hundreds of code samples for CUDA C, DirectCompute, and OpenCL
NVIDIA is committed to Scilab CUDA!
September 21-23, 2010
San Jose, California
The most important event in the GPU Ecosystem!
www.nvidia.com/gtc