S8901 – Quadro for AI, VR and Simulation

NVIDIA · April 11, 2018

  • S8901 – Quadro for AI, VR and Simulation

  • Carl Flygare, PNY, Quadro Product Marketing Manager

    Allen Bourgoyne, NVIDIA, Senior Product Marketing Manager

  • “The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” Edsger Dijkstra

  • Intelligence Abounds in Nature

    A very small sampling

  • Technological Intelligence

    Homo sapiens’ essential differentiator

    Thalamocortical brain network

    3 million neurons, 476 million synapses

    Full human brain

    106 billion neurons, 1,000 trillion synapses

  • Artificial Intelligence: Where we Stand Today

    Google’s IQ is slightly below a six-year-old human’s

    Google 47.28 | 78.42% increase since 2014

    Baidu 32.92 | 40.08% increase since 2014

    Microsoft Bing 31.98

    Apple Siri 23.90

    Source: http://www.zdnet.com/article/google-ai-vs-siri-vs-bing-iq-tests-show-one-is-smartest-by-a-mile/

    AI IQs are significantly lower than the average score of 97 for an 18-year-old

    In 2014, two of the three researchers found that Google had an IQ of 26.5, compared to Baidu’s 23.5
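
The "increase since 2014" percentages can be cross-checked against the 2014 and current scores quoted on this slide. The following is a minimal sketch; the function name `percent_increase` and the `scores` dictionary are illustrative and not part of the original deck. The results agree with the slide's figures to within rounding.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the percentage increase going from old to new."""
    return (new - old) / old * 100.0


# (2014 score, current score) as quoted on the slide.
scores = {
    "Google": (26.5, 47.28),
    "Baidu": (23.5, 32.92),
}

# Percentage increase per search engine, rounded to two decimals.
increases = {
    name: round(percent_increase(old, new), 2)
    for name, (old, new) in scores.items()
}

for name, pct in increases.items():
    print(f"{name}: {pct}% increase since 2014")
```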

  • NVIDIA Quadro

    Every segment benefits from AI, VR and simulation

    Manufacturing, CAE, Media and Entertainment, Automotive, AEC, Energy (Oil and Gas), Scientific and Technical, Healthcare

  • NVIDIA Quadro | AI, VR and Simulation Open New Possibilities

    Segments: Entry, Basic, Mid Range, Upper Range, High End, Ultra High End

    Boards: P400 2 GB, P620 2 GB, P1000 4 GB, P2000 5 GB, P4000 8 GB, P5000 16 GB, P6000 24 GB, GP100 16 GB, GV100 32 GB

    ▪ Small and Simple CAD Models, Entry PLM

    ▪ Medium Size and Complexity CAD Models, PLM, Basic DCC, Medical Imaging

    ▪ P4000 8 GB: Professional VR, Complex CAD Models, CAE, Photorealistic Rendering, Complex DCC and VFX, Medical Imaging

    ▪ P5000 16 GB: Professional VR, Very Complex CAD Models, CAE, Photorealistic Rendering, Advanced DCC and VFX, 3D Medical Imaging

    ▪ P6000 24 GB: Collaborative VR, Extremely Complex CAD Models, CAE, Photorealistic Rendering, DCC and VFX, Seismic Exploration, 3D Medical Imaging

    ▪ GP100 16 GB: AI (Deep Learning) Development, Collaborative VR, CAE Simulations, Ultimate CAD Models, Photorealistic Rendering and GPGPU Compute

  • NVIDIA Quadro GP100

    NVIDIA Quadro GV100 | Reinventing the Workstation for AI

  • NVIDIA Quadro GP100

    NVIDIA Quadro GV100 x 2 | NVLink Scalable Workstation AI

  • NVIDIA Quadro GV100 and NVLink

    Scaling performance and memory*

    *Application support for NVLink required. Maximum of two GV100 boards can be connected with NVLink.

    High speed GPU and memory connection for GV100

    ▪ NVLink combines two GV100s for twice the compute power and 64 GB of memory

    ▪ Up to 200 GB/sec bidirectional bandwidth, 25% improvement

    ▪ Used in pairs, two dedicated NVLink connectors on GV100 boards

    ▪ Provides SLI functionality for GV100 boards
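
To put the NVLink figures above in context, the sketch below derives the previous-generation bandwidth implied by the "25% improvement" claim and estimates how long it would take to move one GV100's 32 GB of memory across the link. It rests on assumptions not stated in the deck: the 200 GB/sec bidirectional figure is split evenly between directions, and the PCI Express Gen 3 x16 comparison uses the theoretical rate of 8 GT/s per lane with 128b/130b encoding. All function and constant names are illustrative, and real transfers will be slower than these ideal numbers.

```python
NVLINK_BIDIR_GBPS = 200.0   # from the slide: GV100 NVLink, bidirectional
IMPROVEMENT = 0.25          # from the slide: "25% improvement"
GV100_MEMORY_GB = 32.0      # from the slide: memory per GV100 board


def implied_previous_bandwidth(current: float, improvement: float) -> float:
    """Bandwidth that, increased by `improvement`, yields `current`."""
    return current / (1.0 + improvement)


def transfer_seconds(size_gb: float, bandwidth_gbps: float) -> float:
    """Ideal time to move size_gb at bandwidth_gbps (no protocol overhead)."""
    return size_gb / bandwidth_gbps


# Assumption: the bidirectional figure is split evenly per direction.
nvlink_one_way = NVLINK_BIDIR_GBPS / 2

# Theoretical PCIe Gen 3 x16 rate per direction:
# 8 GT/s per lane * 16 lanes * 128/130 encoding efficiency / 8 bits per byte.
pcie3_x16_one_way = 8.0 * 16 * (128 / 130) / 8

print(f"Implied previous NVLink: "
      f"{implied_previous_bandwidth(NVLINK_BIDIR_GBPS, IMPROVEMENT):.0f} GB/sec")
print(f"32 GB over NVLink (one way): "
      f"{transfer_seconds(GV100_MEMORY_GB, nvlink_one_way):.2f} s")
print(f"32 GB over PCIe Gen 3 x16:   "
      f"{transfer_seconds(GV100_MEMORY_GB, pcie3_x16_one_way):.2f} s")
```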

  • NVIDIA Quadro GV100

    Technical specifications

    GPU Architecture: Volta
    CUDA and Tensor Cores: 2560 (FP64), 5120 (FP32), 640 (Tensor)
    Memory Capacity: 32 GB HBM2
    Peak Memory Bandwidth: 870 GB/sec
    FP64 (Double Precision): 7.4 TFLOPS | 42% improvement
    FP32 (Single Precision): 14.8 TFLOPS | 44% improvement
    FP16 (Half Precision): 118.5 TFLOPS (Matrix Multiply with FP16 or FP32 Accumulate)
    INT8 (Integer): 59.3 TOPS | 26% improvement
    System Interface: PCI Express Gen 3 x16
    NVLink: 200 GB/sec Bidirectional | 25% improvement
    Display Connectors: 4x DisplayPort 1.4 with HDCP 2.2
    4K Display Support: 4x 4096 x 2160 at 120 Hz with HDR
    5K Display Support: 4x 5120 x 2880 at 60 Hz with HDR
    8K Display Support: 2x 7680 x 4320 at 60 Hz with HDR
    VR Ready and Stereo: Yes, Stereo via 3-pin mini-DIN Connector Bracket

  • NVIDIA Quadro GV100

    Unmatched compute capabilities

    FP64: 7.4 TFLOPS
    FP32: 14.8 TFLOPS
    FP16: 118.5 TFLOPS
    INT8: 59.3 TOPS

  • NVIDIA Quadro GV100

    Features and benefits relative to GP100

    Feature | GP100 | GV100 | Benefit
    GPU Architecture | Pascal | Volta | Most powerful, efficient and AI optimized GPU
    CUDA Cores | 3584 | 5120 | Significantly greater compute and rendering performance
    FP64 Performance | 5.2 TFLOPS | 7.4 TFLOPS | 1.4x greater FP64 compute performance
    Memory Size | 16 GB HBM2 | 32 GB HBM2 | 2.0x memory capacity
    Memory Bus Width | 4096-bit | 4096-bit | Radically advanced memory bus implementation
    Peak Memory Bandwidth | 717 GB/sec | 870 GB/sec | Move data to and from GPU 1.2x faster
    Display Support | 4x DP 1.4 + 1x DVI-D DL | 4x DP 1.4 and HDCP 2.2 | Supports four 4K, 5K or 8K displays, latest HDCP
    HDR Image Support | Yes | Yes | More lifelike images
    Advanced Display | Quadro Sync II | Quadro Sync II | Synchronize up to 8 GPUs per system
    VR Ready | Yes | Yes | GV100 implements full suite of hardware optimizations
    NVLink | NVLink (First Generation) | NVLink (Second Generation) | Higher performance means lower latency
    Board Power | 235 W | 250 W | Better performance per Watt
    Auxiliary Power Connector | 8-pin PCIe | 8-pin PCIe | Simplified power supply connectivity
    Form Factor | 4.4” H x 10.5” L Dual Slot | 4.4” H x 10.5” L Dual Slot | No significant mechanical or thermal changes
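
The "Benefit" multipliers in this comparison can be reproduced from the raw figures. The sketch below is illustrative only: the dictionary keys and helper names are invented here, and the GV100 FP64 value uses the 7.4 TFLOPS figure from the technical specifications slide. It also checks the "Better performance per Watt" claim using FP64 throughput divided by board power, which is one possible reading of that claim rather than NVIDIA's stated method.

```python
# Figures taken from the comparison table; key names are illustrative.
gp100 = {"cuda_cores": 3584, "fp64_tflops": 5.2, "memory_gb": 16,
         "bandwidth_gbps": 717, "board_power_w": 235}
gv100 = {"cuda_cores": 5120, "fp64_tflops": 7.4, "memory_gb": 32,
         "bandwidth_gbps": 870, "board_power_w": 250}


def ratio(key: str) -> float:
    """GV100 value divided by GP100 value for one table row."""
    return gv100[key] / gp100[key]


def fp64_gflops_per_watt(card: dict) -> float:
    """Peak FP64 GFLOPS per watt of board power (an assumed metric)."""
    return card["fp64_tflops"] * 1000.0 / card["board_power_w"]


for key in ("cuda_cores", "fp64_tflops", "memory_gb", "bandwidth_gbps"):
    print(f"{key}: {ratio(key):.2f}x")

print(f"GP100: {fp64_gflops_per_watt(gp100):.1f} FP64 GFLOPS/W")
print(f"GV100: {fp64_gflops_per_watt(gv100):.1f} FP64 GFLOPS/W")
```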

  • NVIDIA Quadro GV100

    Redefines state of the art across essential solutions

    Artificial Intelligence

    ▪ Tensor processor cores

    ▪ NVIDIA GPU deep learning stack

    ▪ ISV DL and ML framework optimization

    ▪ Iterate and innovate faster

    ▪ Reduce training time

    RTX Rendering

    ▪ Unrivaled FP32 performance

    ▪ Largest models in GPU memory

    ▪ AI accelerated photorealistic rendering

    ▪ Neural network character animation

    ▪ Apply AI to simultaneous video streams

    Compute

    ▪ Industry leading HPC capabilities

    ▪ Work with largest datasets

    ▪ Integrate simulation into design process

    ▪ Utilize generative design algorithms

    ▪ Fastest FEA, CFD, CEM available

    Immersive Visualization (VR)

    ▪ Includes VR hardware optimizations

    ▪ Full NVIDIA VRWORKS support

    ▪ Create new AI-augmented technologies

    ▪ Visualize the largest datasets

    ▪ Collaborative VR environments (Holodeck)

    Connect two GV100 boards with NVLink to provide 64 GB of memory and twice the GPU processing power in standard workstation enclosures

  • NVIDIA Quadro GV100

    RTX rendering lets you dream and create at the speed of thought

    Architectural Design: Visualize cities or urban street scenes in every photorealistic detail

    Product Design: Design with physically based lights and materials in real time

    Media and Entertainment: Perfect every shot with GPU accelerated and AI enhanced rendering

    Work at full fidelity, utilizing massive datasets with 2x larger memory capacity

    Master rendering projects interactively with AI (Deep Neural Network) technology

  • NVIDIA Quadro

    RTX supercharges rendering with AI accelerated denoising

    Denoising On: 20 Frames

    Denoising Off: 20 Frames

    Denoising Off: 290 Frames

    High quality results with fluid visual interactivity throughout the design process

  • NVIDIA Quadro

    Companies working with NVIDIA’s OptiX AI denoiser technology

    Image courtesy of Isotropix, rendered with Clarisse and denoised with NVIDIA OptiX.

  • NVIDIA Quadro

    CAD and CAE workflow elements

    Design (CAD), Pre-Processing, Simulation (CAE), Post-Processing

  • NVIDIA Quadro GV100

    Benefit from the ultimate immersive experiences

    RTX Rendered Graphics, Interactive Physics, GPU-Accelerated AI, Realtime Collaboration

    2x larger memory capacity lets you work with high fidelity, massive datasets (vs. GP100)

    Benefit from unconstrained Holodeck experiences with full-featured VR performance and capabilities

  • NVIDIA Quadro GV100

    Realize new opportunities with AI

    32 GB or 64 GB capacity (NVLink) trains neural networks with massive datasets

    Develop with NVIDIA optimized Deep Learning frameworks and deploy with NGC interoperability and scalability

    Accelerate AI training and inferencing on workstations with Tensor cores and NVLink

    NGC

    Retail store inferencing with Quadro by DeepBlue Technology, China

    Development Aggregation Inferencing At-The-Edge

  • NVIDIA Quadro GV100 AI Training Performance

    Up to 2x improvement in Deep Learning training performance*

    [Chart: TensorFlow ResNet-50 Training IPS. Bars: GP100 Batch Size 256, GV100 Batch Size 256, GV100 Batch Size 512. Axis: 100 to 700.]

    [Chart: Caffe ResNet-50 Training IPS. Bars: GP100 Batch Size 128, GV100 Batch Size 128, GV100 Batch Size 256. Axis: 100 to 800.]

    *Based on TensorFlow ResNet-50 Training. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19, ResNet-50 Training.
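
The charts on this slide report training throughput as "IPS", assumed here to mean images per second. The deck does not describe how it was measured, so the following is only a generic sketch of a throughput harness: `measure_ips`, `speedup`, and the CPU-only `dummy_step` stand-in are hypothetical names, and a real benchmark would replace `dummy_step` with one forward and backward pass of the framework under test.

```python
import time
from typing import Callable


def measure_ips(train_step: Callable[[int], None], batch_size: int,
                steps: int, warmup: int = 2) -> float:
    """Return images processed per second over `steps` timed iterations."""
    # Warm-up iterations are excluded from timing (caches, lazy init).
    for _ in range(warmup):
        train_step(batch_size)
    start = time.perf_counter()
    for _ in range(steps):
        train_step(batch_size)
    elapsed = time.perf_counter() - start
    return batch_size * steps / elapsed


def speedup(baseline_ips: float, new_ips: float) -> float:
    """Relative improvement, e.g. 2.0 for an 'up to 2x' claim."""
    return new_ips / baseline_ips


def dummy_step(batch_size: int) -> None:
    # Hypothetical stand-in for one training iteration on a batch.
    sum(i * i for i in range(batch_size * 100))


ips = measure_ips(dummy_step, batch_size=256, steps=5)
print(f"Measured throughput: {ips:.0f} images/sec (dummy workload)")
```

Comparing two boards then reduces to calling `speedup` on the two measured values, provided batch size and software versions are held constant.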

  • NVIDIA Quadro GV100 Deep Learning Training Performance

    Over 2x improvement in Deep Learning training and inference performance*

    [Chart: TensorFlow ResNet-50 Training. Bars: GP100 Batch Size 256, GV100 Batch Size 256, GV100 Batch Size 512. Axis: 100 to 700.]

    [Chart: TensorRT ResNet-50 Inference. Batch Size: 1, 2, 4, 8. Axis: 100 to 700.]

    *Based on TensorFlow ResNet-50 Training, TensorRT ResNet-50 Inference tests. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19, ResNet-50 Training.

  • NVIDIA Quadro GV100 Scientific Compute Performance

    More than 2x improvement over the previous generation*

    [Chart: LAMMPS Atomic Fluid Benchmark. Bars: GP100, GV100. Axis: 100 to 700.]

    [Chart: CUDA Basic Linear Algebra Solver Benchmark, cuBLAS 2560 x 2048 x 8192. Groups: FP16, FP32, FP64. Axis: 0.5 to 2.0.]

    *Based on LAMMPS molecular modeling benchmark. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19, ResNet-50 Training.
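
For a sense of scale, the 2560 x 2048 x 8192 problem size quoted for the linear algebra benchmark can be turned into an operation count, assuming it describes a dense matrix multiply with dimensions M x N x K (the deck does not say which routine was run). The sketch below uses the standard 2·M·N·K count for a matrix multiply and the peak rates from the technical specifications slide. The resulting times are theoretical lower bounds, not measured results, and all names are illustrative.

```python
def gemm_flops(m: int, n: int, k: int) -> int:
    """Floating-point operations for C = A @ B, A is m x k, B is k x n.

    Each of the m*n outputs needs k multiplies and k adds.
    """
    return 2 * m * n * k


# Peak rates from the GV100 technical specifications slide (TFLOPS).
PEAK_TFLOPS = {"FP64": 7.4, "FP32": 14.8, "FP16 (Tensor)": 118.5}


def ideal_milliseconds(flops: int, peak_tflops: float) -> float:
    """Lower bound on run time if the GPU sustained its peak rate."""
    return flops / (peak_tflops * 1e12) * 1e3


flops = gemm_flops(2560, 2048, 8192)
ideal_ms = {name: ideal_milliseconds(flops, peak)
            for name, peak in PEAK_TFLOPS.items()}

print(f"Operations: {flops:,}")
for name, ms in ideal_ms.items():
    print(f"{name}: at least {ms:.2f} ms")
```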

  • NVIDIA Quadro GV100 CAE Example

    Significant ANSYS Mechanical 19 Acceleration*

    Power Supply Module (V19cg-1), chart axis 0 to 4

    Configuration | Licenses | Relative performance
    4 CPU Cores | Base License | 1.0
    3 CPU Cores + GV100 | Base License | 2.65
    8 CPU Cores | Base + 4 HPC Licenses | 1.71
    8 CPU Cores + GV100 | Base + 5 HPC Licenses | 3.90
    16 CPU Cores | Base + 12 HPC Licenses | 2.29

    *Power Supply Module (V19cg-1). 2x Xeon E5-2699 v4 at 2.2 GHz, 22 cores, HT off, NVIDIA driver 390.40 TCC, 256GB DRAM, CentOS 7.2.1511 64-bit. ANSYS Mechanical 19 benchmark model. Steady state thermal analysis of a power supply module, 5.3 Mdofs, JCG, real-value symmetric.
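
The ANSYS Mechanical results above can be compared pairwise to see what adding a GV100 buys at a similar license level. The sketch below assumes the chart values are performance relative to the 4-core baseline (1.0) and that the last bar refers to 16 CPU cores, which matches its Base + 12 HPC licenses. The `results` structure and `gain` helper are illustrative, not from the deck.

```python
# (configuration, extra HPC licenses beyond Base, relative performance)
results = {
    "4 CPU": (0, 1.0),
    "3 CPU + GV100": (0, 2.65),
    "8 CPU": (4, 1.71),
    "8 CPU + GV100": (5, 3.90),
    # Assumed to mean 16 CPU cores, matching Base + 12 HPC licenses.
    "16 CPU": (12, 2.29),
}


def gain(with_gpu: str, without_gpu: str) -> float:
    """Relative performance of one configuration over another."""
    return results[with_gpu][1] / results[without_gpu][1]


# Same Base license: swap one CPU core for a GV100.
print(f"Base license: {gain('3 CPU + GV100', '4 CPU'):.2f}x")

# 8 CPU cores: one extra HPC license to enable the GV100.
print(f"8 cores + GPU vs 8 cores: {gain('8 CPU + GV100', '8 CPU'):.2f}x")

# GPU configuration vs doubling CPU cores with 7 more HPC licenses.
print(f"8 cores + GPU vs 16 cores: {gain('8 CPU + GV100', '16 CPU'):.2f}x")
```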

  • NVIDIA Quadro GV100 CAE Example

    Standout ANSYS Fluent 19 Acceleration*

    Pipes Model, 9.6 Million Cells, chart axis 0 to 6

    Configuration | Licenses | Relative performance
    4 CPU Cores | Base License | 1.0
    3 CPU Cores + GV100 | Base License | 1.53
    8 CPU Cores | Base + 4 HPC Licenses | 1.78
    8 CPU Cores + GV100 | Base + 5 HPC Licenses | 2.67
    16 CPU Cores | Base + 12 HPC Licenses | 2.74
    16 CPU Cores + 2x GV100 | Base + 5 HPC Licenses | 4.71
    32 CPU Cores | Base + 2 HPC Packs | 3.29
    32 CPU Cores + 2x GV100 | Base + 2 HPC Packs | 5.55

    *Power Supply Module (V19cg-1). 2x Xeon E5-2699 v4 at 2.2 GHz, 22 cores, HT off, NVIDIA driver 390.40 TCC, 256GB DRAM, CentOS 7.2.1511 64-bit. ANSYS Mechanical 19 benchmark model. Steady state thermal analysis of a power supply module, 5.3 Mdofs, JCG, real-value symmetric.

  • NVIDIA Quadro GV100 Rendering Performance

    SOLIDWORKS Visualize scales to over 29x faster than CPU*

    [Chart: SOLIDWORKS Visualize performance. Bars: CPU, P2000, P4000, P5000, P6000, GP100, GV100, 2x GP100, 2x GV100. Axis: 0 to 30.]

    *Based on 2x GV100, Xeon E5-2697 v3, 14 cores at 2.6 GHz, 32 GB DRAM, Win 10 Pro 64-bit Fall Creator’s Update and NVIDIA driver version 390.77. Tests run at 4K UHD (3840 x 2160) resolution.

  • NVIDIA Quadro GV100 Graphics Performance

    Up to 1.3x better than previous generation*

    [Chart: Quadro GP100 vs. Quadro GV100. Viewsets: geomean, 3dsmax, catia, creo, energy, maya, medical, showcase, snx, sw. Axis: 0.2 to 1.4.]

    *Based on SPECviewperf 12.2.2 results.
