CONFIDENTIAL | UNDER NDA UNTIL JANUARY 5, 2017, 9AM EST
Vega: AMD’s Next-Generation GPU Architecture
We want rich, lavish virtual worlds.
We want to create with limitless detail, in real time.
We want to make decisions based on exabytes of data in an instant.
GPUs are taking on more diverse workloads
WORKSTATION: Physically Based Rendering, Physics Modeling, Loom (VR), Hi-Res HDR Content Creation
GAMING: 4K VR, Consoles, New Rendering Pipelines, New APIs, eSports
COMPUTE: Machine Learning, Image Recognition / Computer Vision, Natural Data Processing
Conventional architectures are not scaling to meet needs
Game install sizes are expanding exponentially
[Chart: Deus Ex Series Install Disk Size (source: Steam). Y-axis: Relative Data Size (gigabytes). X-axis: 2000 to 2016. Chart for illustrative purposes.]
Pro graphics data sets are well into the petabytes
[Chart: Relative Asset Size (petabytes) by film, 2001 to 2016. Films shown: The Lord of the Rings: The Fellowship of the Ring; Avatar; The Hobbit: An Unexpected Journey; The Hobbit: The Desolation of Smaug; The Hobbit: The Battle of the Five Armies; The BFG. See endnote for details.]
Compute workloads have shot into the exabytes
[Chart: Relative Training Set Size (exabytes), 1995 to 2017. Series: Character Recognition, Object Detection, Image Recognition, Image/Video Recognition. The final data point is labeled "Data Point Too Big to Illustrate". See endnote for details.]
Growth in processing power is outpacing growth in memory capacity
[Chart: Relative GPU Compute (GFLOPS) vs. Relative GPU Storage Capacity, 2002 to 2017. See endnote for details.]
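To put rough numbers on that gap, the endnote figures for the oldest and newest products in the series can be compared directly. The following is a minimal sketch in Python, not part of the original deck; the helper name `growth_factor` is illustrative, and the inputs are the Radeon 9700 Pro and Radeon R9 Fury X figures quoted in the endnotes.

```python
def growth_factor(old: float, new: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old

# Figures as quoted in the endnotes (TFLOPs and framebuffer size in MB).
R9700_PRO = {"tflops": 0.026, "framebuffer_mb": 128}
FURY_X = {"tflops": 8.6, "framebuffer_mb": 4 * 1024}

compute_growth = growth_factor(R9700_PRO["tflops"], FURY_X["tflops"])
memory_growth = growth_factor(R9700_PRO["framebuffer_mb"], FURY_X["framebuffer_mb"])

print(f"compute: {compute_growth:.0f}x, memory: {memory_growth:.0f}x")
```

On these two data points, compute throughput grew by roughly 330x while framebuffer capacity grew by 32x.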
Introducing the world’s most scalable GPU memory architecture
High-Bandwidth Cache
HBM2: 2X bandwidth per pin (vs. HBM)
HBM2: 8X capacity per stack (vs. HBM)
HBM2: over 50% smaller footprint (vs. GDDR5)
High-Bandwidth Cache Controller
[Diagram: High-Bandwidth Cache, HBCC, NVRAM, Network Storage, System DRAM]
High-Bandwidth Cache Controller: 512 TB virtual address space
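As a quick sanity check on what that figure implies, the number of address bits can be worked out directly. This is a small sketch that assumes "TB" is meant in the binary sense (2^40 bytes) and that the space is byte-addressed; the deck states neither, and the variable names are illustrative.

```python
TB = 2 ** 40  # assumption: binary terabyte, since the deck does not say

address_space_bytes = 512 * TB

# Bits needed to address every byte: the bit length of the highest address.
address_bits = (address_space_bytes - 1).bit_length()

print(address_bits)
```

Under those assumptions, 512 TB corresponds to a 49-bit virtual address.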
High-Bandwidth Cache Controller: adaptive, fine-grained data movement
[Charts: two panels, each labeled "(Ultra 4K)", plotting Graphics Memory against Time. Legend: Total Allocations, Accessed. See endnotes for details.]
Image from Deus Ex: Mankind Divided™ courtesy of Eidos Montreal
New Programmable Geometry Pipeline
[Diagram: High-Bandwidth Cache, HBCC, NVRAM, Network Storage, System DRAM, Geometry Pipeline]
New Programmable Geometry Pipeline: over 2X peak throughput per clock
See endnotes for details
Primitive Shaders
Improved Load Balancing
Next-Generation Compute Engine
[Diagram: High-Bandwidth Cache, HBCC, NVRAM, Network Storage, System DRAM, Geometry Pipeline, Compute Engine]
Introducing the Vega NCU (Next-Generation Compute Unit)
512 8-bit ops per clock
256 16-bit ops per clock
128 32-bit ops per clock
Double precision rate is configurable
*See endnotes for details
Rapid Packed Math: supercharges performance of emerging workloads
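The three per-clock rates quoted for the NCU fit a simple model in which a fixed number of 32-bit lanes each process one 32-bit, two 16-bit, or four 8-bit operands per clock. The sketch below rests on two assumptions the deck does not state: that the NCU keeps the 64 lanes the endnotes attribute to a GCN CU, and that a fused multiply-add counts as two operations. The constant and function names are illustrative.

```python
LANES_PER_CU = 64      # assumption: same as the 64 stream processors per GCN CU in the endnotes
OPS_PER_FMA = 2        # assumption: a fused multiply-add is counted as two operations
LANE_WIDTH_BITS = 32   # assumption: each lane is 32 bits wide

def ops_per_clock(operand_bits: int) -> int:
    """Peak ops per clock when operands of the given width are packed into 32-bit lanes."""
    packed_per_lane = LANE_WIDTH_BITS // operand_bits
    return LANES_PER_CU * OPS_PER_FMA * packed_per_lane

rates = {bits: ops_per_clock(bits) for bits in (32, 16, 8)}
print(rates)
```

Under these assumptions the model reproduces the quoted 128, 256, and 512 ops per clock, which is the doubling per halving of operand width that packing implies.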
NCU is optimized for higher clock speeds and higher IPC
[Diagram: CU* compared with NCU. *See endnotes for details.]
We’ve been working on reducing memory bandwidth consumption for many years
Texture & Color Compression, Fast Z Clear, HiZ
Next-Generation Pixel Engine
[Diagram: High-Bandwidth Cache, HBCC, NVRAM, Network Storage, System DRAM, Geometry Pipeline, Compute Engine, Pixel Engine]
Draw Stream Binning Rasterizer: designed to improve performance and save power
Fetch once: enabled by smart primitive rasterization with an on-chip bin cache
Shade once: enabled by culling pixels that are not visible in the final scene
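The deck does not describe how the binning works internally. As a rough conceptual illustration of screen-space binning in general, and not of AMD's hardware design, the sketch below assigns triangles to fixed-size screen tiles by bounding box so that each tile can be processed with its own primitive list. The tile size, data layout, and the name `bin_triangles` are all hypothetical.

```python
from collections import defaultdict

TILE = 32  # hypothetical tile size in pixels

def bin_triangles(triangles, width, height, tile=TILE):
    """Map (tile_x, tile_y) to the indices of triangles whose bounding box overlaps that tile."""
    bins = defaultdict(list)
    max_tx = (width - 1) // tile
    max_ty = (height - 1) // tile
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Clamp the bounding box to the screen, then convert to tile coordinates.
        tx0 = max(0, int(min(xs)) // tile)
        tx1 = min(max_tx, int(max(xs)) // tile)
        ty0 = max(0, int(min(ys)) // tile)
        ty1 = min(max_ty, int(max(ys)) // tile)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)

triangles = [
    [(2, 2), (20, 4), (10, 25)],     # lies entirely inside tile (0, 0)
    [(30, 30), (40, 30), (35, 40)],  # bounding box straddles four tiles
]
bins = bin_triangles(triangles, width=64, height=64)
print(bins)
```

Bounding-box binning is conservative: a triangle may be listed in a tile it does not actually cover, which costs a little extra work but never drops geometry.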
Legacy Architecture: Non-coherent Pixel and Texture Memory Access
[Diagram: Compute Engine, Pixel Engine, and Geometry Engine; L1 caches; L2 cache; Memory Controller; eight GDDR5 devices]
Render back-ends are now clients of the L2 cache.
[Diagram: High-Bandwidth Cache, HBCC, NVRAM, Network Storage, System DRAM, Geometry Pipeline, Compute Engine, Pixel Engine, L1 caches, L2 cache]
Helps improve performance with applications that use deferred shading.
[Diagram: Compute Engine, Pixel Engine, and Geometry Pipeline; L1 caches; L2 cache; High-Bandwidth Cache Controller; High-Bandwidth Cache; NVRAM; Network Storage; System DRAM; CPU; MM; Display; XDMA; PCIe®]
Vega: GPU Architecture for the Immersive and Instinctive Computing Era
New Programmable Geometry Pipeline
Revolutionary High-Bandwidth Cache
Advanced Pixel Engine
Next-Gen Compute Unit
ENDNOTES
Pro graphics data set slide: Data provided by a third-party studio and not verified by AMD. Data is historic: cinema asset data sizes for Lord of the Rings (2001) at 0.15 PB, Avatar (2009) at 1 PB, The Hobbit, Part 1 (2012) at 1.4 PB, The Hobbit, Part 2 (2013) at 1.8 PB, The Hobbit, Part 3 (2014) at 2.3 PB, and The BFG (2016) at 3 PB. VG-6
Compute workload data set slide: Typical word character recognition data set defined as 18.3 MB (http://wordnet.princeton.edu/wordnet/download/old-versions/). Object identification data sets defined as 490 MB (http://host.robots.ox.ac.uk/pascal/VOC/databases.html#VOC2005_1). Image recognition data sets defined as 144 GB (http://www.image-net.org/challenges/LSVRC/2010/download-all-nonpub). Image and video recognition data sets defined as 144 GB (http://image-net.org/challenges/LSVRC/2015/). Natural data analysis data sets defined as 2.5 quintillion bytes, about 2.5 EB (http://www.vcloudnews.com/every-day-big-data-statistics-2-5-quintillion-bytes-of-data-created-daily/). VG-7
Growth in processing power slide: Data based on historic product specs; GPU relative frame buffer size vs relative TFLOP capability. The ATI Radeon 9700 Pro was 0.026 TFLOPs with 128 MB framebuffer. The ATI Radeon X950 XT was 0.08 TFLOPs with 256 MB framebuffer. The ATI Radeon X1900 XT was 0.375 TFLOPs with 512 MB framebuffer. The ATI Radeon HD 2900 XT was 0.4755 TFLOPs with 512 MB framebuffer. The ATI Radeon HD 4870 XT was 1.2 TFLOPs with 512 MB framebuffer. The ATI Radeon HD 5870 was 1.2 TFLOPs with 512 MB framebuffer. The AMD Radeon HD 7970 was 3.79 TFLOPs with 3 GB framebuffer. The AMD Radeon R9 290X was 5.63 TFLOPs with 4 GB framebuffer. The AMD Radeon R9 Fury X was 8.6 TFLOPs with 4 GB framebuffer. VG-5
Witcher 3 and Fallout 4 data slide: Data based on AMD Internal testing of an early Vega sample using an AMD Summit Ridge pre-release CPU with 8GB DDR4 RAM, Vega GPU, Windows 10 64 bit, AMD test driver as of Dec 5, 2016. Results may vary for final product, and performance may vary based on use of latest available drivers. VG-4
Geometry throughput slide: Data based on AMD Engineering design of Vega. Radeon R9 Fury X has 4 geometry engines and a peak of 4 polygons per clock. Vega is designed to handle up to 11 polygons per clock with 4 geometry engines. This represents an increase of 2.6x. VG-3
CU vs Vega NCU slide: Discrete AMD Radeon™ and FirePro™ GPUs based on the Graphics Core Next architecture consist of multiple discrete execution engines known as a Compute Unit (“CU”). Each CU contains 64 shaders (“Stream Processors”) working together. GD-78
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
ATTRIBUTION
© 2016 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, CrossFire, FreeSync, Radeon and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. DirectX is a registered trademark of Microsoft Corporation in the US and other jurisdictions. 3DMark is a registered trademark of the Futuremark corporation. PCIe is a registered trademark of PCI-SIG Corporation. Vulkan and the Vulkan logo are trademarks of Khronos Group Inc. Other names are for informational purposes only and may be trademarks of their respective owners. DOOM® images and logos © 2016 Bethesda Softworks LLC, a ZeniMax Media company. DOOM and related logos are registered trademarks or trademarks of id Software LLC in the U.S. and/or other countries. All Rights Reserved.
Deus Ex: Mankind Divided™ images and logos © 2016 Square Enix Ltd. All Rights Reserved. Deus Ex: Mankind Divided, Square Enix and Eidos are trademarks of the Square Enix Group.
DISCLAIMERS & ATTRIBUTIONS