GPU Architecture
Chris Vuong, Long Pham
Agenda
1. What is a GPU?
a. Dedicated vs Integrated GPUs
b. GPU structure vs CPU
2. How does a GPU work?
3. History & Evolution of GPUs
a. Background
b. 1980’s
c. 1990’s
d. 2000’s
e. 2010’s and beyond
4. OpenGL vs DirectX
5. Recent and Future Trends
1. What is a GPU?
- A graphics processing unit (GPU) is a processor specialized for graphics work.
- Accelerates the creation of images for output to a display.
- Used in embedded systems, mobile phones, desktops, workstations, and game consoles.
a. Dedicated Card vs Integrated Card
A dedicated card:
- Interfaces with the motherboard through an expansion slot such as PCIe or AGP
- Is easily replaceable or upgradeable
- Has its own RAM
- Produces much more heat than integrated graphics processors (IGPs)
Multiprocessor Structure:
- N multiprocessors with M cores each
- SIMD (Single Instruction, Multiple Data): cores share an instruction unit with the other cores in the same multiprocessor
- Shared memory, constant cache, and texture cache
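The shared instruction unit can be pictured with a small model. The Python below is only a sketch, not real GPU code: `Multiprocessor`, `execute`, and the values chosen for N and M are hypothetical names and sizes picked for illustration. One instruction is issued once per multiprocessor and every core applies it to its own data element.

```python
from typing import Callable, List


class Multiprocessor:
    """Toy model (hypothetical): M cores that share one instruction unit."""

    def __init__(self, num_cores: int):
        self.num_cores = num_cores

    def execute(self, instruction: Callable[[float], float],
                data: List[float]) -> List[float]:
        # The instruction is issued once; each core applies it to its own
        # element: single instruction, multiple data.
        assert len(data) == self.num_cores
        return [instruction(x) for x in data]


N, M = 2, 4  # assumed sizes: N multiprocessors with M cores each
gpu = [Multiprocessor(M) for _ in range(N)]
inputs = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]

# Every multiprocessor runs the same "multiply by 2" instruction on its lanes.
outputs = [mp.execute(lambda x: x * 2.0, chunk)
           for mp, chunk in zip(gpu, inputs)]
print(outputs)  # [[2.0, 4.0, 6.0, 8.0], [10.0, 12.0, 14.0, 16.0]]
```

In real hardware the cores run in lockstep in parallel; the list comprehension here only mimics the result, not the timing.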
How is a pixel drawn on the screen?
Example: 1 million triangles * 100 pixels per triangle * 10 lights * 4 cycles per light computation = 4 billion cycles
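The arithmetic in this example can be checked directly. The scene numbers below come from the example itself; the 1 GHz clock and the 60 frames-per-second target are assumptions added purely to give the total a sense of scale.

```python
triangles = 1_000_000
pixels_per_triangle = 100
lights = 10
cycles_per_light = 4

total_cycles = triangles * pixels_per_triangle * lights * cycles_per_light
print(f"{total_cycles:,} cycles")  # 4,000,000,000 cycles

# Assumptions for illustration only (not from the slides):
clock_hz = 1_000_000_000   # a single 1 GHz core
target_fps = 60            # a typical real-time frame rate

seconds_per_frame = total_cycles / clock_hz
cores_needed = total_cycles * target_fps // clock_hz
print(seconds_per_frame)   # 4.0 seconds per frame on one such core
print(cores_needed)        # 240 such cores working in parallel for 60 fps
```

Under these assumed figures a single sequential core is far too slow, which is the motivation for spreading the per-pixel work across many cores.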
3. History & Evolution of GPUs
a) Background Information
b) 1980’s
c) 1990’s
d) 2000’s
e) 2010’s and beyond
f) Trends
a) Background Information
- Graphics pipeline: the sequence of stages through which graphics data passes
+ Usually split between software on the CPU and the GPU cores
+ Transforms 3D coordinates into 2D pixel space
+ Stages in between: geometry, rendering
- Adopted by major GPU manufacturers such as NVIDIA and ATI
- In the original GPUs, the GPU performed only the rendering stage of the pipeline
- Later GPUs took over more of the pipeline's tasks
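The step from 3D coordinates to 2D pixel space can be sketched with a simple perspective projection. This is a minimal illustration under stated assumptions, not the pipeline of any particular GPU: the function name `project`, the focal length, and the 640x480 screen size are all hypothetical, and points are assumed to lie in front of the camera (z > 0).

```python
def project(point, focal_length=1.0, width=640, height=480):
    """Map a 3D camera-space point to 2D pixel coordinates (toy model)."""
    x, y, z = point
    # Perspective divide: points farther away (larger z) move toward the centre.
    ndc_x = focal_length * x / z
    ndc_y = focal_length * y / z
    # Map normalised coordinates in [-1, 1] to pixel coordinates.
    # Screen y grows downward, so the y axis is flipped.
    px = int((ndc_x + 1.0) * 0.5 * (width - 1))
    py = int((1.0 - ndc_y) * 0.5 * (height - 1))
    return px, py


# Geometry stage: project the three vertices of one triangle.
triangle = [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 2.0)]
screen_triangle = [project(v) for v in triangle]
print(screen_triangle)  # [(319, 239), (639, 0), (479, 119)]
```

The rendering stage would then fill in the pixels covered by the projected triangle; that part is omitted here.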
Early GPU Pipeline
b) 1980’s
- GPUs were “integrated frame buffers”
- IBM Professional Graphics Controller (PGC, also known as PGA)
+ One of the first 2D/3D video cards for the PC
+ Despite failing in the mass market, became pivotal in GPU evolution
- More features were added to early GPUs by 1987
- Emergence of Silicon Graphics Inc. (SGI)
+ Created the OpenGL API (first released in 1992)
c) 1990’s
- Generation 0:
+ SGI’s RealityEngine
+ Combination of cheap hardware and games drove GPU adoption
+ Performance improvements
- Generation I:
+ 3dfx Voodoo (1996)
c) 1990’s (continued)
- Generation II: Breakthroughs in the field
+ Cards could now perform the entire pipeline on the GPU
+ Used the Accelerated Graphics Port (AGP) in place of PCI
+ New graphics features
+ Propelled the computer gaming and GPU hardware markets
+ Still had room for performance improvements (the pipeline was fixed-function)
3dfx Voodoo (1996)
- 1 million transistors
- 4 MB of 64-bit DRAM
- Core clock 50 MHz
NVIDIA’s GeForce 256 (1999)
- 23 million transistors
- 32 MB of 128-bit DRAM
- Core clock 120 MHz
d) 2000’s
- Generation III:
+ GeForce 3, Radeon 8500: first GPUs with a programmable pipeline
+ Still limited in programmability
- Generation IV:
+ GeForce FX, Radeon 9700 (2002): fully programmable
- Generation V:
+ GeForce 6, Radeon X800
Improved GPU Pipeline
d) 2000’s (continued)
- Generation VI:
+ GeForce 8 series (notably the GeForce 8800): unified shaders
+ SM (Streaming Multiprocessor): handles vertex, pixel, and geometry computation
- Generation VII:
+ Fermi architecture: more programmable
+ GPGPU (General-Purpose GPU)
Parallelism in CPUs vs GPUs
CPUs
- Task parallelism
- Multiple tasks map to multiple threads
- Tasks run different instructions
- 10s of relatively heavyweight threads run on 10s of cores
- Each thread is managed and scheduled explicitly
- Each thread has to be programmed individually
GPUs
- Data parallelism
- SIMD model
- Same instruction on different data
- 10,000s of lightweight threads on 100s of cores
- Threads are managed and scheduled by hardware
- Programming is done for batches of threads (i.e., one pixel shader per group of pixels, or per draw call)
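The contrast between the two models can be sketched in a few lines. This is an illustration only: the task names (`load_level`, `mix_audio`) and the `pixel_shader` function are hypothetical, and Python's `map` runs sequentially, whereas GPU hardware would schedule one lightweight thread per pixel.

```python
from concurrent.futures import ThreadPoolExecutor


# Task parallelism (CPU style): a few threads, each running different code,
# each submitted and collected explicitly by the programmer.
def load_level():
    return "level loaded"


def mix_audio():
    return "audio mixed"


with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(load_level), pool.submit(mix_audio)]
    task_results = [f.result() for f in futures]


# Data parallelism (GPU style): one small program, written once,
# applied to every element of a batch.
def pixel_shader(pixel):
    r, g, b = pixel
    return (r // 2, g // 2, b // 2)  # darken each colour channel by half


pixels = [(200, 100, 50), (10, 20, 30), (255, 255, 255)]
shaded = list(map(pixel_shader, pixels))

print(task_results)  # ['level loaded', 'audio mixed']
print(shaded)        # [(100, 50, 25), (5, 10, 15), (127, 127, 127)]
```

The shader is written for a single pixel and never mentions threads, which is what lets the hardware decide how to batch and schedule the work.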
Why Unify?
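One common argument for unified shaders is load balancing: with separate vertex and pixel units, whichever stage has less work sits partly idle, while unified cores can take either kind of work. The sketch below models this with made-up numbers; the function names, the workloads, and the unit counts are all assumptions for illustration.

```python
import math


def fixed_time(vertex_work, pixel_work, vertex_units, pixel_units):
    # Dedicated units: each stage can use only its own hardware,
    # so a frame takes as long as the slower stage.
    return max(math.ceil(vertex_work / vertex_units),
               math.ceil(pixel_work / pixel_units))


def unified_time(vertex_work, pixel_work, total_units):
    # Unified units: any core can take vertex or pixel work.
    return math.ceil((vertex_work + pixel_work) / total_units)


# Assumed hardware: 8 units in total, split 4 + 4 in the fixed design.
for name, vertex_work, pixel_work in [("geometry-heavy", 700, 100),
                                      ("pixel-heavy", 100, 700)]:
    fixed = fixed_time(vertex_work, pixel_work, 4, 4)
    unified = unified_time(vertex_work, pixel_work, 8)
    print(name, fixed, unified)
# geometry-heavy 175 100
# pixel-heavy 175 100
```

In both assumed scenes the fixed split is limited by its overloaded stage, while the unified pool finishes in the same time regardless of how the work is divided.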
e) 2010’s and beyond
- GPUs now consist of highly parallel, programmable cores
+ Essentially multi-core, general-purpose CPUs
- Products that exemplify this:
+ NVIDIA’s Fermi-based GTX 580
+ AMD’s Fusion (CPU + GPU = APU)
+ Intel’s Larrabee, and Sandy Bridge CPUs with an integrated GPU
4. OpenGL vs DirectX
- Both APIs rely on the traditional graphics pipeline.
- OpenGL is purely a graphics API; DirectX is a broader suite with tools for sound, music, input, networking, and multimedia.
- DirectX is exclusive to the Windows platform, whereas OpenGL is cross-platform.
- OpenGL is often claimed to be faster because of a smoother, more efficient pipeline, though actual performance depends on drivers and workload.
5. Recent and Future Trends
- Moore’s Law applies to GPU transistor counts as well
- Growth in the number of transistors has slowed recently due to manufacturing constraints
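As a rough check on growth rates, the transistor counts quoted earlier for the 3dfx Voodoo (1 million, 1996) and the GeForce 256 (23 million, 1999) can be turned into an implied doubling time. The 24-month figure used for comparison is the commonly quoted form of Moore's Law and is an assumption here, and comparing chips from two different vendors is only indicative.

```python
import math

# Figures taken from the spec comparison earlier in the slides.
voodoo_transistors, voodoo_year = 1_000_000, 1996
geforce_transistors, geforce_year = 23_000_000, 1999

years = geforce_year - voodoo_year
growth = geforce_transistors / voodoo_transistors

# If the count doubles every T years, growth = 2 ** (years / T),
# so T = years * ln(2) / ln(growth).
doubling_months = 12 * years * math.log(2) / math.log(growth)

moores_law_months = 24  # assumed reference value for comparison
print(round(doubling_months, 1))            # 8.0
print(doubling_months < moores_law_months)  # True
```

Under these two data points, transistor counts in that period doubled roughly every eight months, faster than the assumed 24-month reference.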
5. Recent and Future Trends (continued)
- Unified shader architecture (centered on flexible processor cores).
- Massively parallel stream processing.
- Greater programmability.
References
Sources:
http://mcclanahoochie.com/blog/wp-content/uploads/2011/03/gpu-hist-paper.pdf
http://www.cs.virginia.edu/~gfx/papers/pdfs/59_HowThingsWork.pdf
http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf
http://cs.nyu.edu/courses/fall15/CSCI-GA.3033-004/
http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf
Images:
http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf
http://www.hardwarezone.com.sg/feature-nvidia-geforce-8800-gtx-gts-g80-worlds-first-dx10-gpu/embracing-unified-shader-architecture
https://www.cs.utah.edu/~jeffp/teaching/MCMD/S20-GPU.pdf
https://www.directron.com/blog/what-is-pcie/