Attacking the Power-Wall by Using Near-threshold Cores
Liang Wang, liang@cs.virginia.edu
Liang Wang, ECE6332 Final 2
Power Wall
• The end of classical scaling.
  – Vdd: almost constant
  – Power density: increases roughly exponentially
  – Utilization: decreases roughly exponentially
• We can fabricate more cores than we can power up
* From Venkatesh et al., ASPLOS’10
Dark Silicon
Near-threshold Cores (NVt. Cores)
• Pros
  – Low power per core.
  – More cores per chip.
• Limitations
  – Low per-core frequency, reducing throughput gains from parallelization.
  – Variations, harmful for performance and functionality.
Will NVt. cores be a viable solution to push down the power wall?
Outline
• Performance Model
• Analyses and Results
• Conclusion
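System Modeling

Symmetric Multi-core System
• Chip budget: area $A$, power $P$
• A single core at supply voltage $v$: area $a$, power $p(v)$, frequency $f(v)$
• Number of active cores:
$$n = \min\left(\frac{P}{p(v)}, \frac{A}{a}\right)$$

Amdahl's Law, for an application with parallel ratio $\varepsilon$:
$$S_{serial} = f(V_{max}), \qquad S_{parallel} = f(V)$$
$$\text{speedup} = \frac{1}{\dfrac{1-\varepsilon}{S_{serial}} + \dfrac{\varepsilon}{n \cdot S_{parallel}}}$$

Single-core model
• Dynamic power: $\propto freq \cdot v^2$
• Static power: $\propto v \cdot 10^{\ldots}$
• Frequency: fitted to circuit sim.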
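As a rough illustration of the model on this slide, the sketch below computes the number of active cores and the Amdahl's Law speedup. It assumes the speedup takes the form 1 / ((1 - eps) / S_serial + eps / (n * S_parallel)), where eps is the parallel ratio. The function names and the numbers in the usage lines are hypothetical and are not taken from the slides.

```python
def active_cores(P, A, p_v, a):
    """Number of active cores: n = min(P / p(v), A / a), rounded down.

    P, A are the chip power and area budgets; p_v, a are the per-core
    power (at the chosen supply voltage) and per-core area.
    """
    return int(min(P / p_v, A / a))


def speedup(eps, n, s_serial, s_parallel):
    """Amdahl's Law with separate per-core speeds for each phase.

    eps        -- parallel ratio of the application (0..1)
    s_serial   -- single-core speed in the serial phase, e.g. f(V_max)
    s_parallel -- per-core speed in the parallel phase, e.g. f(V)
    """
    return 1.0 / ((1.0 - eps) / s_serial + eps / (n * s_parallel))


# Hypothetical numbers, for illustration only.
n = active_cores(P=100.0, A=400.0, p_v=0.5, a=4.0)  # area-limited here
s = speedup(eps=0.99, n=n, s_serial=1.0, s_parallel=0.25)
```

In this example the power budget would allow 200 cores, but the area budget only fits 100, so the area term of the `min` decides `n`.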
Simulation Setup
• Circuit
  – A single inverter
  – Ripple-carry adder (32 bits, 16 bits, 8 bits, and 4 bits)
• Technology Library
  – A modified version of the Predictive Technology Model (PTM)
• Technology Nodes
  – 45nm, 32nm, 22nm, 16nm
• Process Variants
  – HKMGS: high-performance process with high-k metal gate and stress effect
  – LP: low-power process
• CAD Tools
  – RC Compiler
  – Spectre driven by Ocean
Voltage-Frequency Scaling
[Plots: frequency vs. Vdd, with annotated frequency drops of ~8x, ~400x, ~15x, ~103x]
• LP has a much larger frequency drop than HP for the same change in Vdd.
• 16nm has a larger frequency drop than 45nm for the same change in Vdd.
Design space exploration (Area)
45nm, HKMGS, IO cores, 100 W, parallel ratio = 0.99
• Saturating
• Peak is capped by total area
• 2x peak from 200 to 6.4K
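The kind of sweep behind this slide can be sketched as below: for a fixed power budget, try each supply voltage and keep the one with the highest speedup, then repeat for different area budgets. This is only a toy. The frequency and power functions are assumed stand-ins (an alpha-power-law frequency and a leakage term linear in v), not the models fitted to circuit simulation in the slides, and every constant is hypothetical.

```python
def freq(v, vth=0.3, alpha=1.5):
    # Toy alpha-power-law frequency in arbitrary units (an assumption).
    # Only valid for v > vth.
    return (v - vth) ** alpha / v


def core_power(v, c_dyn=1.0, k_leak=0.05):
    # Toy per-core power: dynamic part ~ v^2 * f, plus leakage ~ v.
    return c_dyn * v * v * freq(v) + k_leak * v


def speedup(v, P, A, a, eps, v_max=1.0):
    # Active cores are limited by either the power or the area budget.
    n = int(min(P / core_power(v), A / a))
    if n < 1:
        return 0.0
    # Serial phase runs on one core at v_max; parallel phase on n cores at v.
    return 1.0 / ((1.0 - eps) / freq(v_max) + eps / (n * freq(v)))


def best_vdd(P, A, a, eps, voltages):
    # Supply voltage that maximizes speedup for the given budgets.
    return max(voltages, key=lambda v: speedup(v, P, A, a, eps))


VOLTAGES = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]  # all above the toy vth
small_chip = best_vdd(P=10.0, A=16.0, a=1.0, eps=0.99, voltages=VOLTAGES)
large_chip = best_vdd(P=10.0, A=1000.0, a=1.0, eps=0.99, voltages=VOLTAGES)
```

With these toy numbers, the small chip is area-limited and does best at the highest voltage, while the large chip has room for many more cores and does best at a lower voltage.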
Cross-technology study
[Plots: 500 mm², 80 W; 400 mm², 100 W]
Comparison to Dark Silicon
• NVt. cores alleviate the issue of low utilization.
• NVt. cores have better performance (up to 2x).
[Plot: available cores on-chip; 500 mm², 80 W, HKMGS]
Variation
• NVt. cores are very sensitive to variations
  – Functionality (ratioed circuits)
  – Performance (the focus of this project)
• Monte-Carlo simulation
  – Performed at every VDD setting
  – 100 iterations per VDD
  – Process and mismatch
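The Monte-Carlo procedure above (100 iterations per VDD) can be illustrated with a toy stand-in. The slides run process and mismatch Monte-Carlo in a circuit simulator. The sketch below instead assumes a simple alpha-power-law frequency model and a Gaussian threshold-voltage spread, so every name and constant in it is hypothetical.

```python
import random


def freq(v, vth, alpha=1.5):
    # Toy alpha-power-law frequency in arbitrary units (an assumption).
    overdrive = max(v - vth, 1e-6)  # clamp so the base is never negative
    return overdrive ** alpha / v


def worst_slowdown(v, vth0=0.3, sigma=0.02, iterations=100, seed=0):
    """Worst-case slowdown at supply voltage v across sampled thresholds.

    Draws `iterations` threshold voltages from N(vth0, sigma) and returns
    nominal frequency divided by the slowest sampled frequency.
    """
    rng = random.Random(seed)  # same seed gives the same samples per VDD
    nominal = freq(v, vth0)
    samples = [freq(v, rng.gauss(vth0, sigma)) for _ in range(iterations)]
    return nominal / min(samples)


# One worst-case slowdown per VDD setting (voltages are illustrative).
slowdowns = {v: worst_slowdown(v) for v in (0.5, 0.7, 1.0)}
```

In this toy model the same threshold shift removes a larger share of the gate overdrive at low VDD, so the worst-case slowdown grows as VDD approaches the threshold.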
Voltage-Frequency Scaling Revisited

• HKMGS
  – Up to 5x slowdown
• LP
  – Up to 10x slowdown

• HKMGS
  – Up to 10x slowdown
• LP
  – Up to 100x slowdown
Impact of Variation
400 mm², 100 W, IO cores
• Lower utilization
• Worse performance
• Flatter Vdd
Conclusion
• In terms of performance
  – Simple cores (IO) are better.
  – The HP process (HKMGS) is better.
• Lowering VDD reduces dark silicon and improves throughput.
• NVt. cores are vulnerable to process variation.