© 2002 IBM Corporation
IBM Research
April 2004
Multi-Level Design of Energy-Efficient Adaptive Networks for Large Computing Systems
Juan-Antonio Carballo, IBM Research
Communications Research IBM Research
IBM Research – Communications Technology
IBM Research Worldwide
Research's Strategic Thrusts: Vital to IBM's Future
Servers & Embedded Systems
Exploratory Science
Storage Systems
Services & Software
Technology
Personal Systems
Overview
- Future large computing systems: the PERCS project
  - Vision: adaptability, productivity, performance
  - Scope: application focus, integration
- The importance of networking and communications
  - Power as the new performance
- Energy-efficient, productive interconnect design
  - Multi-level communications design optimization
  - Adaptive communications architectures
PERCS: Design Constraints
- Legacy investments
- Looming technology crisis
- HPC customer diversity
- Business model
  - Must do well on both commercial and scientific workloads
- Cost issues
  - Threat of commoditization
- Productivity as a main theme
IBM's Vision
A dynamic system that adapts to application needs
The strategy
- Aggressive productivity targets
- Commercial viability
- Link into product cycle toward end of phase 2
PERCS: Scope
- Application focus
  - Commercial
  - Security
  - Bioinformatics
  - Data streaming
  - New 2010 apps ??
- Integrated solution
  - HPC
  - Programming & user interface
  - System software
  - Architecture
  - Technology
Importance of Communications Hardware
- Important as a chip market
  - One of the industry's key segments
  - [Figure: market size ($K, 20,000 to 100,000), 2002 to 2006, by segment: data processing, communications, consumer. Source: Gartner]
- Key to servers / large computers
  - High bandwidth: distinguishing feature
  - [Figure: total bandwidth (Pb/s), 2002 to 2010. Source: IBM large computing projects]
A Multi-Level Power Management Problem
[Figure: the power management problem at three design levels. Network design: hosts connected through switches. Link design: link Tx and link Rx. Circuit design: receive circuitry with sampling latches, sample memory, edge detection, phase rotator, phase rotator control (state machine), and output logic producing the recovered data. Design parameters: sample rate, memory size, averaging rate, control rate / flywheel, number of steps. Power scale: µW, mW, W.]
Designing High-Bandwidth Links is Difficult
- Challenges
  - High-speed, low-power, low-BER, many customers
  - Uncertainties in package, channel, chip manufacturing
  - Mixed-signal system sensitive to these uncertainties
Serial ATA, copper trace, <<1 meter
Fibre Channel, optics, 1 km
InfiniBand, copper, 1 meter
Under-80 mW Gbit/s serial link core
Communications in HPC: Key Questions
- Goal: to achieve
  - High productivity (use, design)
  - Commercial viability
  - Competitive performance at acceptable power
Communications in HPC: Key Focus Areas
1. Adaptability to application requirements
   Workloads, protocols, channels, speeds
   → viability (applications), productivity (reuse)
2. Adaptability to environment variations
   Manufacturing quality, post-manufacturing conditions
   → viability (yield), productivity (design)
3. Hardware design productivity
   Single design, single design methodology
   → productivity (design)
Multi-Level Communications Design
Level | Design strategy | Adaptability | Productivity
Circuit design | Custom digital, custom analog | Manufacturing variations (10% better power) | Design convergence
Link design | Self-adaptive links | Application, environment (50% better power) | Single design
Link design | On-core problem determination | Post-manufacturing environment | Debugging
Circuit design | 1/2-custom islands | Supply variations (25% better power) | Design convergence
1. Adaptive Communications Links
- Application requirements, environment impact performance
  - Workload, frequency, quality of channel, package, chip layout
[Figure: margin for required BER (0% to 35%) for applications a1 to a14, with curves for BER = 10^-17, 10^-15, and 10^-12; applications range from "difficult" to "easy".]
1. "Difficult" Versus "Easy" Requirements
Requirement | Difficult | Easy
Workload | Fast-varying frequency pattern | Uniform-frequency pattern
Frequency | High | Low
Channel | Long, across boards | Short, chip-to-chip
Package | Low-cost, low-yield | Ceramic, high quality
Chip layout | Longer wires | Shorter wires
1. Requirement-Based Design of Adaptive Links
[Figure: mapping from the requirements space to the link design space. Requirements space (requirement measurement blocks): high-frequency jitter, frequency offset. Link design space (configurable link CDR blocks): voltage, frequency, complexity. The mapping selects power modes.]
1. Jitter (Eye Closure/Movement) Determines Difficulty
1. Adaptive Link Receiver
[Figure: adaptive link receiver. Receiver blocks: receive circuitry, oversampling, sample memory, edge detection, phase control (state machine), phase generation, CDR loop, data select + output; data in (from channel), data out (in chip). Adaptive blocks: environment quality measurement, loop stats generator, complexity / V / frequency control (outputs: complexity, frequency, Vdd), regulator.]
1. Example: Adaptive Loop Latency
- Low design overhead
  - Simple clock gating + control logic
[Figure: phase control state machine fed by edge detection and producing the phase state; a mode input selects a divided clock (Clk /2, /n).]
1. Adaptability Increases Robustness, Lowers Power
- Difficult connection → full filter clock speed
  - Low jitter margin → Rx requires fast loop latency
- High-quality connection → ½ filter clock speed
  - Higher jitter margin → Rx can tolerate a slower loop (longer loop latency)
[Figure: average margin for required BER (0% to 20%) versus loop latency (1x, 2x), for a high-quality connection and a low-quality connection.]
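The mode choice on this slide can be sketched as a small decision function. This is only an illustration: the function name, the idea of passing the margin as a fraction, and the 10% threshold are assumptions, not values taken from the slides.

```python
def select_filter_clock_divider(jitter_margin: float, threshold: float = 0.10) -> int:
    """Pick the phase-control filter clock divider from a measured jitter margin.

    Returns 1 (full filter clock speed) for a difficult connection and
    2 (half filter clock speed) for a high-quality connection.
    The threshold is a hypothetical value chosen for illustration.
    """
    if not 0.0 <= jitter_margin <= 1.0:
        raise ValueError("jitter_margin must be a fraction between 0 and 1")
    # Enough margin: the slower, lower-power loop is assumed to be safe.
    return 2 if jitter_margin >= threshold else 1


# Usage: a low-margin link keeps the fast loop, a high-margin link halves the clock.
print(select_filter_clock_divider(0.03))
print(select_filter_clock_divider(0.18))
```

In hardware this decision would drive the clock-gating mode input rather than return an integer; the sketch only shows the policy.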
1. Advantages of Adaptive Links
- Adaptability
  - To application requirements → automatically minimize power
  - To environment quality → compensate for variations
- Productivity
  - Single design, many applications
  - Reduced need for worst-case design
- Low overhead
  - Less than 5% area penalty
2. On-Line Problem Determination
- Problem: once an "adaptable" device is in the field
  - Unexpected design or manufacturing issues may come up
  - Must understand the environment to configure the design effectively
- Observation: the approach must be
  - Capable of determining issue origin (channel or link) and cause
  - Easy to use, fast in terms of test and/or correction time
- Solution: dual-input on-line problem determination
  - Combines pattern-based test (1) with analysis of internal signals (2)
  - (1) helps understand system performance; (2) helps understand link behavior
2. On-Line Problem Determination
[Figure: adaptive link receiver extended for problem determination. Receiver blocks: receive circuitry, oversampling, sample memory, edge detection, phase control (state machine), phase generation, CDR loop, data select + output; data in (from channel), data out (in chip). Added blocks: BER checker, environment quality measurement, loop stats generator, test advisor engine (produces the result), regulator, complexity / V / frequency control.]
2. Test Advisor Engine
[Figure: advisor engine. Inputs: PRBS result, PRBS conditions, statistics 1 to n. Internals: rules, state, issue determination table, next step table. Outputs: BER issue, link/channel cause, correction (optional), next test.]
2. Test Advisor Engine (Example)
Column groups: pattern test results (Pattern, Max. BER); internal link stats (HF jitter, Fr. offset); outputs of advisor engine (Issue, Cause(s), Next step).
Pattern | Max. BER | HF jitter (%UI) | Fr. offset (ppm) | Issue | Cause(s) | Next step
PRBS 31bit | 10^-12 | <50 | <200 | None | None | N/A
PRBS 31bit | 10^-10 | >75 | <200 | Channel | H-freq. jitter (H) | None
PRBS 31bit | 10^-10 | <50 | >5000 | Application | Freq. offset (H) | None
PRBS 31bit | 10^-10 | <50 | <200 | Link | Jitter tolerance (L) | JTPAT
JTPAT | 10^-8 | <50 | <200 | Channel | Group delay (H) | None
JTPAT | 10^-12 | <50 | <200 | Link | Jitter tolerance (L) | None
PRBS 7bit | 10^-12 | <50 | <200 | | None | N/A
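One way to read the example above is as a lookup from test conditions to advice. The sketch below encodes the rows whose issue, cause, and next step can be read from the slide; the names `Advice`, `RULES`, `band`, and `advise`, the string labels, and the treatment of values between the listed thresholds (no rule applies) are assumptions made for illustration, not the engine's actual design.

```python
from typing import NamedTuple, Optional


class Advice(NamedTuple):
    issue: Optional[str]      # where the problem originates, None if no issue
    cause: Optional[str]      # suspected cause, None if no issue
    next_test: Optional[str]  # pattern to run next, None if no further test


# Key: (pattern, max BER exponent, HF jitter band, frequency offset band).
RULES = {
    ("PRBS31", -12, "low", "low"): Advice(None, None, None),
    ("PRBS31", -10, "high", "low"): Advice("channel", "high-frequency jitter (high)", None),
    ("PRBS31", -10, "low", "high"): Advice("application", "frequency offset (high)", None),
    ("PRBS31", -10, "low", "low"): Advice("link", "jitter tolerance (low)", "JTPAT"),
    ("JTPAT", -8, "low", "low"): Advice("channel", "group delay (high)", None),
    ("JTPAT", -12, "low", "low"): Advice("link", "jitter tolerance (low)", None),
}


def band(value: float, low_below: float, high_above: float) -> Optional[str]:
    """Classify a statistic as 'low' or 'high'; None if it falls between thresholds."""
    if value < low_below:
        return "low"
    if value > high_above:
        return "high"
    return None


def advise(pattern: str, ber_exponent: int,
           hf_jitter_pct_ui: float, freq_offset_ppm: float) -> Optional[Advice]:
    """Return the matching advice, or None when no rule covers the measurement."""
    key = (pattern, ber_exponent,
           band(hf_jitter_pct_ui, 50, 75),     # <50 or >75 %UI, as in the example
           band(freq_offset_ppm, 200, 5000))   # <200 or >5000 ppm, as in the example
    return RULES.get(key)


# Usage: a PRBS 31-bit run at 1e-10 with clean statistics asks for a JTPAT run next.
print(advise("PRBS31", -10, 30, 100))
```

A dictionary keeps the rules declarative, so new patterns or statistics would extend the table rather than the control flow. The slides also show a "state" input to the engine, which this stateless sketch leaves out.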
2. Advantages of On-Core Problem Determination
- Adaptability
  - To post-manufacturing environment
  - Helps understand what part of the link to re-configure and how
- Productivity
  - Fast link debugging
  - Low debugging infrastructure cost
- Low overhead
  - Less than 2% area penalty
3. Core Design and Integration
- Problem: hundreds of high-performance cores
  - High performance
  - Low power
  - Ease of integration
- Solution = voltage islands + selective custom design
  - Performance: custom techniques + multiple Vth/Vdd
  - Power: low regulated supply
  - Integration: embedded regulation, cell packaging
- Application: realistically complex links (3000+ gates)
  - Performance: no impact
  - Power: 25% savings
  - Integration: ASIC methodology, unmodified interfaces
3. Semi-Custom Voltage-Islands
[Figure: link receiver partitioned into voltage islands. Blocks: receiver/sampling, data and clock extraction, retiming, critical logic, clock generation, clock control, LSSD shifters, parallel interface, voltage regulator. Signals: medium data in, recovered data out. Supplies: Vdde (shareable), Vddi.]
3. Selective Custom Design Increases Flexibility
[Figure: three circuit examples of selective custom design: LSSD level shift (scannable latch on Vddi with clocks CA, CB, CC, inputs in and scanIN, output Q), manual optimization (edge correlation path between latches through an AO22 gate), and multi-Vt merged logic.]
3. Scannable Shifter + Latch
[Figure: transistor-level schematic of the scannable level-shifting latch, with devices n1, n2, p1, p2. Legend: CA = clock A, CB = clock B, CC = clock C, in = input, scanin = scan input, Q = output.]
3. Integration/Customization Saves Power
[Figure: total instantaneous current (0.0m to 1.2m) versus time (22.4n to 23.3n), comparing the conventional approach with the scannable level-shifting latch.]
3. Using Critical Paths to Choose Supply Voltage
[Figure: delay penalty and power savings (0% to 40%) versus supply voltage (0.8 to 1.2), with the "optimal" supply region marked.]
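The trade-off plotted on this slide can be illustrated with a toy model: lower the supply until the critical-path delay penalty uses up the available timing slack. Everything numeric here is an assumption made for illustration: the alpha-power delay model, the threshold voltage of 0.3 V, the exponent 1.3, the 1.2 V nominal supply, and the quadratic dynamic-power estimate. None of it is measured data from the talk.

```python
from typing import Optional, Sequence


def delay_penalty(v: float, v_nom: float = 1.2,
                  v_th: float = 0.3, alpha: float = 1.3) -> float:
    """Relative critical-path slowdown at supply v versus v_nom (alpha-power model)."""
    if v <= v_th or v_nom <= v_th:
        raise ValueError("supply must be above the threshold voltage")

    def gate_delay(x: float) -> float:
        return x / (x - v_th) ** alpha

    return gate_delay(v) / gate_delay(v_nom) - 1.0


def power_savings(v: float, v_nom: float = 1.2) -> float:
    """Relative dynamic-power reduction, assuming power scales with V squared."""
    return 1.0 - (v / v_nom) ** 2


def lowest_supply(candidates: Sequence[float], slack: float) -> Optional[float]:
    """Lowest candidate supply whose delay penalty fits within the timing slack."""
    feasible = [v for v in candidates if delay_penalty(v) <= slack]
    return min(feasible) if feasible else None


# Usage: with 20% slack on the critical path, pick a supply from a 0.1 V grid.
choice = lowest_supply([0.8, 0.9, 1.0, 1.1, 1.2], slack=0.20)
print(choice, round(power_savings(choice), 3))
```

A real flow would take the delay numbers from timing analysis of the extracted critical paths rather than from a closed-form model.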
3. Flexible Design Methodology
[Figure: design flow covering logic design, custom digital, and analog blocks. Steps shown: synthesis, circuit, layout, place+route, extraction, characterization, timing, integration, verification.]
3. Regulation Improves Robustness
- Reduced impact of supply variation
  - Smaller effective corners
  - Selectable voltage
[Figure: effective supply impact (V), 0.90 to 1.00, non-regulated versus regulated.]
3. 3.2 Gbit/s 130nm Chip
[Figure: chip with labeled regions: PLL, high-power logic, Vref & regulator, low-power logic.]
3. Observed Power Savings
[Figure: power P (mW), plotted from 0% to 100%, versus Vddi (V) from 1.2 down to 0.8. Annotations: quasi-linear behavior; pessimistic design.]
3. Advantages of Semi-Custom Voltage Islands
- Adaptability
  - To voltage supply variations, to manufacturing variations
  - Supply is digitally selectable and accurately regulated
- Productivity
  - Selective custom design helps design convergence (25%)
  - Logic can also be selectively shifted to the high-supply island
- Low overhead
  - Custom design may even reduce area!
  - No impact on system supply distribution
4. Productive Design of Adaptive Link Networks
[Figure: hosts connected through switches, link Tx and link Rx, and the link Rx receive circuitry (sampling latches, sample memory, edge detection, phase rotator, phase rotator control state machine, output logic, recovered data; parameters: sample rate, memory size, averaging rate, control rate / flywheel, number of steps).]
- Trade off energy against bandwidth: determine network
- Trade off power against BER: determine architecture modes
- Trade off power against jitter: determine circuits
4. Multi-Level Design
- Goals
  - Adaptability → flexible architecture definition
  - Productivity → fast yet accurate exploration
  - Performance/power → trade-off definition
- Approach
  - Relate BER performance to jitter and then to technology
  - Enable architecture to be parametrically varied
  - Allow explicit power-BER-BW goals and trade-offs
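To make "explicit power-BER-BW goals" concrete, here is a minimal sketch of goal-based selection: among candidate link configurations, keep those that meet the BER and bandwidth goals and take the one with the lowest power. The `LinkConfig` type, the `pick_config` function, and all candidate numbers are hypothetical and only illustrate the idea.

```python
from typing import NamedTuple, Optional, Sequence


class LinkConfig(NamedTuple):
    name: str
    power_mw: float        # estimated power of this configuration
    ber: float             # estimated bit error rate
    bandwidth_gbps: float  # estimated bandwidth


def pick_config(configs: Sequence[LinkConfig], max_ber: float,
                min_bandwidth_gbps: float) -> Optional[LinkConfig]:
    """Lowest-power configuration meeting both goals, or None if none qualifies."""
    feasible = [c for c in configs
                if c.ber <= max_ber and c.bandwidth_gbps >= min_bandwidth_gbps]
    return min(feasible, key=lambda c: c.power_mw, default=None)


# Hypothetical candidates produced by varying architecture parameters.
candidates = [
    LinkConfig("full-rate loop", 80.0, 1e-15, 3.2),
    LinkConfig("half-rate loop", 55.0, 1e-12, 3.2),
    LinkConfig("reduced sampling", 40.0, 1e-9, 3.2),
]

print(pick_config(candidates, max_ber=1e-12, min_bandwidth_gbps=3.2))
```

In the flow described on the slides, the power and BER fields would come from estimation functions rather than fixed numbers.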
4. Jitter (Eye Closure) Determines Performance
4. Goal-Based Multi-Level Design
[Figure: abstraction levels, each with the quantity tracked at that level and the design objects involved.]
- Link system: bit error rate; behavioral blocks, SW, architectural parameters
- Function / logic: jitter; logic/analog functions
- Circuit: mismatch; custom cells
- Physical: dimension change; cells, cores
- Manufacturing: variation
4. Types of Jitter
- Random jitter
  - Noise associated with devices (e.g., thermal transistor noise)
  - Phase-locked loops tend to concentrate most of this jitter
- Deterministic jitter
  - Algorithmic and bandwidth limitations
    - Signal processing algorithm functionality (e.g., CDR filter)
    - Bandwidth limitations of analog circuits
  - Device mismatch and supply voltage variation
    - Related to technology → affected by process variations
4. Technology Versus Jitter
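- BER can be approximated as a function of jitter (J)
  $$\mathrm{BER}(J) \approx \int_{v}^{\infty} \frac{1}{\sqrt{2\pi (KJ)^2}} \, e^{-\frac{x^2}{2 (KJ)^2}} \, dx$$
- Jitter can be approximated as a function of variability (σ)
  $$J \approx DJ + RJ \approx \sigma_V^2 + \sigma_M^2 + \sigma_N^2 + \sigma_B^2$$
  where $\sigma_V$ is supply variation, $\sigma_M$ is device/wire mismatch, $\sigma_N$ is random noise, and $\sigma_B$ is algorithmic/bandwidth limitation.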
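The BER integral is the upper tail of a zero-mean Gaussian with standard deviation KJ, so it has a closed form in terms of the complementary error function. The sketch below evaluates it that way. The slide does not define K or v, so treating K as a scale factor and v as the lower integration limit is an assumption, as is the function name.

```python
import math


def ber_from_jitter(j: float, k: float, v: float) -> float:
    """Gaussian tail probability: integral from v to infinity of N(0, (k*j)^2).

    Uses the identity Q(z) = 0.5 * erfc(z / sqrt(2)) with z = v / (k * j).
    """
    sigma = k * j
    if sigma <= 0.0:
        raise ValueError("k * j must be positive")
    return 0.5 * math.erfc(v / (sigma * math.sqrt(2.0)))


# Usage: a limit seven standard deviations out gives a BER near 1e-12.
print(ber_from_jitter(j=1.0, k=1.0, v=7.0))
```

Using `erfc` directly, rather than `1 - erf`, keeps precision at the very small probabilities that matter for link BER targets.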
4. Prototyping System
[Figure: prototyping system. Blocks shown: system designer; design parameters and optimization criteria; data patterns/type; channel/tech. model; system simulator; receiver link parametric simulation model (parametric chip model with Vdd, input data, receive circuitry, sampling latches, sample memory, edge detection, phase rotator, phase rotator control state machine, output logic, recovered data; parameters: sample rate, averaging rate, function rate, flywheel, number of steps); estimation function (BER performance, power estimator, area estimator); link configuration; semi-custom mixed-signal system implementation with measurement logic and modes/configurable blocks; macromodel refinement.]
Summary: Multi-Level Communications Design
Level | Design strategy | Adaptability | Productivity
Circuit design | Custom digital, custom analog | Manufacturing variations (10% better power) | Design convergence
Link design | Self-adaptive links | Application, environment (50% better power) | Single design
Link design | On-core problem determination | Post-manufacturing environment | Debugging
Circuit design | 1/2-custom islands | Supply variations (25% better power) | Design convergence
Summary
- Future large computing systems require
  - Adaptability, productivity, performance
- Networking / communications are key to these systems
  - And power-efficiency is as important as performance
- Productive, energy-efficient, adaptive networks are possible
  - Adaptive communications architectures
  - Intelligent on-core problem determination
  - Semi-custom multiple-voltage-domain link cores
  - Multi-level design methodology