Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in
28nm CMOS
!A. Shokrollahi1, D. Carnelli1, J. Fox2, K. Hofstra1, B. Holden1, A. Hormati1, P. Hunt2, M. Johnston1, S. Pesenti1, R. Simpson2,
D. Stauffer1, A. Stewart2, G. Surace2, A. Tajalli1, O. Talebi Amiri1, A. Tschank2, R. Ulrich1, Ch. Walter1, F. Licciardello1,
Y. Mogentale1, A. Singh1
1Kandou Bus, Lausanne, Switzerland,
2Kandou Bus, Northampton, United Kingdom,
1 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Outline
• Introduction and motivation • Signaling • Macro architecture
– Common block – Tx – Rx
• System Implementation • Results • Conclusion
2 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Motivation
• 2.5D integration is a means to connect multiple die in a low-cost package – Increase yield, lower manufacturing costs
• Efficient solution requires high-bandwidth, low-power SerDes
Heterogeneous die in package Lego-like SoC design Interface to off-chip SerDes
3 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Constraints of a Versatile 2.5D Solution
• Want very high throughput using few wires – Enable use of low cost of substrates – Use 20+ Gb/s per signal wire – Need very high bandwidth per mm die-edge
• Requires relatively large distance – Need up to 12 mm, scalability to longer distances
desirable • Demands very low power
– Ideally same as or lower than on-chip links
4 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
2.5D Options
• Need SerDes with – Very high throughput per wire – Very low power at desired channel lengths
Method # wires Distance Power Silicon interposers Many Very short Good Wafer level Processing Many Very short Good SerDes Fewer? Longer? ?
5 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
SerDes: Prior Work Reference [1] [2] [3] This work Year 2009 2012 2014 2016 pJ/bit 1.9 0.54 1.4 0.94 BW/pin (Gb/s) 8.9 20 6 20.83 Technology 45nm 28nm 32nm 28nm Signaling D GRSE D CNRZ-5 Channel loss ≤ 20 dB ≤ 1 dB ≤ 3 dB ≤ 3 dB Substrate SiC MCM Meg6 MCM Reach ≤ 40 mm ≤ 4.5 mm ≤ 19 mm ≤ 12 mm
SiCa=Silicon carrier, SE = single ended, D = differential, GRSE=ground referenced single-ended, MCM = MCM organic substrate, Meg 6 = Megtron 6
6 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Outline
• Introduction and motivation • Signaling • Macro architecture
– Common block – Tx – Rx
• System Implementation • Results • Conclusion
7 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Signaling, Implementation
• Choice of signaling scheme is a major ingredient for realizing performance targets, especially power – Signaling needs to have high pin-efficiency
and built-in robustness to noise (common mode, SSO, ISI, EMI, etc.)
8 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
-+
-+
-+
-+
-+
Enco
der
w0
w1
b0
b1
b0
b1
Rx fr
onte
nd
Com
para
tors
Chord Signaling
• Correlated signals on wires, matched comparator network – Signals on wires belong to a codebook
• Chordal code = codebook + comparators
Codeword from codebook
Input bits
Reconstructed bits No digital decoder
9 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Chord Signaling
• Correct design of code – Increases throughput per pin – Reduces ISI – Eliminates SSO noise – Eliminates common mode noise – Reduces EMI noise
10 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
CNRZ-5
• Transmits 5 bits on a collection of 6 wires in every UI.
• Codewords are judiciously chosen permutations of [+1,-1,+1/3,+1/3,-1/3,-1/3].
• Five comparators: – Two compare one wire value against another, – Two compare average of two wire values against a
third, – One compares average of three wire values against
the other three.
11 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
CNRZ-5
av
av
av
av
Enco
der
Driversw0
w1
w2
w3
w4
w5
b0
b1
b2
b3
b4
-+ b0
-+ b1
-+ b2
-+ b3
-+ b4
Rx fr
onte
nd
12 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
CNRZ-5
av
av
av
av
Enco
der
Driversw0
w1
w2
w3
w4
w5
b0
b1
b2
b3
b4
-+ b0
-+ b1
-+ b2
-+ b3
-+ b4
Rx fr
onte
nd
Digital encoder (logic depth 2)
13 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
CNRZ-5
av
av
av
av
Enco
der
Driversw0
w1
w2
w3
w4
w5
b0
b1
b2
b3
b4
-+ b0
-+ b1
-+ b2
-+ b3
-+ b4
Rx fr
onte
nd
Digital encoder (logic depth 2)
Quaternary values on wires
14 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
CNRZ-5
av
av
av
av
Enco
der
Driversw0
w1
w2
w3
w4
w5
b0
b1
b2
b3
b4
-+ b0
-+ b1
-+ b2
-+ b3
-+ b4
Rx fr
onte
nd
Digital encoder (logic depth 2)
Quaternary values on wires
Binary values at input to slicers
15 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
CNRZ-5 Codebook [1/3,!'1/3,!'1,!'1/3,!1/3,!1]![1,!1/3,!'1/3,!'1,!'1/3,!1/3]!['1/3,!'1,!1/3,!'1/3,!1/3,!1]![1/3,!'1/3,!1,!'1,!'1/3,!1/3]!['1/3,!1/3,!'1,!'1/3,!1/3,!1]![1/3,!1,!'1/3,!'1,!'1/3,!1/3]!['1,!'1/3,!1/3,!'1/3,!1/3,!1]!['1/3,!1/3,!1,!'1,!'1/3,!1/3]![1/3,!'1/3,!'1,!1,!'1/3,!1/3]![1,!1/3,!'1/3,!1/3,!'1,!'1/3]!['1/3,!'1,!1/3,!1,!'1/3,!1/3]![1/3,!'1/3,!1,!1/3,!'1,!'1/3]!['1/3,!1/3,!'1,!1,!'1/3,!1/3]![1/3,!1,!'1/3,!1/3,!'1,!'1/3]!['1,!'1/3,!1/3,!1,!'1/3,!1/3]!['1/3,!1/3,!1,!1/3,!'1,!'1/3]!
[1/3,!'1/3,!'1,!'1/3,!1,!1/3]![1,!1/3,!'1/3,!'1,!1/3,!'1/3]!['1/3,!'1,!1/3,!'1/3,!1,!1/3]![1/3,!'1/3,!1,!'1,!1/3,!'1/3]!['1/3,!1/3,!'1,!'1/3,!1,!1/3]![1/3,!1,!'1/3,!'1,!1/3,!'1/3]!['1,!'1/3,!1/3,!'1/3,!1,!1/3]!['1/3,!1/3,!1,!'1,!1/3,!'1/3]![1/3,!'1/3,!'1,!1,!1/3,!'1/3]![1,!1/3,!'1/3,!1/3,!'1/3,!'1]!['1/3,!'1,!1/3,!1,!1/3,!'1/3]![1/3,!'1/3,!1,!1/3,!'1/3,!'1]!['1/3,!1/3,!'1,!1,!1/3,!'1/3]![1/3,!1,!'1/3,!1/3,!'1/3,!'1]!['1,!'1/3,!1/3,!1,!1/3,!'1/3]!['1/3,!1/3,!1,!1/3,!'1/3,!'1]!
16 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
CNRZ-5 Encoder
• Computes two bits Wi[0], Wi[1], i=0..5 from input bits b0,..,b4, which are used by Tx output driver to create codeword values w0,..,w5 on wires
NAND
NAND
NAND
NAND
NAND
NAND
NAND
NAND
NAND
NAND
NAND
NAND
NAND
NAND NAND
NOT
NOT
NOT
NOT
NOT
NOT
XORXOR
NAND
NOT
b4 b3
b0
b2 b1
W5[1] W3[1]
W3[0] W5[0] W4[0]
W4[1]
W0[1] W2[1]
W2[0] W0[0] W1[0]
W1[1]
17 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Binary Values at Input to Slicers • Core concept is that of ISI-ratio [4], [5]. • This property massively reduces ISI noise
[1/3,!'1/3,!'1,!!'1/3,!1/3,!1!!!]![1,!1/3,!'1/3,!!!'1,!'1/3,!1/3!!]!['1/3,!'1,!1/3,!!'1/3,!1/3,!1!!!]![1/3,!'1/3,!1,!!!'1,!'1/3,!1/3!!]!['1/3,!1/3,!'1,!!'1/3,!1/3,!1!!!]![1/3,!1,!'1/3,!!!'1,!'1/3,!1/3!!]!['1,!'1/3,!1/3,!!'1/3,!1/3,!1!!!]!['1/3,!1/3,!1,!!!!'1,!'1/3,!1/3!]![1/3,!'1/3,!'1,!!!!1,!'1/3,!1/3!]![1,!1/3,!'1/3,!!!!!1/3,!'1,!'1/3]!['1/3,!'1,!1/3,!!!!1,!'1/3,!1/3!]![1/3,!'1/3,!1,!!!!!1/3,!'1,!'1/3]!['1/3,!1/3,!'1,!!!!1,!'1/3,!1/3!]![1/3,!1,!'1/3,!!!!!1/3,!'1,!'1/3]!['1,!'1/3,!1/3,!!!!1,!'1/3,!1/3!]!['1/3,!1/3,!1,!!!!!1/3,!'1,!'1/3]!
'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!
!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!
[1/3,!'1/3,!'1,!!!!!'1/3,!1,!1/3]![1,!1/3,!'1/3,!!!!!'1,!1/3,!'1/3]!['1/3,!'1,!1/3,!!!!!'1/3,!1,!1/3]![1/3,!'1/3,!1,!!!!!'1,!1/3,!'1/3]!['1/3,!1/3,!'1,!!!!!'1/3,!1,!1/3]![1/3,!1,!'1/3,!!!!!'1,!1/3,!'1/3]!['1,!'1/3,!1/3,!!!!!'1/3,!1,!1/3]!['1/3,!1/3,!1,!!!!!'1,!1/3,!'1/3]![1/3,!'1/3,!'1,!!!!!1,!1/3,!'1/3]![1,!1/3,!'1/3,!!!!!1/3,!'1/3,!'1]!['1/3,!'1,!1/3,!!!!!1,!1/3,!'1/3]![1/3,!'1/3,!1,!!!!!1/3,!'1/3,!'1]!['1/3,!1/3,!'1,!!!!!1,!1/3,!'1/3]![1/3,!1,!'1/3,!!!!!1/3,!'1/3,!'1]!['1,!'1/3,!1/3,!!!!!1,!1/3,!'1/3]!['1/3,!1/3,!1,!!!!!1/3,!'1/3,!'1]!
!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!
'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!'1/3!!1/3!
average average average average
18 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Signal Integrity through Coding CM Noise EMI Crosstalk
• Comparators designed to tolerate common mode noise
• Balanced values across wires
Codewords designed to minimize EM-field
• Code designed to tolerate some crosstalk
• Complemented by sensible design
SSO Noise ISI Reference-less
Code designed so that all codewords draw same current from source
Values at input to slicers are binary, even though wire values are quaternary
• Comparators are self-referencing and binary
• No need for creating levels
19 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Outline
• Introduction and motivation • Signaling • Macro architecture
– Common block – Tx – Rx
• System Implementation • Results • Conclusion
20 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Macro Architecture
21 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Macro Architecture
Common Block, CmIP
TxIP RxIP
22 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Macro Architecture
Common Block, CmIP
TxIP RxIP
5b data @ 25 Gb/s, encoded over 6 wires, 8UI transmitted diff fwd
clk
32UI core side data & clk (Tx)
23 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Macro Architecture
Common Block, CmIP
TxIP RxIP
5b data @ 25 Gb/s, 8UI received
fwd clk
32UI core side data & clk (Rx)
24 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Common Block
• CmIP consists of: – Main PLL
• Ring Oscillator, produces 2 phase 8UI clock (3.125GHz)
• Programmable to cover half rate to full rate links (25Gb/s to 12Gb/s)
– Bandgap – Temperature sensor
25 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter
26 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
SST driver,
75ohms
4-phase, 6.25GHz clk
6 wires, 25Gb/s
Transmitter
27 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
digital encoder
32UI core side data
Serializer (32-to-1)
SST driver,
75ohms
4-phase, 6.25GHz clk
6 wires, 25Gb/s
Transmitter
28 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter
Series L
ESD diodes
Common mode, vdd/2
digital encoder
32UI core side data
Serializer (32-to-1)
SST driver,
75ohms
4-phase, 6.25GHz clk
6 wires, 25Gb/s
29 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter
Series L
ESD diodes
Common mode, vdd/2
digital encoder
32UI core side data
Serializer (32-to-1)
SST driver,
75ohms
4-phase, 6.25GHz clk
6 wires, 25Gb/s
30 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter
Tx mini-PLL Ref clk
from main PLL,
3.125GHz
test clk
Series L
ESD diodes
Common mode, vdd/2
digital encoder
32UI core side data
Serializer (32-to-1)
SST driver,
75ohms
4-phase, 6.25GHz clk
6 wires, 25Gb/s
31 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter
SST driver, 75ohms
diff fwd clk, 3.125GHz (8UI)
Tx mini-PLL Ref clk
from main PLL,
3.125GHz
test clk
Series L
ESD diodes
Common mode, vdd/2
digital encoder
32UI core side data
Serializer (32-to-1)
SST driver,
75ohms
4-phase, 6.25GHz clk
6 wires, 25Gb/s
32 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter
Matched clk & data path for jitter
tracking
SST driver, 75ohms
diff fwd clk, 3.125GHz (8UI)
Tx mini-PLL Ref clk
from main PLL,
3.125GHz
test clk
Series L
ESD diodes
Common mode, vdd/2
digital encoder
32UI core side data
Serializer (32-to-1)
SST driver,
75ohms
4-phase, 6.25GHz clk
6 wires, 25Gb/s
33 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter Blocks
• SST output driver
34 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter Blocks
• SST output driver – 4 levels, 300mVpp
35 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter Blocks
• SST output driver – 4 levels, 300mVpp – Balanced, SSO free
36 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter Blocks
• SST output driver – 4 levels, 300mVpp – Balanced, SSO free – 75 ohms
• 125Ω || 190Ω
– Vcm=Vdd/2 – ¼ rate arch
375Ω
375Ω
375Ω 190Ω
Vcm
37 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Transmitter Blocks
• SST output driver – 4 levels, 300mVpp – Balanced, SSO free – 75 ohms
• 125Ω || 190Ω
– Vcm=Vdd/2 – ¼ rate arch
• CMOS output in JTAG/test mode
JTAG input
375Ω
375Ω
375Ω 190Ω
Vcm
38 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver
39 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver • Continuous Time Front End:
– DC coupled, with level shifter – T-Coils for passive equalization – Multi-input gain stage (decoder) that performs linear combination of
incoming signal
40 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver • Continuous Time Front End:
– DC coupled, with level shifter – T-Coils for passive equalization – Multi-input gain stage (decoder) that performs linear combination of
incoming signal
• Discrete Time Front End: – 4-ph data sampling system, followed by 4:32 demux
41 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver • Continuous Time Front End:
– DC coupled, with level shifter – T-Coils for passive equalization – Multi-input gain stage (decoder) that performs linear combination of
incoming signal
• Discrete Time Front End: – 4-ph data sampling system, followed by 4:32 demux
• Clock data alignment block (CDA) is used to align sampling clock edge to center of data eye
42 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver • Continuous Time Front End:
– DC coupled, with level shifter – T-Coils for passive equalization – Multi-input gain stage (decoder) that performs linear combination of
incoming signal
• Discrete Time Front End: – 4-ph data sampling system, followed by 4:32 demux
• Clock data alignment block (CDA) is used to align sampling clock edge to center of data eye
• Wide-band PLL to track correlated jitter (data/clk) – Uses 8UI fwd clock as ref clock
43 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver Block Diagrams
44 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver Block Diagrams
45 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
Receiver Block Diagrams
46 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
lvl shifter, ESD,
T-Coils
Receiver Block Diagrams per wire front
end (6) sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
47 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
MIC
binary output (diff)
Receiver Block Diagrams
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
48 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
MIC
Receiver Block Diagrams
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
49 of 76
Mission mode Loopback path
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
sampler
Receiver Block Diagrams
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
50 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
demux
mini-PLL
Receiver Block Diagrams
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
51 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
dig block
fwd clk 3.125GHz
ref clk
Receiver Block Diagrams demux
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
mini-PLL
52 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
4ph, 4UI sampling clk
Receiver Block Diagrams dig
block
fwd clk 3.125GHz
ref clk
demux
mini-PLL
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
53 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
matched front end
Receiver Block Diagrams dig
block
fwd clk 3.125GHz
demux
mini-PLL
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
ref clk
54 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver Block Diagrams dig
block
fwd clk 3.125GHz
demux
mini-PLL
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
ref clk
55 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver Block Diagrams dig
block
fwd clk 3.125GHz
demux
mini-PLL
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
ref clk
CDA (clock data aligner)
56 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver Block Diagrams dig
block
fwd clk 3.125GHz
demux
mini-PLL
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
ref clk
CDA (clock data aligner)
matched front end
matched data path
57 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver Block Diagrams dig
block
fwd clk 3.125GHz
demux
mini-PLL
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
ref clk
CDA sampler CDA votes
CDA sampling clk is shifted by 45°
from data sampling clk
58 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver Block Diagrams dig
block
fwd clk 3.125GHz
demux
mini-PLL
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
ref clk
CDA sampler CDA votes
CDA sampling clk is shifted by 45°
from data sampling clk
control PI to align CDA sampling clk to track fwd clk edge
59 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Receiver Block Diagrams dig
block
fwd clk 3.125GHz
demux
mini-PLL
MIC
lvl shifter, ESD,
T-Coils
per wire front end (6)
sub-channels (5)
Rx input 6 wires, 5b @25Gbaud
160b data 32UI clk
ref clk
CDA sampler CDA votes
CDA sampling clk is shifted by 45°
from data sampling clk
control PI to align CDA sampling clk to track fwd clk edge
sampler clk is aligned to
middle of data eye
60 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Outline
• Introduction and motivation • Signaling • Macro architecture
– Common block – Tx – Rx
• System Implementation • Results • Conclusion
61 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Bump Map and Technology
Process:!• TSMC!28nm!HPM!• Metal!stack!is!9!layer:!6X2R!+!
1!UTALRDL!• DGO!process!(dual!gate!
oxide,!1.0V!and!1.8V!devices).!
• Devices!used:!HVT,!LVT!and!nominal!VT!(3!types)!
Tx!signal!bumps Rx!signal!bumps
62 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Bump Map and Technology
Process:!• TSMC!28nm!HPM!• Metal!stack!is!9!layer:!6X2R!+!
1!UTALRDL!• DGO!process!(dual!gate!
oxide,!1.0V!and!1.8V!devices).!
• Devices!used:!HVT,!LVT!and!nominal!VT!(3!types)!
vss!–!ground!
vdda!'!Rx/PLL!analog!!
!!!!power!(1.0!V)!
vddh!–!PLL!power!supply!!
!!!!!(1.5V'1.8V)!
vddd!–!macro!digital!!
!!!power!supply!!(1.0V)!
Anabus!signal!
Tx!FCLK!
Tx!signals!
Rx!FCLK!
Rx!signals!
Unused
63 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Testchip
• Architecture: – One instance of CNRZ-5 IP (Tx + Rx) – One common block – 62.5 Gb/s and 125 Gb/s over six wires
• Die size (actual without scribe) 2138.4µm x 1386.9µm (2.96sqmm)
VDD!
VDD!
VSS!
CLKREFP!
CLKREFN!
VDD!
VSS!
VSS!
VDD!
VDD!
VDDS!
TEMPA!
TEMPB!
VSS!
VSS!
CLKOB!
NOISE
CLK!
MOSI!
SSB!
SCLK!
MISO
!Pat
Pat
Pat
Pat
Pat
FIFO
Noise,Gen
Pat
Pat
Pat
Pat
Pat
Reset SPI Ctrl
GW2821252USR,Macro0
Ref,C
lk
TempSense
XBAR
Tx,Clkn/p Tx,<5:0> Rx,<5:0> Rx,Clkn/p
64 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Chip Micrograph
65 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Features
Technology 28nm CMOS HPM, VDD=1.0V, 9M, DGO MCM Channels Losses (s21) of ~0.6dB, ~1.25dB and ~2.5dB Data Rate 10.44-20.83 Gb/s/wire (12.5-25 Gbaud) Power and Energy Efficiency 117.5 mW at 125 Gbps BER < 1e-15 at 25 Gbaud
Testability
IP: Internal loopback; Rx Eyescope; PRBS31 Pattgen and Verification; Analog test bus. Testchip: Pattern generators; Noise Generators.
66 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Demo MCM • Organic GX13 substrate, 222 stack up. • 19mm square, 18 x 18 BGA, 1mm pitch. • Populate up to 4 GW die interconnected
in pairs. • Provides up to 3 channels of 5mm,
12mm and 24mm. • Channel trace length matched to ~1µm. • Channel losses (s21) of ~0.6dB,
~1.25dB and ~2.5dB. • CNRZ-5 channels, 50Ohms:
• Trace width = 19.5µm • Space = 55.5µm
• Two die provide break out to Rx and Tx 6-wire interfaces plus FCLK respectively.
TC0
TC1
TC3
TC2
12 m
m
5 m
m
24 m
m
67 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Evaluation Board
• MCM body size 19mm, 18 x 18 BGA
• 25 Gb/s HS signals off-MCM to PCB and H+S connectors from TC2 and TC3
• Single PCB design capable of accepting a socket and soldered MCMs
MCM is solder ball assembled onto a host PCB. Solder ball array 1mm pitch fully populated. Organic substrate.
H+S Connectors. Type H&S MXP
68 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Outline
• Introduction and motivation • Signaling • Macro architecture
– Common block – Tx – Rx
• System Implementation • Results • Conclusion
69 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Results
70 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Measured Power at 125 Gb/s V mA mW
CMIP VDDA 0.925 2.253 2.084 1.77%
8.48% VDDH 1.400 5.616 7.862 6.69%
VDDD 0.800 0.032 0.026 0.02%
TXIP VDDA 0.925 54.927 50.807 43.20%
54.17% VDDH 1.400 3.996 5.594 4.76%
VDDD 0.800 9.125 7.300 6.21%
RXIP VDDA 0.925 31.152 28.816 24.50%
37.35% VDDH 1.400 8.177 11.448 9.73%
VDDD 0.800 4.584 3.667 3.12%
Ptotal [mW] 117.605 Rate [Gb/s] 125
Ebit [pJ/b] 0.941
71 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Results
Vdda = 0.925V Vddd = 0.8V Vddh = 1.4V
Vdda = 1V Vddd = 1V Vddh = 1.5V
0.94 pJ/b 1.04 pJ/b
72 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Results (Continued) • Target BER of 1e-15 achieved at 25 Gbaud
– Specified 12mm MCM channel across power supply corners and temperature stress
• Target BER of 1e-15 achieved at half data rate 12.5 Gbaud on 24 mm channel
• Target BER achieved under stress test (±5% supply tolerance, temp range from 0 to 100 C)
• Target BER achieved at 30 Gbaud on 12mm channel under nominal power supplies and temperature
73 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Conclusion • Demonstrated an implementation of a SerDes
based on CNRZ-5 coding, transmitting 5 bits on 6 correlated wires in every UI
• Coding has built-in resilience to various types of noise which helps lower the power consumption at high speeds
• Implementation uses a forwarded clock architecture and transmits up to 125 Gb/s on 6 wires at 0.94 pJ/bit
74 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
References [1] Kim et al., “A 10 Gb/s compact low-power serial I/O with DFE-IIR equalization
in 65 nm CMOS”, IEEE Journal of Solid State Circuits, vol. 44, pp. 3526-3538, 2009.
[2] Poulton et al., “A 0.54pJ/b 20 Gb/s Ground-Referenced Single-Ended Short-Haul Serial Link in 28nm CMOS for Advanced Packaging Applications,” IEEE Journal of Solid State Circuits, vol. 48, pp. 3207-3218, 2013.
[3] Dickson et al., “A 1.4 pJ/bit, power scalable 16x12 Gb/s source-synchronous I/O with DFE receiver in 32 nm SOI CMOS technology,” in Proc. IEEE Custom Integrated Circuits Conf., 2014, pp. 10-5.
[4] Hormati et al., “Method and Apparatus for Low-Power Chip-to-Chip Communications with Constrained ISI-Ratio”, U.S. Patent 9,100,232.
[5] A. Hormati and A. Shokrollahi, “ISI tolerant signaling: a comparative study of PAM4 and ENRZ,” DesignCon 2016.
75 of 76
Paper 10.1: A Pin- and Power-Efficient 20.83Gb/s/wire 0.94pJ/bit Forwarded Clock CNRZ-5 Coded SerDes up to 12mm for MCM Packages in 28nm CMOS
© 2016 IEEE International Solid-State Circuits Conference
Thank you
76 of 76