Design of Serial Link Transceivers
Guanghua Shu
Advisor: Pavan Kumar Hanumolu
University of Illinois at Urbana-Champaign
2014-04-14
Outline
• Serial link transceiver overview
• Transmitter (brief)
– Output driver
• Receiver
– Clock and data recovery (focus)
– Equalizer (brief)
2
Overview of serial link systems
Serial Link Systems
Optical links (OC-1, OC-192, ...)
[Images: Broadcom, Intel, Finisar]
3
Serial link transceiver (SerDes)
• Transmitter (Tx)
– Serialize parallel input to 1-bit data stream and transmit
– TxPLL provides clocks for serializer
• Receiver (Rx)
– CDR recovers clock and data
– Deserialize 1-bit data stream to parallel data
[Figure: transceiver block diagram. Transmitter: serializer, TxPLL (CKTX), Tx equalizer, driver, data DTX (100101). Channel: response |Hch(ω)|. Receiver: Rx equalizer with response |H(ω)|, DFE, clock recovery (CKRX), D flip-flop, data DRX (100101).]
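The serialize and deserialize steps listed for the Tx and Rx can be sketched behaviorally. This is a minimal illustration, not the circuit from the slide: the function names, the 6-bit word width, and the MSB-first bit order are all assumptions chosen to match the 100101 pattern in the figure.

```python
def serialize(words, width):
    """Flatten parallel words into a 1-bit stream (MSB first, an assumed order)."""
    bits = []
    for word in words:
        for i in reversed(range(width)):
            bits.append((word >> i) & 1)
    return bits


def deserialize(bits, width):
    """Regroup a 1-bit stream into parallel words of `width` bits (MSB first)."""
    words = []
    for start in range(0, len(bits) - width + 1, width):
        word = 0
        for bit in bits[start:start + width]:
            word = (word << 1) | bit
        words.append(word)
    return words


# Hypothetical 6-bit parallel words; the first one is the 100101 pattern.
tx_words = [0b100101, 0b011010]
stream = serialize(tx_words, width=6)
rx_words = deserialize(stream, width=6)
```

In a real link the deserializer also needs word alignment, which this sketch sidesteps by starting at bit 0.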
4
Serial links in SoC
• About 50 SerDes IPs
– Single IP power & area
– Integrated with SoC
– Portable with SoC
[J. Hart, ISSCC13, Oracle]
[Figure: SoC floorplan with the SerDes blocks labeled: SerDes, MI/O SerDes, PCIe Gen3 SerDes]
5
Electrical link standards
• SoCs cover multiple standards
– Wide-range serial link transceivers
– Shorten design cycle to reduce time to market
[Figure: timeline of electrical link standards: USB 1.0, USB 2.0, IEEE 1394, SATA 1.0, SATA 2.0, PCIe 1.0, PCIe 2.0, GDDR4, USB 3.0, SATA 3.0, PCIe 3.0, GDDR5, 10G Ethernet, PCIe 4.0 (expected 2015)]
6
Link design tradeoff & wish list
• Low power
• Small area
• Low jitter (good BER)
• Fully integrated
• Easily portable
• Wide range
[Figure: trade-off diagram relating power, technology, jitter, bit error rate, supply voltage, area, and operation range]
7
Outline
• Serial link transceiver overview
• Transmitter
– Output driver
• Receiver
– Clock and data recovery
8
Transmitter signaling: NRZ or RZ
[Figure: NRZ and RZ waveforms for the bit sequence 1 0 0 1, with bit periods labeled Tb1 to Tb3, and their power spectra SNRZ(f) and SRZ(f) versus f]
1 unit interval (UI) = Tb seconds
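The difference between the two signaling formats can be sketched in sampled form. This is an illustration under assumptions not stated on the slide: unipolar 0/1 levels, four samples per UI, and a 50% duty cycle for RZ.

```python
def nrz(bits, samples_per_ui=4):
    """NRZ: hold the bit value for the whole unit interval."""
    wave = []
    for b in bits:
        wave.extend([b] * samples_per_ui)
    return wave


def rz(bits, samples_per_ui=4):
    """RZ: a 1 is high for the first half of the UI, then returns to zero."""
    half = samples_per_ui // 2
    wave = []
    for b in bits:
        wave.extend([b] * half + [0] * (samples_per_ui - half))
    return wave


bits = [1, 0, 0, 1]          # the pattern shown on the slide
nrz_wave = nrz(bits)
rz_wave = rz(bits)
```

Because an RZ pulse is half as wide as an NRZ pulse at the same bit rate, its spectrum extends to roughly twice the frequency, which is the contrast the SNRZ(f) and SRZ(f) plots illustrate.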
9
Speed: FO4
• Speed increases as technology scales and CMOS circuit delays scale proportionally
• FO4 delay = inverter delay with fan out of 4
• tFO4 is an empirical technology limit for high-speed serial links (HSSL)
tFO4≈30ps in 90nm CMOS
10
Transmitter (TX driver)
• Match TX output impedance to 50Ω to minimize reflections.
• Two typical types of Tx driver (details later)
– Current mode (CM) drivers
– Voltage mode (VM) drivers
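Why matching to 50Ω minimizes reflections follows from the standard reflection coefficient, Γ = (Z − Z0)/(Z + Z0), which is zero when the termination Z equals the line impedance Z0. A small sketch; the function name and the 75Ω mismatch value are illustrative assumptions.

```python
def reflection_coefficient(z_term, z0=50.0):
    """Fraction of an incident wave reflected at a termination z_term on a z0 line."""
    return (z_term - z0) / (z_term + z0)


matched = reflection_coefficient(50.0)      # driver impedance equals the line impedance
mismatched = reflection_coefficient(75.0)   # hypothetical 75-ohm driver on a 50-ohm line
```

With the hypothetical 75Ω termination, 20% of the incident amplitude is reflected; with a matched 50Ω termination, none is.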
[S. Palermo, ECEN720, TAMU]
11
Current mode driver
• Widely used in high performance serial links.
• Differential peak-peak voltage swing Vswpp = IR
[S. Palermo, ECEN720, TAMU]
12
Voltage mode driver
• CMOS structure used for high swing applications
• NMOS structure used for low swing applications
[S. Palermo, ECEN720, TAMU]
13
Voltage mode vs. current mode
• CM uses Norton-equivalent parallel termination, so the output impedance is easier to control.
• VM uses Thevenin-equivalent series termination, so it spends only ½ to ¼ of the current to achieve the same output swing as a CM driver.
[S. Palermo, ECEN720, TAMU]
14
Outline
• Serial link transceiver overview
• Transmitter
– Output driver
• Receiver
– Clock and data recovery
– Equalizer (brief)
15
Link classification (1)
• Based on how the receiver clock is generated
16
Link classification (2)
• Number of data bits per clock cycle
• Need multiple phases for half rate and quarter rate
[Figure: DIN and CK waveforms for full-rate, half-rate, and quarter-rate clocking]
17
Timing recovery
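[Razavi, 2002]
[Figure: (a) simple CDR circuit, where a D flip-flop samples DIN with CK at the optimum point; (b) forwarded clock, with a de-skew block on CKIN ahead of the sampling flip-flop; (c) embedded clock, with a timing recovery block that derives CK from DIN. Signals: DIN, CK, CKIN, DRX.]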
18
Classical analog CDR
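$$\frac{\Phi_{CKRX}(s)}{\Phi_{DIN}(s)} = \frac{LG(s)}{1 + LG(s)} = \frac{K_{PD} K_{VCO} (RCs + 1)}{s^2 C + s K_{PD} K_{VCO} RC + K_{PD} K_{VCO}}$$
$$\omega_z = -\frac{1}{RC}, \quad \omega_{pL} \approx -\frac{1}{RC}, \quad \omega_{pH} \approx -K_{PD} K_{VCO} R$$
[Figure: analog CDR block diagram (PD with UP/DN outputs, RC loop filter LF(s), VCO; signals DIN, DRX, CKRX, ΦE, IE) and its phase-domain model (KPD, R + 1/(sC), KVCO/s), with |ΦCKRX/ΦDIN| plotted versus ω and ωz, ωpL, ωpH marked]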
• 3dB bandwidth close to ωpH
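The pole approximations for the classical analog CDR can be checked numerically against the exact roots of the closed-loop denominator s²C + s·KPD·KVCO·RC + KPD·KVCO. The component values below are hypothetical, picked only so that the two poles are well separated; they are not taken from the slides.

```python
import math


def closed_loop_poles(k, r, c):
    """Exact roots of s^2*C + s*K*R*C + K = 0, where K = K_PD * K_VCO."""
    a, b, c0 = c, k * r * c, k
    disc = b * b - 4.0 * a * c0
    if disc < 0:
        raise ValueError("complex poles: the two-real-pole approximation does not apply")
    root = math.sqrt(disc)
    return (-b + root) / (2.0 * a), (-b - root) / (2.0 * a)


# Hypothetical loop parameters (assumptions for illustration only).
K_PD = 10e-6                    # phase detector gain, A/rad
K_VCO = 2 * math.pi * 1e9       # VCO gain, rad/s/V
R = 1e3                         # loop filter resistor, ohms
C = 1e-9                        # loop filter capacitor, farads

K = K_PD * K_VCO
p_low, p_high = closed_loop_poles(K, R, C)
w_pl_approx = -1.0 / (R * C)    # approximation for the low-frequency pole
w_ph_approx = -K * R            # approximation for the high-frequency pole
```

For these values the exact poles land within a few percent of −1/(RC) and −KPD·KVCO·R, and the high-frequency pole (about 10 MHz here) sets the 3dB bandwidth.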
19
Performance metrics for CDR
• Jitter transfer (JTRAN)
• Jitter tolerance (JTOL)
• Jitter generation (JGEN)
[Figure: JTRAN and JTOL curves, with jitter peaking marked]
[K. Kundert, 2010]
20
Hogge phase detector
• Phase difference detection
• Data transition detection
[Figure: Hogge phase detector, panel (b), showing CK and nodes A and X]
[Razavi, 2002]
21
Sense amplifier
[J. Kim 2009]
• 4 operating phases: reset, sampling, regeneration, and decision
22
Flip-flop (SA+RS latch)
• RS latch holds data during SA reset
• See the reference for an improved version
[Nickolic 2000]
Truth table for RS latch
Sb  Rb  Q  Qb
0   0   1  1
0   1   1  0
1   0   0  1
1   1   Hold
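The truth table can be written as a small behavioral function. This is only a sketch of the table's logic with active-low inputs Sb and Rb; the function name and the state-passing style are assumptions, and it says nothing about the transistor-level latch.

```python
def rs_latch(sb, rb, q, qb):
    """Return the next (Q, Qb) of an active-low RS latch given the current state."""
    if sb == 0 and rb == 0:
        return 1, 1          # both inputs asserted: both outputs forced high
    if sb == 0:
        return 1, 0          # set
    if rb == 0:
        return 0, 1          # reset
    return q, qb             # Sb = Rb = 1: hold the previous state


state = rs_latch(0, 1, 0, 1)        # set the latch
held = rs_latch(1, 1, *state)       # inputs released: the data is held
```

The hold row is what lets the flip-flop keep its output while the sense amplifier resets.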
23
Charge pump & loop filter
[Arakali 2009]
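• Choose R and IUP (IDN) based on noise and bandwidth (typical values: R of several kΩ, IUP (IDN) of tens of µA)
• Higher-order filtering can be added to the loop filter
[Figure: charge pump with bias IB and currents IUP and IDN switched by UP and DN, driving a loop filter (R, C1) that sets the control voltage VC]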
24
Area issue in analog CDR
Small peaking → large C (nF) → large area
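$$\frac{\Phi_{CKRX}(s)}{\Phi_{DIN}(s)} = \frac{LG(s)}{1 + LG(s)} = \frac{K_{PD} K_{VCO} (RCs + 1)}{s^2 C + s K_{PD} K_{VCO} RC + K_{PD} K_{VCO}}$$
$$\omega_z = -\frac{1}{RC}, \quad \omega_{pL} \approx -\frac{1}{RC}, \quad \omega_{pH} \approx -K_{PD} K_{VCO} R$$
[Figure: |ΦCKRX/ΦDIN| versus ω with the peaking region and ωz, ωpL, ωpH marked, next to the analog CDR block diagram (PD, RC loop filter LF(s), VCO) and its phase-domain model (KPD, R + 1/(sC), KVCO/s)]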
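The link between peaking and capacitor size can be illustrated by sweeping the closed-loop transfer function of the analog CDR, KPD·KVCO·(RCs + 1) / (s²C + s·KPD·KVCO·RC + KPD·KVCO), for several values of C. The loop gain, resistor, capacitor values, and sweep range below are hypothetical assumptions, not numbers from the slides.

```python
import math


def jitter_transfer_mag(w, k, r, c):
    """|Phi_CKRX / Phi_DIN| at angular frequency w, with K = K_PD * K_VCO."""
    s = 1j * w
    return abs(k * (r * c * s + 1) / (s * s * c + s * k * r * c + k))


def peaking_db(k, r, c, points=4000):
    """Largest magnitude (in dB) over a log-spaced sweep from 1e3 to 1e10 rad/s."""
    w_lo, w_hi = 1e3, 1e10
    peak = 0.0
    for i in range(points):
        w = w_lo * (w_hi / w_lo) ** (i / (points - 1))
        peak = max(peak, jitter_transfer_mag(w, k, r, c))
    return 20.0 * math.log10(peak)


K = 10e-6 * 2 * math.pi * 1e9        # hypothetical K_PD * K_VCO
R = 1e3                              # hypothetical loop filter resistor, ohms
caps = [10e-12, 100e-12, 1e-9]       # hypothetical capacitor values, farads
peaks = [peaking_db(K, R, c) for c in caps]
```

In this sketch the peaking shrinks as C grows from 10 pF to 1 nF, which is why a low-peaking analog loop ends up with a nanofarad-range capacitor and a large area.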
25
Analog CDR to digital CDR
• Large capacitor is substituted by accumulator
• PVT insensitive, easy to configure loop dynamics
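[Figure: analog CDR (analog PD with UP/DN outputs, RC loop filter LF(s), VCO) next to the digital CDR (digital PD with output DE ∈ {+1, 0, −1}, digital loop filter LF(z) with proportional control KP and integral control KI through an accumulator ACC with z⁻¹, DCO). Signals: DIN, DRX, CKRX, ΦE, IE.]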
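A phase-domain behavioral sketch shows how the accumulator takes over the role of the large capacitor. It assumes a bang-bang phase detector output DE ∈ {+1, 0, −1}, a proportional path KP, an integral path KI feeding an accumulator, and a DCO modeled as an ideal phase integrator. The gains, the frequency offset, and the function name are hypothetical; this is not the implementation from [Yin, 2011].

```python
def simulate_bb_cdr(n_steps, freq_offset, kp, ki):
    """Phase-domain model of a digital CDR; phases are in UI, one step per bit."""
    phi_in = 0.0       # input data phase
    phi_out = 0.0      # recovered clock phase
    acc = 0.0          # integral-path accumulator (tracks the frequency offset)
    errors = []
    for _ in range(n_steps):
        phi_in += freq_offset
        err = phi_in - phi_out
        de = (err > 0) - (err < 0)          # bang-bang decision: +1, 0, or -1
        acc += ki * de                      # integral control
        phi_out += kp * de + acc            # DCO integrates the loop filter output
        errors.append(err)
    return errors, acc


# Hypothetical settings: 0.001 UI/bit frequency offset, KP = 0.01, KI = 1e-4.
errors, acc = simulate_bb_cdr(5000, freq_offset=0.001, kp=0.01, ki=1e-4)
```

After the transient, the accumulator settles near the frequency offset and the phase error dithers by roughly ±KP, with loop dynamics set by two digital coefficients rather than an RC product.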
26
Early/Late definition
[Razavi, 2002]
• If S1 ⊕ S2 is high and S2 ⊕ S3 is low, the clock is Early
• If S1 ⊕ S2 is low and S2 ⊕ S3 is high, the clock is Late
• If S1 ⊕ S2 = S2 ⊕ S3, no data transition is present
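The three rules above translate directly into a small decision function on the consecutive samples S1, S2, S3. The function name and the string return values are illustrative assumptions.

```python
def early_late(s1, s2, s3):
    """Classify the clock phase from three consecutive samples (each 0 or 1)."""
    x = s1 ^ s2
    y = s2 ^ s3
    if x and not y:
        return "early"
    if y and not x:
        return "late"
    # x == y: per the rule above, no data transition is reported. The case
    # x = y = 1 is listed as undefined in the Alexander phase detector table.
    return "none"


decision_early = early_late(0, 1, 1)   # S1 xor S2 high, S2 xor S3 low
decision_late = early_late(1, 1, 0)    # S1 xor S2 low, S2 xor S3 high
decision_none = early_late(1, 1, 1)    # no transition between the samples
```

Mapping "early", "late", and "none" to +1, −1, and 0 (the sign convention is a design choice) gives the three-level output used by a bang-bang loop.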
27
Alexander phase detector
• X = S1 ⊕ S2
• Y = S2 ⊕ S3
• Also called bang-bang phase detector
[Razavi, 2002]
Truth table (E = Early, L = Late)
X  Y  E  L
0  0  0  0
0  1  0  1
1  0  1  0
1  1  Undefined
28
Digitally controlled oscillator
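[Figure: DCO built from the digital loop filter (input DE ∈ {+1, 0, −1}, gains KP and KI, accumulator ACC with z⁻¹), a DAC (input DIN, bus width labeled 20) that produces VDCO, and the oscillator that outputs CKRX]
• An ideal digital-to-analog converter model is available in the Cadence library (ahdlLib/dac_8bit_ideal)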
29
Mitigation of quantization noise
• Power
• Area
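[Figure: the digital loop filter output (DE ∈ {+1, 0, −1}, gains KP and KI, accumulator ACC with z⁻¹) passes through a 2nd-order ΔΣ modulator (DIN → DOUT, bus widths labeled 20 and 4) before the DAC that generates VDCO for the DCO; CKRX is the recovered clock. Noise transfer function: NTF = (1 − z⁻¹)²]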
[Yin, 2011]
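The idea behind using a ΔΣ modulator against quantization noise can be sketched with a 2nd-order error-feedback structure, which shapes the quantization error by (1 − z⁻¹)². This is an assumed textbook structure that rounds a fractional input to integer levels; it is not claimed to be the modulator used in [Yin, 2011], and the input value is hypothetical.

```python
import math


def delta_sigma_2nd_order(samples):
    """Error-feedback modulator: y = u + e[n] - 2*e[n-1] + e[n-2]."""
    e1 = 0.0    # quantization error one sample ago
    e2 = 0.0    # quantization error two samples ago
    out = []
    for u in samples:
        v = u - 2.0 * e1 + e2          # input plus shaped past errors
        y = math.floor(v + 0.5)        # coarse quantizer: round to nearest integer
        e2, e1 = e1, y - v             # shift the error history
        out.append(y)
    return out


u = [0.3] * 1000                       # hypothetical constant fine-resolution input
y = delta_sigma_2nd_order(u)
```

The coarse output only takes a few integer levels, yet its running sum stays within one step of the input's running sum, so the low-frequency content is preserved while the quantization error is pushed to high frequencies.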
30
Other CDR architectures
• VCO-based single loop
• PI-based dual loop
• Open loop
• …
• See reference [7] for more
31
Dual loop CDR
[Figure: dual-loop CDR. PLL: PFD, LF, VCO, ÷N. CDR: BBPD, DLF, phase interpolator ΦINT. Signals: DIN, DP, CK.]
[Hanumolu, 2008]
• Cascade of PLL and CDR loop
• PLL provides frequency
• CDR for phase lock
• Phase interpolator (ΦINT) interpolates between input phases
32
Outline
• Serial link transceiver overview
• Transmitter (brief)
– Output driver
• Receiver
– Clock and data recovery
– Equalizer (brief)
33
Why equalizer?
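[Figure: a channel with response |Hch(ω)| and an equalizer with response |Heq(ω)|, both referenced to 0dB; signals Vin, Vout, Veq, with pulse responses shown on a 200ps scale]
• Equalization can be done on both the Tx side and the Rx side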
34
TX equalizers
[Stauffer, 2008]
• Typical channel response with TX equalization in form of pre-emphasis FFE (Feed-Forward Equalizer)
• 3-tap FIR FFE architecture:
a. Taps selected to generate a transfer function characteristic that is the inverse of the channel transfer function
b. Peak power constraint
35
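The 3-tap FIR FFE from the TX equalizers slide can be sketched as a short convolution over ±1 symbols. The tap values are hypothetical, and expressing the peak power constraint as the absolute tap weights summing to 1 is one common convention, assumed here rather than taken from the slide.

```python
def ffe(symbols, taps):
    """Causal FIR filter: out[n] = sum over k of taps[k] * symbols[n - k]."""
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, c in enumerate(taps):
            if n - k >= 0:
                acc += c * symbols[n - k]
        out.append(acc)
    return out


# Hypothetical pre-cursor, main, and post-cursor taps with sum(|c|) = 1.
taps = [-0.1, 0.7, -0.2]
symbols = [1, 1, 1, -1]
tx_out = ffe(symbols, taps)
```

During a run of identical symbols the output settles to a reduced level (0.4 here), while the symbol next to a transition is driven harder (0.6), which is the high-frequency emphasis that offsets the channel's low-pass loss without exceeding the full swing.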
Rx equalizers
• CT (continuous time): difficult to get enough peaking at high frequency
• DT (discrete time): timing is stringent at high data rates (loop unrolling helps)
• Configurable across standards
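(a) Continuous time linear equalizer (CTLE)
[Figure: differential pair with inputs VIN and VIP, degeneration network 2RZ and CZ/2, load RL and CL, output VO. |H(ω)| has DC gain ADC, a zero at 1/(RZCZ), and poles at (1 + gmRZ)/(RZCZ) and 1/(RLCL).]
(b) Decision feedback equalizer (DFE)
[Figure: summer Σ at VIN, D flip-flop clocked by CK with output DRX, and feedback tap α. Timing constraint: Tsetup + TCKQ + TFB < 1 UI.]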
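A one-tap DFE can be sketched behaviorally: the previous decision, scaled by α, is subtracted from the incoming sample before slicing. The channel model (a single post-cursor of 1.2 times the main cursor) is a deliberately exaggerated, hypothetical value chosen so that a plain slicer fails; the function names are assumptions as well.

```python
def slicer(v):
    """Decide +1 or -1 from the sign of the input."""
    return 1 if v >= 0 else -1


def channel(symbols, post_cursor):
    """Hypothetical channel: main cursor of 1 plus one post-cursor ISI term."""
    rx, prev = [], 0
    for x in symbols:
        rx.append(x + post_cursor * prev)
        prev = x
    return rx


def dfe(rx, alpha):
    """One-tap DFE: cancel the ISI of the previous decision, then slice."""
    decisions, prev = [], 0
    for v in rx:
        d = slicer(v - alpha * prev)
        decisions.append(d)
        prev = d
    return decisions


tx = [1, -1, -1, 1, 1, -1, 1, -1]
rx = channel(tx, post_cursor=1.2)
no_eq = [slicer(v) for v in rx]        # slicing without equalization
with_dfe = dfe(rx, alpha=1.2)          # tap weight matched to the post-cursor
```

The feedback `prev` must be available before the next sample is sliced, which is the Tsetup + TCKQ + TFB < 1 UI constraint; loop unrolling relaxes it by precomputing both possible outcomes.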
36
Multilevel signaling
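[Figure: PAM2 (NRZ) eye with levels 0 and 1 and PAM4 eye with levels 00, 01, 11, 10; channel response |H(ω)| [dB] (0 to −60) versus Nyquist frequency [GHz] (0.0 to 4.0) for Ch. 1 and Ch. 2, with the 6.4Gb/s PAM2 and PAM4 operating points marked]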
• PAM4 halves the Nyquist frequency but also reduces the signal amplitude to 1/3, a penalty of 20log10(3) ≈ 9.54dB
• If the channel loss differs by more than 9.54dB between the two Nyquist frequencies, PAM4 can be beneficial
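The PAM4 rule of thumb can be checked with a few lines of arithmetic. The function names and the loss values in the usage lines are hypothetical; only the 6.4Gb/s rate and the 20log10(3) penalty come from the slide.

```python
import math

PAM4_PENALTY_DB = 20.0 * math.log10(3)   # amplitude reduced to 1/3


def nyquist_ghz(bit_rate_gbps, bits_per_symbol):
    """Nyquist frequency is half the symbol rate."""
    return bit_rate_gbps / bits_per_symbol / 2.0


def pam4_beneficial(loss_at_pam2_nyquist_db, loss_at_pam4_nyquist_db):
    """True if the loss saved at the lower Nyquist frequency exceeds the penalty."""
    saved = loss_at_pam2_nyquist_db - loss_at_pam4_nyquist_db
    return saved > PAM4_PENALTY_DB


f_pam2 = nyquist_ghz(6.4, 1)              # NRZ carries 1 bit per symbol
f_pam4 = nyquist_ghz(6.4, 2)              # PAM4 carries 2 bits per symbol
steep_channel = pam4_beneficial(30.0, 15.0)   # hypothetical losses in dB
flat_channel = pam4_beneficial(20.0, 14.0)    # hypothetical losses in dB
```

At 6.4Gb/s the two Nyquist frequencies are 3.2GHz and 1.6GHz, so the comparison depends on how much more the channel attenuates at 3.2GHz than at 1.6GHz.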
37
Summary
• Serial link transceiver overview
• Transmitter (brief)
– Output driver
• Receiver
– Clock and data recovery (focus)
– Equalizer (brief)
38
References
[1] J. Hart, S. Butler, et al., "3.6GHz 16-core SPARC SoC processor in 28nm," in IEEE ISSCC Dig. Tech. Papers, Feb. 2013, pp. 17–21.
[2] S. Palermo, ECEN 720 Lecture 6, Texas A&M University.
[3] B. Razavi, “Challenges in the Design of High-Speed Clock and Data Recovery Circuits,” IEEE Communications Magazine, pp. 94–101, August 2002.
[4] Jaeha Kim, Brian S. Leibowitz, Jihong Ren, Christopher J. Madden, "Simulation and Analysis of Random Decision Errors in Clocked Comparators," IEEE Trans. Circuits and Systems-I, pp. 1844–1857, Aug. 2009.
[5] A. Arakali, S. Gondi, and P. K. Hanumolu, “Analysis and Design Techniques for Supply-Noise Mitigation in Phase-Locked Loops,” Circuits Syst. I Regul. Pap. IEEE Trans., vol. 57, no. 11, pp. 2880–2889, Nov. 2010.
[6] W. Yin, R. Inti, A. Elshazly, N. Sasidhar, and P. K. Hanumolu, “A TDC-Less 7 mW 2.5 Gb/s Digital CDR With Linear Loop Dynamics and Offset-Free Data Recovery,” IEEE J. Solid-State Circuits, vol. 46, no. 12, pp. 3163–3173, 2011.
[7] M. Hsieh, “Architectures for Multi-Gigabit Wire-Linked Clock and Data Recovery,” IEEE Circuits and Systems Magazine, pp. 45–57, September 2008.
[8] Pavan Kumar Hanumolu, Gu-Yeon Wei, and Un-Ku Moon, “A Wide-Tracking Range Clock and Data,” Solid-State Circuits, IEEE J., vol. 43, no. 2, pp. 425–439, 2008.
[9] K. Kundert, “Verification of Bit-Error Rate in Bang-Bang Clock and Data Recovery Circuits,” Designer’s Guide Publishing, 2010. Available from www.designersguide.org/Analysis.
39
Acknowledgment
• Prof. Pavan K. Hanumolu
• Prof. José E. Schutt-Ainé
• Rishi Ratan
40