Design of Serial Link Transceivers Guanghua Shu Advisor: Pavan Kumar Hanumolu University of Illinois at Urbana-Champaign 2014-04-14

Source: jsa.ece.illinois.edu/ece546_2014/guanghua.pdf


Design of Serial Link Transceivers

Guanghua Shu

Advisor: Pavan Kumar Hanumolu

University of Illinois at Urbana-Champaign

2014-04-14


Outline

• Serial link transceiver overview

• Transmitter (brief)

– Output driver

• Receiver

– Clock and data recovery (focus)

– Equalizer (brief)


Overview of serial link systems

• Optical links (OC-1, OC-192, ...)

[Figure: example serial link products; image credits: Broadcom, Intel, Finisar]


Serial link transceiver (SerDes)

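• Transmitter (Tx)

– Serializes the parallel input into a 1-bit data stream and transmits it

– TxPLL provides clocks for the serializer

• Receiver (Rx)

– CDR recovers clock and data

– Deserializes the 1-bit data stream into parallel data

[Figure: SerDes block diagram. Transmitter: serializer clocked by CKTX from the TxPLL, Tx equalizer, and driver sending DTX (100101) into the channel, whose response is |Hch(ω)|. Receiver: Rx equalizer with response |H(ω)|, DFE, and a D flip-flop clocked by CKRX from the clock recovery block, producing DRX (100101).]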


Serial links in SoC

• About 50 SerDes IPs

– Single IP power & area

– Integrated with SoC

– Portable with SoC

[Figure: SoC die with the SerDes, MI/O SerDes, and PCIe Gen3 SerDes blocks labeled] [J. Hart, ISSCC13, Oracle]


Electrical link standards

• SoCs cover multiple standards

– Wide-range serial link transceivers

– Shorter design cycles to reduce time to market

[Figure: electrical link standards shown: USB 1.0, USB 2.0, USB 3.0, IEEE 1394, SATA 1.0, SATA 2.0, SATA 3.0, PCIe 1.0, PCIe 2.0, PCIe 3.0, PCIe 4.0 (expected 2015), GDDR4, GDDR5, 10G Ethernet]


Link design tradeoff & wish list

• Low power

• Small area

• Low jitter (good BER)

• Fully integrated

• Easily portable

• Wide range

[Figure: link design tradeoffs among power, technology, jitter, bit error rate, supply voltage, area, and operation range]


Outline

• Serial link transceiver overview

• Transmitter

– Output driver

• Receiver

– Clock and data recovery


Transmitter signaling: NRZ or RZ

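[Figure: NRZ and RZ waveforms for the bit sequence 1 0 0 1, and their power spectra S_NRZ(f) and S_RZ(f) versus f, with the frequency axes marked at 1/Tb, 2/Tb, and 3/Tb]

1 unit interval (UI) = Tb seconds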


Speed: FO4

• Speed increases as technology scales and CMOS circuit delays scale proportionally

• FO4 delay = delay of an inverter driving a fanout of 4

• tFO4 is an empirical technology speed limit for high-speed serial links (HSSL)

tFO4 ≈ 30 ps in 90 nm CMOS
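To get a feel for the FO4 metric, the short sketch below expresses the unit interval of a few data rates in FO4 delays. It uses only the 30 ps figure quoted on the slide; the data rates and the function name `bit_time_in_fo4` are illustrative assumptions, not values from the talk.

```python
# FO4 delay quoted on the slide for 90 nm CMOS, in picoseconds.
T_FO4_PS = 30.0


def bit_time_in_fo4(data_rate_gbps, t_fo4_ps=T_FO4_PS):
    """Return the unit interval (one bit time) expressed in FO4 delays."""
    ui_ps = 1000.0 / data_rate_gbps  # 1 Gb/s corresponds to 1000 ps per bit
    return ui_ps / t_fo4_ps


# Illustrative data rates (assumed, not from the slides).
for rate in (2.5, 5.0, 10.0):
    ui_ps = 1000.0 / rate
    print(f"{rate:5.1f} Gb/s: UI = {ui_ps:6.1f} ps = {bit_time_in_fo4(rate):5.2f} FO4")
```

The fewer FO4 delays fit in one UI, the less logic can run at the full bit rate, which is one motivation for the half-rate and quarter-rate clocking discussed later.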


Transmitter (TX driver)

• Match the TX output impedance to 50 Ω to minimize reflections.

• Two typical types of Tx driver (details later)

– Current mode (CM) drivers

– Voltage mode (VM) drivers

[S. Palermo, ECEN720, TAMU]


Current mode driver

• Widely used in high-performance serial links.

• Differential peak-to-peak voltage swing: Vswpp = I·R

[S. Palermo, ECEN720, TAMU]


Voltage mode driver

• CMOS structure is used for high-swing applications

• NMOS structure is used for low-swing applications

[S. Palermo, ECEN720, TAMU]


Voltage mode vs. current mode

• CM uses a Norton-equivalent parallel termination, so the output impedance is easier to control.

• VM uses a Thevenin-equivalent series termination, so it needs only ½ to ¼ of the current of a CM driver to achieve the same output swing.

[S. Palermo, ECEN720, TAMU]
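The current saving can be illustrated with a small calculation. This is a sketch under idealized assumptions that are not spelled out on the slides: a differential link with Z0 = 50 Ω, the CM relation Vswpp = I·R from the earlier slide with R = 50 Ω, and a VM driver whose supply drives a series path of Z0 + 2·Z0 + Z0 (source termination, differential receiver termination, source termination), so that the differential peak-to-peak swing equals the supply. Function names are hypothetical.

```python
Z0_OHMS = 50.0  # assumed characteristic impedance


def cm_driver_current(v_swpp, r_ohms=Z0_OHMS):
    """Current-mode driver: V_swpp = I * R, so I = V_swpp / R."""
    return v_swpp / r_ohms


def vm_driver_current(v_swpp, z0_ohms=Z0_OHMS):
    """Idealized differential voltage-mode driver.

    The supply (equal to V_swpp in this model) drives z0 + 2*z0 + z0 in
    series, so I = V_swpp / (4 * z0).
    """
    return v_swpp / (4.0 * z0_ohms)


v_swpp = 0.5  # volts, illustrative value
i_cm = cm_driver_current(v_swpp)
i_vm = vm_driver_current(v_swpp)
print(f"CM: {i_cm * 1e3:.2f} mA, VM: {i_vm * 1e3:.2f} mA, ratio = {i_vm / i_cm:.2f}")
```

In this differential case the ratio comes out at ¼, the lower end of the range quoted above.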


Outline

• Serial link transceiver overview

• Transmitter

– Output driver

• Receiver

– Clock and data recovery

– Equalizer (brief)


Link classification (1)

• Based on how the receiver clock is generated


Link classification (2)

• Number of data bits per clock cycle

• Need multiple phases for half rate and quarter rate

[Figure: DIN and CK timing diagrams for full-rate, half-rate, and quarter-rate clocking]


Timing recovery

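[Figure: timing recovery schemes [Razavi, 2002]. (a) Simple CDR circuit: a D flip-flop samples DIN with CK at the optimum sampling point. (b) Forwarded clock: CKIN passes through a de-skew block to clock the D flip-flop that samples DIN into DRX. (c) Embedded clock: a timing recovery block extracts CK from DIN to clock the D flip-flop that produces DRX.]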


Classical analog CDR

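$$\frac{\Phi_{CKRX}(s)}{\Phi_{DIN}(s)} = \frac{LG(s)}{1 + LG(s)} = \frac{K_{PD} K_{VCO} (sRC + 1)}{s^2 C + s K_{PD} K_{VCO} RC + K_{PD} K_{VCO}}$$

$$\omega_z = -\frac{1}{RC}, \qquad \omega_{pL} \approx -\frac{1}{RC}, \qquad \omega_{pH} \approx -K_{PD} K_{VCO} R$$

[Figure: analog CDR (PD with UP/DN outputs, loop filter LF(s) built from R and C, VCO; input DIN, outputs DRX and CKRX) and its linear phase model: phase error ΦE from ΦDIN and ΦCKRX, PD gain KPD producing IE, loop filter R + 1/(sC), and VCO KVCO/s. The plot of |ΦCKRX/ΦDIN| versus ω marks ωz, ωpL, and ωpH.]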

• 3dB bandwidth close to ωpH
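The bandwidth claim can be checked numerically from the closed-loop transfer function of this second-order model. The sketch below is only illustrative: the component values (K_PD, K_VCO, R, C) are hypothetical, chosen so that ωz is well below ωpH, and the function names are not from the slides.

```python
import math


def jitter_transfer(w, k, r, c):
    """|Phi_CKRX / Phi_DIN| at angular frequency w, with k = K_PD * K_VCO."""
    s = 1j * w
    return abs(k * (s * r * c + 1) / (s * s * c + s * k * r * c + k))


def bandwidth_3db(k, r, c):
    """First frequency (rad/s) where the magnitude falls below 1/sqrt(2).

    Uses a simple logarithmic sweep with 0.1 % steps.
    """
    w = 1e3
    while jitter_transfer(w, k, r, c) > 1.0 / math.sqrt(2.0):
        w *= 1.001
    return w


# Hypothetical values, for illustration only.
K_PD = 50e-6                   # A/rad
K_VCO = 2 * math.pi * 500e6    # rad/s/V
R = 2e3                        # ohms
C = 1e-9                       # farads

k = K_PD * K_VCO
w_3db = bandwidth_3db(k, R, C)
w_ph = k * R  # magnitude of the high-frequency pole, K_PD * K_VCO * R
print(f"3 dB bandwidth = {w_3db:.3e} rad/s, |w_pH| = {w_ph:.3e} rad/s")
```

With these numbers the two frequencies agree to within a fraction of a percent, since the zero and the low-frequency pole nearly cancel.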


Performance metrics for CDR

• Jitter transfer (JTRAN)

• Jitter tolerance (JTOL)

• Jitter generation (JGEN)

[Figure: JTRAN curve (with peaking) and JTOL curve]

[K. Kundert, 2010]


Hogge phase detector

• Phase difference detection

• Data transition detection

[Figure: Hogge phase detector circuit and (b) timing waveforms for CK, A, and X]

[Razavi, 2002]


Sense amplifier

[J. Kim 2009]

• 4 operating phases: reset, sampling, regeneration, and decision


Flip-flop (SA+RS latch)

• RS latch holds data during SA reset

• Check reference for improved version

[Nikolic 2000]

Truth table for RS latch

Sb  Rb  Q     Qb
0   0   1     1
0   1   1     0
1   0   0     1
1   1   Hold  Hold
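The truth table matches a cross-coupled NAND latch with active-low inputs. The sketch below assumes that structure (the slide does not show the gate-level circuit here) and iterates the two gates until the outputs settle; the function names are illustrative.

```python
def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def rs_latch(sb, rb, q, qb):
    """Settle a cross-coupled NAND latch starting from state (q, qb).

    sb and rb are active-low set and reset inputs.
    """
    for _ in range(4):  # a few passes are enough to reach a stable state
        q_next = nand(sb, qb)
        qb_next = nand(rb, q_next)
        if (q_next, qb_next) == (q, qb):
            break
        q, qb = q_next, qb_next
    return q, qb


# Reproduce the truth table, starting from a stored 1 (q=1, qb=0).
for sb, rb in ((0, 0), (0, 1), (1, 0), (1, 1)):
    print(sb, rb, rs_latch(sb, rb, 1, 0))
```

When the sense amplifier is in reset, both of its outputs sit high (Sb = Rb = 1), which is the hold row of the table.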


Charge pump & loop filter

[Arakali 2009]

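• Choose R and IUP (IDN) based on noise and bandwidth (typical: R of several kΩ, IUP (IDN) of tens of µA)

• Higher-order filtering can be added to the loop filter

[Figure: charge pump and loop filter. The UP and DN signals switch currents IUP and IDN (bias IB) onto the control node VC, which is loaded by R and C1.]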


Area issue in analog CDR

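Small peaking → large C (nF) → large area

$$\frac{\Phi_{CKRX}(s)}{\Phi_{DIN}(s)} = \frac{LG(s)}{1 + LG(s)} = \frac{K_{PD} K_{VCO} (sRC + 1)}{s^2 C + s K_{PD} K_{VCO} RC + K_{PD} K_{VCO}}$$

$$\omega_z = -\frac{1}{RC}, \qquad \omega_{pL} \approx -\frac{1}{RC}, \qquad \omega_{pH} \approx -K_{PD} K_{VCO} R$$

[Figure: |ΦCKRX/ΦDIN| versus ω showing peaking, with ωz, ωpL, and ωpH marked; analog CDR (PD with UP/DN outputs, loop filter LF(s) built from R and C, VCO) and its linear phase model with KPD, IE, R + 1/(sC), and KVCO/s.]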
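The link between peaking and capacitor size can be seen by sweeping C in the closed-loop transfer function of this model. The sketch below uses hypothetical values for K_PD, K_VCO, and R; only the trend matters, not the specific numbers.

```python
import math


def jitter_transfer(w, k, r, c):
    """|Phi_CKRX / Phi_DIN| at angular frequency w, with k = K_PD * K_VCO."""
    s = 1j * w
    return abs(k * (s * r * c + 1) / (s * s * c + s * k * r * c + k))


def peaking_db(k, r, c):
    """Largest jitter-transfer magnitude found on a log sweep, in dB."""
    w = 0.01 / (r * c)   # start two decades below the zero
    w_stop = 10.0 * k * r  # stop one decade above the high-frequency pole
    peak = 0.0
    while w < w_stop:
        peak = max(peak, jitter_transfer(w, k, r, c))
        w *= 1.01
    return 20.0 * math.log10(peak)


# Hypothetical loop parameters, for illustration only.
k = 50e-6 * 2 * math.pi * 500e6  # K_PD * K_VCO
R = 2e3                          # ohms

peaking = {c: peaking_db(k, R, c) for c in (10e-12, 100e-12, 1e-9)}
for c, p in peaking.items():
    print(f"C = {c * 1e12:7.0f} pF -> peaking = {p:.3f} dB")
```

Each tenfold increase in C pushes the zero further below the loop bandwidth and reduces the peaking, which is why low-peaking analog designs end up with nF-range capacitors.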


Analog CDR to digital CDR

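• The large capacitor is replaced by a digital accumulator

• PVT insensitive; loop dynamics are easy to configure

[Figure: analog CDR versus digital CDR. Analog CDR: analog phase detector (APD) with UP/DN outputs, error current IE, loop filter LF(s) built from R and C, and VCO; input DIN, outputs DRX and CKRX. Digital CDR: digital phase detector (DPD) producing DE = +1, 0, −1, digital loop filter LF(z) with a proportional path (KP) and an integral path (KI followed by an accumulator ACC with z^-1), and DCO producing CKRX.]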
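A minimal sketch of the digital loop filter follows, assuming the proportional-integral form suggested by the figure labels: y[n] = KP·e[n] + acc[n] with acc[n] = acc[n−1] + KI·e[n]. The exact placement of the z^-1 delay and the gain values are assumptions, not taken from the slides.

```python
def digital_loop_filter(pd_out, kp, ki):
    """Proportional-integral digital loop filter.

    pd_out: sequence of phase detector decisions (+1, 0, -1).
    Returns the control word sequence y[n] = kp * e[n] + acc[n].
    """
    acc = 0.0  # accumulator state, plays the role of the analog capacitor
    out = []
    for e in pd_out:
        acc += ki * e              # integral path
        out.append(kp * e + acc)   # proportional path added on top
    return out


# Illustrative gains and input pattern (assumed).
control = digital_loop_filter([1, 1, -1, 0], kp=4, ki=1)
print(control)
```

Because KP and KI are just digital constants, the loop dynamics can be changed by rewriting two registers rather than resizing R and C.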


Early/Late definition

[Razavi, 2002]

• If S1 ⊕ S2 is high and S2 ⊕ S3 is low, the clock is Early

• If S1 ⊕ S2 is low and S2 ⊕ S3 is high, the clock is Late

• If S1 ⊕ S2 = S2 ⊕ S3, no data transition is present


Alexander phase detector

• Y = S2 ⊕ S3

• X = S1 ⊕ S2

• Also called bang-bang phase detector

[Razavi, 2002]

Truth table (E = Early, L = Late)

X  Y  E          L
0  0  0          0
0  1  0          1
1  0  1          0
1  1  Undefined
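The early/late decision can be written as a few lines of logic. In this sketch the mapping of Early to +1 and Late to −1 is an assumption (the slides only say the detector outputs +1, 0, −1), and the undefined case X = Y = 1 is mapped to 0.

```python
def alexander_pd(s1, s2, s3):
    """Bang-bang phase detector decision from three consecutive samples.

    Returns +1 if the clock is early, -1 if late, 0 otherwise.
    """
    x = s1 ^ s2  # X = S1 xor S2
    y = s2 ^ s3  # Y = S2 xor S3
    if x and not y:
        return +1  # early
    if y and not x:
        return -1  # late
    return 0       # no transition, or the undefined X = Y = 1 case


print(alexander_pd(0, 1, 1))  # early
print(alexander_pd(0, 0, 1))  # late
print(alexander_pd(1, 1, 1))  # no transition
```

Note that the output carries only the sign of the phase error, not its magnitude, which is why the detector is called bang-bang.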


Digitally controlled oscillator

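[Figure: digitally controlled oscillator (DCO). The phase detector output DE (+1, 0, −1) passes through the proportional path (KP) and the integral path (KI, accumulator ACC with z^-1); the resulting word DIN (bus labeled 20) drives a DAC whose output VDCO sets the oscillator frequency, producing CKRX.]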

• Ideal digital to analog converter model is available in Cadence library (ahdlLib/dac_8bit_ideal)


Mitigation of quantization noise

• Power

• Area

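[Figure: DCO without and with a ΔΣ modulator. In both, DE (+1, 0, −1) passes through the KP path and the KI/ACC (z^-1) path to form DIN (bus labeled 20), which reaches a DAC that produces VDCO and CKRX. With the ΔΣ modulator, DIN is reduced to a narrower DOUT (bus labeled 4) before the DAC. The inset shows a 2nd-order ΔΣ modulator built from z^-1 delays with coefficients 2 and −1.]

$$NTF(z) = (1 - z^{-1})^2$$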

[Yin, 2011]
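A second-order noise transfer function of (1 − z^-1)^2 can be obtained with an error-feedback structure. The sketch below is one such implementation in integer arithmetic, not necessarily the circuit used in the cited work; the word widths (a 20-bit input truncated by 16 bits) are assumptions based on the bus labels in the figure.

```python
def delta_sigma_2nd_order(codes, drop_bits):
    """Error-feedback delta-sigma modulator that truncates drop_bits LSBs.

    With e[n] the truncation error, the output satisfies
    v[n] = x[n] + e[n] - 2*e[n-1] + e[n-2], i.e. NTF = (1 - z^-1)^2.
    Returns the coarse output codes (in units of 2**drop_bits).
    """
    step = 1 << drop_bits
    e1 = e2 = 0  # e[n-1] and e[n-2]
    out = []
    for x in codes:
        w = x - 2 * e1 + e2     # add shaped past errors to the input
        v = w >> drop_bits      # truncate (floor) to the coarse code
        e = v * step - w        # truncation error, in (-step, 0]
        out.append(v)
        e2, e1 = e1, e
    return out


# Constant 20-bit input word (assumed value), truncated to a coarse code.
x_in = 123457
coarse = delta_sigma_2nd_order([x_in] * 1000, drop_bits=16)
average = sum(coarse) * (1 << 16) / len(coarse)
print(f"input = {x_in}, average of coarse output = {average:.1f}")
```

The coarse output toggles between neighboring codes so that its average tracks the fine input, while the truncation error is pushed to high frequencies where the DCO filters it.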


Other CDR architectures

• VCO-based single loop

• PI-based dual loop

• Open loop

• …

• Check reference [7] for more


Dual loop CDR

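[Figure: dual loop CDR. PLL: PFD, LF, VCO, and ÷N feedback divider. CDR: BBPD, DLF, and phase interpolator (ΦINT) fed by the PLL output phases; signals DIN, D, and CK.]

[Hanumolu, 2008]

• Cascade of a PLL and a CDR loop

• PLL provides frequency lock

• CDR loop provides phase lock

• Phase interpolator (ΦINT) interpolates between input phases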
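Phase interpolation can be illustrated with an idealized model: two quadrature sinusoidal clocks are summed with weights (1 − α) and α. This is an assumption made for the sketch (real interpolators work on finite-slew clock edges and may use more input phases); the function name is hypothetical.

```python
import math


def interpolated_phase_deg(alpha):
    """Phase of (1 - alpha) * sin(wt) + alpha * cos(wt), in degrees.

    The weighted sum equals A * sin(wt + phi) with
    phi = atan2(alpha, 1 - alpha).
    """
    return math.degrees(math.atan2(alpha, 1.0 - alpha))


# Sweep the interpolation weight from one input phase to the other.
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"alpha = {alpha:.2f} -> phase = {interpolated_phase_deg(alpha):6.2f} deg")
```

In this model the output phase moves from 0° to 90° as α goes from 0 to 1, but not in perfectly equal steps, which hints at the nonlinearity practical interpolators have to manage.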


Outline

• Serial link transceiver overview

• Transmitter (brief)

– Output driver

• Receiver

– Clock and data recovery

– Equalizer (brief)


Why equalizer?

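[Figure: a channel with response |Hch(ω)| followed by an equalizer with response |Heq(ω)|, each plotted against ω with a 0 dB reference; 200 ps waveforms are shown at Vin, Vout, and Veq]

• Equalization can be applied on both the Tx side and the Rx side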


TX equalizers

[Stauffer, 2008]

• Typical channel response with TX equalization in the form of a pre-emphasis FFE (feed-forward equalizer)

• 3-tap FIR FFE architecture:

a. Taps are selected to generate a transfer function that is the inverse of the channel transfer function

b. Peak power constraint
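A 3-tap FIR FFE under a peak power constraint can be sketched as follows. The tap values are hypothetical, and the constraint is modeled here as the sum of absolute tap weights being equal to 1, so the output never exceeds the full swing; the slides do not give the exact formulation.

```python
def normalize_taps(taps):
    """Scale taps so that sum(|c_i|) = 1 (peak output swing of 1)."""
    total = sum(abs(t) for t in taps)
    return [t / total for t in taps]


def ffe(bits, taps):
    """Apply an FIR filter to bits mapped to +1/-1 symbols.

    y[n] = sum_k taps[k] * d[n - k]; samples before the start are zero.
    """
    symbols = [1.0 if b else -1.0 for b in bits]
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, c in enumerate(taps):
            if n - k >= 0:
                acc += c * symbols[n - k]
        out.append(acc)
    return out


# Hypothetical pre-cursor, main, and post-cursor taps.
taps = normalize_taps([-0.1, 0.7, -0.2])
y = ffe([1] * 6 + [0] * 6, taps)
print([round(v, 2) for v in y])
```

Long runs of identical bits settle to a reduced level (the sum of the taps), while transitions are sent at close to full swing; this is the high-frequency emphasis that counteracts the channel loss.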


Rx equalizers

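• CT (continuous time): difficult to get enough peaking at high frequency

• DT (discrete time): timing is stringent at high data rates (loop unrolling helps)

• Configurable across standards

(a) Continuous time linear equalizer (CTLE)

[Figure: differential pair with inputs VIN and VIP, loads RL and CL, source degeneration 2RZ and CZ/2, and output VO. The |H(ω)| plot shows the DC gain ADC, a zero at 1/(RZ CZ), and poles at (1 + gm RZ)/(RZ CZ) and 1/(RL CL).]

(b) Decision feedback equalizer (DFE)

[Figure: summer Σ followed by a D flip-flop clocked by CK; the decision DRX is fed back to the summer with weight α. Timing constraint: $T_{setup} + T_{CKQ} + T_{FB} < 1\,\mathrm{UI}$.]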
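The CTLE response can be evaluated from the zero and pole locations given in the figure. In the sketch below the DC gain g_m·R_L/(1 + g_m·R_Z) is the standard result for a source-degenerated differential pair and is assumed here rather than read from the slide; all component values are hypothetical.

```python
import math


def ctle_gain(f_hz, gm, r_l, c_l, r_z, c_z):
    """Magnitude of the CTLE transfer function at frequency f_hz."""
    s = 2j * math.pi * f_hz
    w_z = 1.0 / (r_z * c_z)                # zero
    w_p1 = (1.0 + gm * r_z) / (r_z * c_z)  # first pole
    w_p2 = 1.0 / (r_l * c_l)               # output pole
    a_dc = gm * r_l / (1.0 + gm * r_z)     # assumed DC gain expression
    return abs(a_dc * (1 + s / w_z) / ((1 + s / w_p1) * (1 + s / w_p2)))


# Hypothetical component values, for illustration only.
params = dict(gm=10e-3, r_l=200.0, c_l=50e-15, r_z=200.0, c_z=500e-15)

dc_gain = ctle_gain(0.0, **params)
hf_gain = ctle_gain(5e9, **params)
print(f"DC gain = {dc_gain:.3f}, gain at 5 GHz = {hf_gain:.3f}")
print(f"peaking = {20 * math.log10(hf_gain / dc_gain):.2f} dB")
```

The peaking comes from lowering the DC gain through degeneration rather than from raising the high-frequency gain, which is limited by g_m·R_L and the output pole.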


Multilevel signaling

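[Figure: PAM2 (NRZ) with levels 0 and 1 versus PAM4 with levels 00, 01, 11, 10; channel responses |H(ω)| in dB (0 to −60 dB) versus Nyquist frequency (0 to 4 GHz) for Ch. 1 and Ch. 2, with the Nyquist frequencies for 6.4 Gb/s PAM2 and 6.4 Gb/s PAM4 marked]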

• PAM4 halves the Nyquist frequency, but also reduces the signal amplitude to 1/3 (20log10(3) ≈ 9.54 dB)

• If the channel loss differs by more than 9.54 dB between the two Nyquist frequencies, PAM4 can be beneficial
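The rule of thumb above reduces to a one-line comparison. In the sketch below the loss figures passed to the function are hypothetical channel measurements, and the function name is illustrative.

```python
import math

# Amplitude penalty of PAM4 relative to PAM2: eye height drops to 1/3.
PAM4_PENALTY_DB = 20.0 * math.log10(3.0)


def pam4_is_beneficial(loss_at_pam2_nyquist_db, loss_at_pam4_nyquist_db):
    """True if halving the Nyquist frequency saves more loss than PAM4 costs.

    Both arguments are channel losses in dB (positive numbers).
    """
    saved_db = loss_at_pam2_nyquist_db - loss_at_pam4_nyquist_db
    return saved_db > PAM4_PENALTY_DB


print(f"PAM4 penalty = {PAM4_PENALTY_DB:.2f} dB")
# At 6.4 Gb/s the Nyquist frequencies are 3.2 GHz (PAM2) and 1.6 GHz (PAM4).
print(pam4_is_beneficial(30.0, 15.0))  # steep channel (assumed losses)
print(pam4_is_beneficial(20.0, 14.0))  # gentler channel (assumed losses)
```

This comparison covers only the amplitude budget; it ignores other PAM4 costs such as the more complex transmitter and receiver.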


Summary

• Serial link transceiver overview

• Transmitter (brief)

– Output driver

• Receiver

– Clock and data recovery (focus)

– Equalizer (brief)


References

[1] J. Hart, S. Butler, et al., "3.6GHz 16-core SPARC SoC processor in 28nm," in IEEE ISSCC Dig. Tech. Papers, Feb. 2013, pp. 17–21.

[2] S. Palermo, ECEN 720 Lecture 6, Texas A&M University.

[3] B. Razavi, "Challenges in the Design of High-Speed Clock and Data Recovery Circuits," IEEE Communications Magazine, pp. 94–101, Aug. 2002.

[4] J. Kim, B. S. Leibowitz, J. Ren, and C. J. Madden, "Simulation and Analysis of Random Decision Errors in Clocked Comparators," IEEE Trans. Circuits and Systems-I, pp. 1844–1857, Aug. 2009.

[5] A. Arakali, S. Gondi, and P. K. Hanumolu, "Analysis and Design Techniques for Supply-Noise Mitigation in Phase-Locked Loops," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 11, pp. 2880–2889, Nov. 2010.

[6] W. Yin, R. Inti, A. Elshazly, N. Sasidhar, and P. K. Hanumolu, "A TDC-Less 7 mW 2.5 Gb/s Digital CDR With Linear Loop Dynamics and Offset-Free Data Recovery," IEEE J. Solid-State Circuits, vol. 46, no. 12, pp. 3163–3173, 2011.

[7] M. Hsieh, "Architectures for Multi-Gigabit Wire-Linked Clock and Data Recovery," IEEE Circuits and Systems Magazine, pp. 45–57, Sep. 2008.

[8] P. K. Hanumolu, G.-Y. Wei, and U.-K. Moon, "A Wide-Tracking Range Clock and Data," IEEE J. Solid-State Circuits, vol. 43, no. 2, pp. 425–439, 2008.

[9] K. Kundert, "Verification of Bit-Error Rate in Bang-Bang Clock and Data Recovery Circuits," Designer's Guide Publishing, 2010. Available from www.designersguide.org/Analysis.


Acknowledgment

• Prof. Pavan K. Hanumolu

• Prof. José E. Schutt-Ainé

• Rishi Ratan
