

A 20Gb/s Dual-Mode PAM4/NRZ Single-Ended Transmitter with RLM Compensation

Changho Hyun, Hyeongjun Ko, Joo-Hyung Chae, Hyunkyu Park and Suhwan Kim

Department of Electrical and Computer Engineering, Seoul National University

Seoul 08826, South Korea

E-mails: [email protected], [email protected]

Abstract—In this paper, a 20Gb/s dual-mode four-level pulse amplitude modulation (PAM4)/non-return-to-zero (NRZ) single-ended voltage-mode transmitter is proposed. Its output drivers are composed of 60 basic source-series terminated (SST) driver units and 12 additional pull-up (PU) driver units. The additional PU driver units are used to reduce the eye height difference between the four amplitude levels of PAM4. Implemented in 65nm CMOS technology, the active area of the transmitter is 0.06mm², including the clock buffer and IQ generator. It draws 61.5mW at 20Gb/s during PAM4 operation and 72mW at 10Gb/s during NRZ operation.

Keywords—single-ended voltage-mode transmitter; four-level pulse amplitude modulation (PAM4); non-return-to-zero (NRZ); level separation mismatch ratio (RLM); dual-mode transmitter

I. INTRODUCTION

The demand for high-bandwidth graphics DRAM is constantly increasing, and recent devices need to transfer data at speeds greater than the 16Gb/s/pin achieved by GDDR6 memory [1].

Data-transfer speeds are limited by losses and discontinuities in the channel, which cause inter-symbol interference (ISI). In addition, the continuing trend towards reduced supply voltage makes it more difficult to increase data-transfer speeds when using non-return-to-zero (NRZ) signaling. Multi-level signaling techniques, such as duobinary or four-level pulse amplitude modulation (PAM4), are therefore one way to address this problem. In particular, PAM4 is a promising option because two bits are transmitted in every symbol, providing twice the data rate of NRZ signaling at the same clock speed [2].

However, PAM4 has only 1/3 the eye height of NRZ, causing 9.5dB of signal-to-noise-ratio (SNR) attenuation [2]. The performance of PAM4 depends on the worst eye height, so it is important to compensate for differences in eye height [3]. The quality of a PAM4 signal can be quantified by the level separation mismatch ratio (RLM). Several techniques have been applied to compensate for RLM [4-6]. However, they may not be appropriate for memory interfaces, which restrict the number of pins, cost, and power consumption, because they adopt a differential structure or a current-mode driver [7].
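The 9.5dB figure is consistent with expressing the 1/3 eye-height ratio as an amplitude ratio in decibels. The following is a minimal sketch of that arithmetic, assuming the same peak-to-peak swing and the same noise for both modulations; the function name `pam_snr_penalty_db` is illustrative and does not come from the paper.

```python
import math


def pam_snr_penalty_db(levels: int) -> float:
    """SNR penalty of PAM-N relative to NRZ for the same peak-to-peak swing.

    Assumption: the swing is split evenly into (levels - 1) eyes and the
    noise is unchanged, so the penalty is 20*log10(levels - 1).
    """
    if levels < 2:
        raise ValueError("a PAM signal needs at least two levels")
    return 20.0 * math.log10(levels - 1)


penalty = pam_snr_penalty_db(4)
print(f"PAM4 penalty vs. NRZ: {penalty:.2f} dB")  # about 9.54 dB
```

Under these assumptions NRZ (two levels) gives a 0dB penalty by construction, and PAM4 gives roughly the 9.5dB quoted above.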

To support various channels and memory standards, we present a dual-mode PAM4/NRZ single-ended voltage-mode transmitter. Its output drivers are composed of basic source-series terminated (SST) driver units and additional pull-up (PU) driver units. The PU driver units of our output driver are used to alleviate the eye height difference between the four amplitude levels of PAM4, improving the performance of PAM4 signaling.

II. LEVEL SEPARATION MISMATCH RATIO

RLM is an important factor in PAM4 signaling. It is determined by the smallest gap between the four signal levels. If the gaps are A, B, and C, then the RLM is defined as in Fig. 1. When the differences between levels are similar, the RLM approaches 1 and the signal characteristics of PAM4 are improved. If there is no level compensation, RLM is much smaller than 1, as shown in Fig. 1(a), but RLM becomes close to 1, as shown in Fig. 1(b), when level compensation is applied.

This research was supported by the MOTIE (Ministry of Trade, Industry & Energy) (10080570) and KSRC (Korea Semiconductor Research Consortium) support program for the development of the future semiconductor device.

Fig. 1. Concept of a level separation mismatch ratio: (a) without level compensation, (b) with level compensation.

Fig. 2. Basic driver unit for PAM4 operation: (a) driver structure, (b) driver structure when D1=1, D0=0, (c) driver structure when D1=0, D0=1.

978-1-7281-0397-6/19/$31.00 ©2019 IEEE


In a memory interface, reference impedance (ZQ) calibration is performed for NRZ signaling during the training sequence. The ZQ calibration sets the termination impedance at the driver to match the channel, reducing reflections caused by impedance mismatch [7-8]. However, if ZQ calibration for the PAM4 signal is performed in the same way as for the NRZ signal, the upper eye becomes larger than the lower eye in a dual-mode transmitter.

Fig. 2(a) shows the basic structure of the PAM4 driver, which consists of a most significant bit (MSB) driver and a least significant bit (LSB) driver. Fig. 2(b) shows what happens when the data D1=1 and D0=0 are applied. The resistance of the MSB driver drops below 75 ohms while that of the LSB exceeds 150 ohms. Then, the output voltage becomes lower than 1/6 of VDDQ. Similarly, when D1=0 and D0=1, the output voltage becomes lower than 1/3 of VDDQ as shown in Fig. 2(c). Consequently, the RLM of the PAM4 signal is distorted. Introducing additional drivers reduces this level mismatch even if ZQ calibration is done for an NRZ signal.

In the memory interface, a training operation is performed between the controller and the memory. The memory transmits the PAM4 signal, and the controller receives the signal passed through the channel. Then, the controller checks the RLM and can send a feedback signal for RLM compensation to the memory. Therefore, efficient RLM compensation can be performed by using additional drivers through communication with a memory controller.

III. DUAL-MODE PAM4/NRZ TRANSMITTER ARCHITECTURE

A block diagram of the dual-mode PAM4/NRZ transmitter with additional drivers to improve the RLM is shown in Fig. 3. An on-chip pseudo-random bit sequence (PRBS) generator creates 32-bit parallel data, which are sent to the 32:4 serializer (SER). A mode selector, controlled by a mode selection signal, then determines whether the data should be treated as a PAM4 signal or an NRZ signal. The signal then passes to a retimer and an encoder, which contains an equalization selector and logic to apply PAM4 level compensation. The driver consists of 60 basic segments and 12 additional segments, which include four 4:1 multiplexers (MUXs), a 4:1 SER, and an SST driver with a shared resistor.

A. Internal Clock Path

The internal clock path is depicted in Fig. 4. Two external clock signals, CLKP and CLKN, pass through the clock buffer, the IQ generator, and inverter chains to generate clock signals with a 90° phase difference. MOS capacitors for delay control compensate for the skew among the four clock phases.

B. 32:4 Serializer, Encoder, and Retimer

The 32:4 serializer has three internal stages, which perform 32:16, 16:8, and 8:4 serialization as shown in Fig. 3. Each stage is made up of 2:1 SERs. In the case of PAM4 mode, 8 parallel data signals are sent to the encoder side; in NRZ mode, 4 parallel data signals are transmitted.
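The three-stage structure can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the lane pairing (lane k is interleaved with lane k + N/2) and the names `ser_2to1_stage` and `serialize_32_to_4` are hypothetical and are not taken from the paper, which does not specify the bit ordering.

```python
from typing import List

Lane = List[int]


def ser_2to1_stage(lanes: List[Lane]) -> List[Lane]:
    """One stage of 2:1 SERs: halves the lane count and doubles the lane rate."""
    if len(lanes) % 2 != 0:
        raise ValueError("lane count must be even")
    half = len(lanes) // 2
    merged: List[Lane] = []
    for k in range(half):
        out: Lane = []
        # Assumed pairing: lane k is time-interleaved with lane k + half.
        for first, second in zip(lanes[k], lanes[k + half]):
            out.extend((first, second))
        merged.append(out)
    return merged


def serialize_32_to_4(word: List[int]) -> List[Lane]:
    """Apply the 32:16, 16:8, and 8:4 stages to one 32-entry parallel word."""
    if len(word) != 32:
        raise ValueError("expected a 32-entry word")
    lanes = [[value] for value in word]  # one sample per input lane
    for _ in range(3):
        lanes = ser_2to1_stage(lanes)
    return lanes


# Indices are used instead of bits so that the resulting order is visible.
lanes = serialize_32_to_4(list(range(32)))
for k, lane in enumerate(lanes):
    print(k, lane)
```

With this pairing, output lane k carries entries k, k+4, k+8, and so on, so a subsequent 4:1 stage that cycles through lanes 0 to 3 would restore the original order.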

A retimer aligns both the PAM4 and NRZ data signals using one of the 4-phase clock signals from the internal clock path. After the data signals are aligned with one clock signal, four parallel data signals, each delayed by one unit interval (UI), can be generated using all four phases. The data passed through the retimer are transmitted to the 4:1 MUX via the encoder, and the feed-forward equalization (FFE) setting is determined by control signals in the encoder.
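As an illustration of what a 2-tap FFE does to the transmitted symbols, the sketch below applies a main tap and one additional tap to a symbol sequence. It is a behavioral model only: the assumption that the second tap is a post-cursor tap, the normalization of symbols to the range -1 to +1, and the name `ffe_2tap` are all hypothetical. The coefficients 0.8 and -0.2 are the values reported for the PAM4 measurements in Section IV.

```python
from typing import List, Sequence


def ffe_2tap(symbols: Sequence[float], main: float, post: float) -> List[float]:
    """Compute y[n] = main * x[n] + post * x[n-1], taking x[-1] as 0."""
    out: List[float] = []
    prev = 0.0
    for x in symbols:
        out.append(main * x + post * prev)
        prev = x
    return out


# A high-high-low-low pattern: transitions are emphasized, repeats are reduced.
tx = ffe_2tap([1.0, 1.0, -1.0, -1.0], main=0.8, post=-0.2)
print([round(v, 3) for v in tx])
```

In this model a repeated symbol is attenuated relative to the symbol that follows a transition, which is the high-frequency emphasis an FFE provides against channel loss.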

Fig. 3. Dual-mode PAM4/NRZ transmitter architecture.

Fig. 4. Internal clock path.


Level compensation logic is included in the encoder and retimer block, which outputs 0 when D0=0 and D1=1, or when D0=1 and D1=0, and outputs 1 when D0 and D1 are both 0 or both 1.
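The truth table described above corresponds to an XNOR of the two data bits. A minimal sketch follows; the function name `level_comp` is illustrative, and how the resulting signal enables the additional PU drivers (for example, as an active-low control) is not stated in the text and is not modeled here.

```python
def level_comp(d1: int, d0: int) -> int:
    """Return 0 for the two intermediate PAM4 levels and 1 for the outer levels."""
    if d1 not in (0, 1) or d0 not in (0, 1):
        raise ValueError("d1 and d0 must be 0 or 1")
    return 1 if d1 == d0 else 0  # XNOR of MSB and LSB


for d1 in (0, 1):
    for d0 in (0, 1):
        print(f"D1={d1} D0={d0} -> {level_comp(d1, d0)}")
```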

C. Output Driver

The output driver contains 60 basic driver segments, which have the structure depicted in Fig. 3. Each of these segments contains four 4:1 MUXs, a 4:1 SER, and an SST driver with a shared resistor. The PAM4 signal that passes through this driver has levels that are not equally spaced, as shown in Fig. 1(a); this reduces the RLM. We therefore introduce 12 additional PU drivers, arranged in parallel, to raise the two intermediate levels of the PAM4 signal.

Fig. 5 shows the basic SST driver unit and the additional PU driver unit. The basic SST driver unit contains two PMOS transistors and two NMOS transistors to control the termination resistance. In the SST driver unit, controllability can be improved by using a larger number of transistors; however, as the driver becomes larger, the output capacitance increases, which makes high-speed operation difficult. The additional PU driver unit has just two PMOS transistors, so the overall output driver size can be reduced.

IV. EXPERIMENTAL RESULTS

A prototype of the dual-mode PAM4/NRZ transmitter was fabricated in a 65nm CMOS process. Its total active area is 0.06mm², including the internal clock path. The measurement setup and die micrograph are shown in Fig. 6. A single-ended clock signal generated by the clock generator passes through a single-to-differential converter to produce two differential clocks, and 4-phase quarter-rate clock signals are then generated inside the chip. All experiments were performed using a PRBS7 pattern, and output waveforms were viewed and measured on an oscilloscope.

Fig. 7(a) shows an eye diagram for PAM4 operation at 16Gb/s, clocked by a 4GHz clock signal from a clock generator, and Fig. 7(b) shows the PAM4 operation at 20Gb/s with a 5GHz clock signal. The eye at 16Gb/s is larger because the timing margin is greater than it is at 20Gb/s.

Fig. 5. A basic source-series terminated (SST) driver unit with additional pull-up (PU) driver unit.

Fig. 6. Measurement setup and chip micrograph.

Fig. 7. Measured eye diagrams of PAM4 operation at (a) 16Gb/s and (b) 20Gb/s.

Fig. 8. Measured PAM4 eye diagrams at 20Gb/s: (a) without equalization or level compensation, (b) with equalization alone, and (c) with both equalization and level compensation.

Fig. 9. Measured eye diagram of NRZ operation at 10Gb/s.

The effect of the equalizer and level compensation circuit on the eye diagram at 20Gb/s is shown in Fig. 8. Comparing Fig. 8(a) and Fig. 8(b) shows that the eye opening is very small without equalization. In Fig. 8(b), obtained with a 2-tap FFE with tap coefficients of 0.8 and -0.2, the three eye heights are 60.5mV, 58.2mV, and 39.7mV from the top. With level compensation applied and the same tap coefficients, the three eye heights are 56.1mV, 53.4mV, and 54.3mV from the top, as shown in Fig. 8(c). A comparison of Fig. 8(b) and Fig. 8(c) indicates that the level compensation increases the RLM from 0.75 to 0.98. Fig. 9 shows the eye diagram for an NRZ signal at 10Gb/s. The tap coefficients were 0.9 and -0.1 in this experiment, and the eye height is 247.8mV.
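The reported RLM values can be reproduced from the measured eye heights if RLM is taken as three times the smallest gap divided by the sum of the three gaps. That formula is an assumption here, since the paper defines RLM only in Fig. 1, and using eye heights in place of level gaps is a further approximation; the function name `rlm_from_gaps` is illustrative.

```python
from typing import Sequence


def rlm_from_gaps(gaps: Sequence[float]) -> float:
    """Assumed definition: RLM = 3 * min(A, B, C) / (A + B + C)."""
    if len(gaps) != 3 or min(gaps) <= 0:
        raise ValueError("expected three positive gaps")
    return 3.0 * min(gaps) / sum(gaps)


# Eye heights in mV, listed from the top eye to the bottom eye.
eq_only = rlm_from_gaps([60.5, 58.2, 39.7])    # Fig. 8(b): equalization alone
with_comp = rlm_from_gaps([56.1, 53.4, 54.3])  # Fig. 8(c): with level compensation
print(f"{eq_only:.2f} {with_comp:.2f}")  # 0.75 0.98
```

Under this assumed definition, three equal gaps give an RLM of exactly 1, which matches the statement in Section II that RLM approaches 1 as the level differences become similar.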

Our dual-mode transmitter consumes 61.5mW at 20Gb/s in PAM4 operation and 72mW at 10Gb/s in NRZ operation. Fig. 10 shows the power breakdown of the proposed transmitter. The 4:1 MUX and SER account for the largest share, 49%. The remaining power is consumed, in decreasing order, by the output driver, the encoder and retimer, and the clock tree. Table I compares the performance of our transmitter with that of other dual-mode transmitters. Our transmitter is the only one that uses single-ended signaling, because it is designed for memory interfaces.
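For comparing the two modes, the power and data-rate figures above can be converted to energy per bit. This metric is not reported in the paper; the sketch below is plain arithmetic on the stated numbers, and the function name `energy_per_bit_pj` is illustrative.

```python
def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """Energy per bit in pJ; mW divided by Gb/s is numerically pJ/bit."""
    if rate_gbps <= 0:
        raise ValueError("data rate must be positive")
    return power_mw / rate_gbps


pam4_pj = energy_per_bit_pj(61.5, 20.0)  # PAM4 at 20Gb/s
nrz_pj = energy_per_bit_pj(72.0, 10.0)   # NRZ at 10Gb/s
print(f"PAM4: {pam4_pj:.3f} pJ/bit, NRZ: {nrz_pj:.3f} pJ/bit")
```

This gives roughly 3.1pJ/bit in PAM4 mode and 7.2pJ/bit in NRZ mode.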

V. CONCLUSION

We implemented a PAM4/NRZ dual-mode single-ended voltage-mode transmitter with RLM enhancement for memory interfaces in a 65nm CMOS process. RLM can be improved by controlling the two intermediate levels of PAM4 with additional drivers. The transmitter consumes 61.5mW at 20Gb/s during PAM4 operation and 72mW at 10Gb/s during NRZ operation.

REFERENCES

[1] K. D. Hwang, et al., “A 16Gb/s/pin 8Gb GDDR6 DRAM with bandwidth extension techniques for high-speed applications,” IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, San Francisco, Feb. 2018, pp. 210-211.

[2] N. Dikahaminjia, et al., “High-speed serial link challenges using multi-level signaling,” Proc. IEEE 24th Elect. Perform. Electron. Packag. Syst., San Jose, CA, 2015, pp. 57-60.

[3] P. Upadhyaya, et al., “A fully adaptive 19-to-56Gb/s PAM-4 wireline transceiver with a configurable ADC in 16nm FinFET,” IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, San Francisco, Feb. 2018, pp. 108-109.

[4] M. Bassi, F. Radice, M. Bruccoleri, S. Erba, and A. Mazzanti, “A 45Gb/s PAM-4 transmitter delivering 1.3Vppd output swing with 1V supply in 28nm CMOS FDSOI,” IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, San Francisco, Feb. 2016, pp. 66-67.

[5] A. Nazemi, et al., “A 36Gb/s PAM4 transmitter using an 8b 18GS/s DAC in 28nm CMOS,” IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, San Francisco, Feb. 2015, pp. 58-59.

[6] X. Zheng, et al., “A 4-40Gb/s PAM4 transmitter with output linearity optimization in 65nm CMOS,” Proc. IEEE Custom Integr. Circuits Conf., Austin, TX, Jul. 2017, pp. 1-4.

[7] J.-H. Chae, H. J. Ko, J. H. Park, and S. H. Kim, “A 12.8Gb/s quarter-rate transmitter using a 4:1 overlapped multiplexing driver combined with an adaptive clock phase aligner,” IEEE Trans. Circuits Syst. II, Exp. Briefs, in press.

[8] C. K. Lee, et al., “A 5Gb/s/pin LPDDR4X SDRAM with power-isolated LVSTL and split-die architecture with 2-die ZQ calibration scheme,” IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, San Francisco, Feb. 2017, pp. 390-391.

[9] A. Roshan-Zamir, O. Elhadidy, H. W. Yang, and S. Palermo, “A reconfigurable 16/32 Gb/s dual-mode NRZ/PAM4 SerDes in 65nm CMOS,” IEEE J. of Solid-State Circuits, vol. 52, no. 9, Sep. 2017, pp. 2430-2447.

[10] J. Kim, et al., “A 16-to-40Gb/s quarter-rate NRZ/PAM4 dual-mode transmitter in 14nm CMOS,” IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, San Francisco, Feb. 2015, pp. 60-61.

[11] H. Liu, L. Ding, J. Jin, and J. Zhou, “A reconfigurable 28/56 Gb/s PAM4/NRZ dual-mode SerDes with hardware-reuse,” in IEEE Int. Symposium on Circuits and Systems, Florence, Italy, May 2018, pp. 1-5.

Fig. 10. Power breakdown of the transmitter.

TABLE I. PERFORMANCE SUMMARY AND COMPARISON TABLE

| Reference | [9] | [10] | [11] | This work |
|---|---|---|---|---|
| Technology | 65nm | 14nm | 22nm | 65nm |
| Supply (V) | 1.2 | N/A | 1.2 | 1.0 |
| Modulation | PAM4 / NRZ | PAM4 / NRZ | PAM4 / NRZ | PAM4 / NRZ |
| Signaling | Differential | Differential | Differential | Single-ended |
| Driver topology | Voltage-mode SST | Voltage-mode SST | CML | Voltage-mode SST |
| Data rate (Gb/s), PAM4 / NRZ | 32 / 16 | 40 / 40 | 56 / 28 | 20 / 10 |
| Equalization (TX FFE), PAM4 / NRZ | 2-tap / 4-tap | No EQ / 4-tap | No EQ / 4-tap | 2-tap / 4-tap |
| Power (mW), PAM4 / NRZ | 176.3 / 173.7 | 167.5 / 518 | N/A / N/A | 61.5 / 72 |
| RLM | 0.967 | N/A | N/A | 0.98 |
| Area (mm²) | 0.06 | 0.0279 | N/A | 0.06 |