
Analysis and Avoidance of Cross-talk in On-chip Buses

Chunjie Duan, Ericsson Wireless Communications
Anup Tirumala, Jasmine Networks
Sunil P. Khatri, University of Colorado, Boulder

Outline

• Introduction
• Classification of Cross-talk types
• Eliminating 3C and 4C sequences
• Eliminating 4C sequences
• Experimental Results
• Conclusions


Introduction

• Verified cross-talk trends
  • Accurate 3-D capacitance extraction
  • Delay variation 2.47:1 (200 µm wires, 10X drivers, 0.1 µm technology)
• Deep sub-micron process

[Figure: wire geometry (labels s, t, w) and bus wire diagrams with wires labeled a and v and capacitances CI and CL]

Cross-talk vs Bus Data Pattern

• When the feature size is ~0.1 µm, r = CI/CL > 10 (metal 4)

• Effective total capacitance depends on bus data sequence:
  • Best case: 0 x CI x L
  • Worst case: 4 x CI x L

[Figure: two transition examples. Best case: 0·CI to each neighbor, Ctotal = 0·CI. Worst case: 2·CI to each neighbor, Ctotal = 4·CI]


Classification of Cross-talk

• 4·C sequence
• 3·C sequence
• 2·C sequence
• 1·C sequence
• 0·C sequence
• Forbidden patterns (“010” and “101”)
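The example transitions for each class did not survive on this slide, so the following is a minimal sketch of one common way to assign a class to a wire. It assumes the usual coupling model, which the slide itself does not spell out: each wire's transition is +1 (rising), -1 (falling) or 0 (quiet), and a switching wire sees |d_j - d_k| x CI toward each neighbor k. The function name `crosstalk_class` and the treatment of quiet wires and bus edges are illustrative assumptions.

```python
def crosstalk_class(old, new, j):
    """Coupling factor (0 to 4, in units of CI) seen by wire j for old -> new.

    Assumed model: a quiet wire is given class 0, and a missing neighbor
    at the edge of the bus contributes nothing.
    """
    # Per-wire transition: +1 rising, -1 falling, 0 quiet.
    d = [int(b) - int(a) for a, b in zip(old, new)]
    if d[j] == 0:
        return 0
    total = 0
    for k in (j - 1, j + 1):
        if 0 <= k < len(d):
            total += abs(d[j] - d[k])
    return total


# Middle wire of a 3-bit bus under different neighbor activity.
examples = {
    ("000", "111"): crosstalk_class("000", "111", 1),  # neighbors switch with it
    ("001", "111"): crosstalk_class("001", "111", 1),  # one neighbor quiet
    ("001", "011"): crosstalk_class("001", "011", 1),  # both neighbors quiet
    ("011", "101"): crosstalk_class("011", "101", 1),  # one opposite, one quiet
    ("010", "101"): crosstalk_class("010", "101", 1),  # both neighbors opposite
}
```

Under this model the 4·C case is exactly a transition between the two forbidden patterns, "010" and "101".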


Eliminating 3C, 4C Sequences

• Motivation
  • Maximum bus data rate depends on total capacitance seen by any bit
  • Removing 3C and 4C sequences will increase the maximum data rate
• Simple approach: shielding
  • g s g s g s g... (ground line between signals)
  • No 3C or 4C sequences possible
  • However, bus-width is doubled
• Coding gain: $G = \frac{(\text{throughput}/\text{area})_{\text{with coding}}}{(\text{throughput}/\text{area})_{\text{without coding}}} - 1$
• Coding gain = 0 for this approach


Eliminating 3C, 4C Sequences

• Theorem: If no forbidden patterns are allowed on the bus, $C_{total}^{max} = 2 \times C_I$
  • Proof: see paper
• Our approach:
  • Encode the data on the bus to get rid of the forbidden patterns
• Questions to be answered:
  • What is the number of redundancy bits (and the coding gain)?
  • How to practically implement such a CODEC?
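The theorem can be checked by brute force on a small bus. The sketch below is not the paper's proof. It assumes the common coupling model in which a switching wire sees |d_j - d_k| x CI toward each neighbor k, with d = +1, -1 or 0 for rising, falling or quiet. The names `worst_coupling` and `max_coupling` are hypothetical.

```python
from itertools import product

FORBIDDEN = ("010", "101")


def is_forbidden_free(vector):
    """True if the bit string contains neither forbidden pattern."""
    return not any(p in vector for p in FORBIDDEN)


def worst_coupling(old, new):
    """Largest coupling factor (in units of CI) on any switching wire."""
    d = [int(b) - int(a) for a, b in zip(old, new)]
    worst = 0
    for j, dj in enumerate(d):
        if dj == 0:
            continue  # quiet wires are not counted in this model
        c = sum(abs(dj - d[k]) for k in (j - 1, j + 1) if 0 <= k < len(d))
        worst = max(worst, c)
    return worst


def max_coupling(n, forbidden_free_only):
    """Worst case over all ordered pairs of n-bit vectors."""
    vectors = ["".join(bits) for bits in product("01", repeat=n)]
    if forbidden_free_only:
        vectors = [v for v in vectors if is_forbidden_free(v)]
    return max(worst_coupling(a, b) for a in vectors for b in vectors)


unrestricted = max_coupling(5, forbidden_free_only=False)
restricted = max_coupling(5, forbidden_free_only=True)
```

For a 5-bit bus the unrestricted worst case is 4 x CI, and restricting both the old and the new vector to forbidden-free words lowers it to 2 x CI.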

Number of Redundancy Bits

• Map the n bit bus to a k=n+r bit bus so that
  • the k bit data bus has no forbidden patterns
• Definitions:
  • T(n): number of distinct n-bit vectors
    • T(n) = 2^n
  • TB(n): number of n-bit vectors which contain a forbidden pattern
  • TG(n): number of n-bit vectors which do not contain forbidden patterns
  • Let the sets of vectors be V(n), VB(n), and VG(n) respectively
  • Let v(n), vB(n) and vG(n) respectively represent an element of these sets
  • TGG(n): number of n-bit vectors in VG(n) with last two bits ‘00’ or ‘11’
  • TGB(n): number of n-bit vectors in VG(n) with last two bits ‘01’ or ‘10’

• Goal: to find the smallest k such that $T_G(k) \ge T(n) = 2^n$


Counting Forbidden Vectors

• v(n) can be constructed by appending {0,1} to any v(n-1)
  • Two v(n) are constructed from any v(n-1)
  • Two vB(n) are constructed from any vB(n-1)
    • xxx010xx -> xxx010xx0, xxx010xx1
  • One vGG(n) and one vGB(n) are constructed from any vGG(n-1)
    • xxxxxx00 -> xxxxxx000, xxxxxx001
  • One vGG(n) and one vB(n) are constructed from any vGB(n-1)
    • xxxxxx01 -> xxxxxx010, xxxxxx011

Counting Forbidden Vectors

• Algorithm
  • Initial conditions (n=3)
    • T(3) = 8, TG(3) = 6, TB(3) = 2, TGG(3) = 4, TGB(3) = 2
  • Inductive step
    • T(n) = 2 x T(n-1)
    • TG(n) = 2 x TGG(n-1) + TGB(n-1)
    • TGG(n) = TGG(n-1) + TGB(n-1)
    • TGB(n) = TGG(n-1)
    • TB(n) = 2 x TB(n-1) + TGB(n-1)
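The counting argument can be run directly. This is a minimal sketch that follows the construction rules above (each vGG(n-1) yields one vGG(n) and one vGB(n), each vGB(n-1) yields one vGG(n)), cross-checks the result by enumeration, and searches for the smallest coded width k. The function names are illustrative and do not come from the paper.

```python
from itertools import product

FORBIDDEN = ("010", "101")


def good_counts(n):
    """Return (TGG(n), TGB(n)) for n >= 3 from the inductive step."""
    if n < 3:
        raise ValueError("the recurrence starts at n = 3")
    tgg, tgb = 4, 2  # initial conditions at n = 3
    for _ in range(n - 3):
        # A GG vector extends to one GG and one GB vector;
        # a GB vector extends to one GG vector (the other is forbidden).
        tgg, tgb = tgg + tgb, tgg
    return tgg, tgb


def tg(n):
    """TG(n): number of n-bit vectors with no forbidden pattern."""
    return sum(good_counts(n))


def brute_force_tg(n):
    """Count forbidden-free vectors by enumerating all 2^n bit strings."""
    return sum(
        not any(p in "".join(bits) for p in FORBIDDEN)
        for bits in product("01", repeat=n)
    )


def min_coded_width(n):
    """Smallest k with TG(k) >= 2^n."""
    k = max(n, 3)
    while tg(k) < 2 ** n:
        k += 1
    return k


k32 = min_coded_width(32)
overhead32 = (k32 - 32) / 32
```

For n = 4 the search returns k = 5, which matches a 4-bit to 5-bit group encoding. For n = 32 it returns k = 46, an overhead of 14/32, or about 44%.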


Eliminating 3C, 4C sequences

• 44% overhead when n > 30 bits
• Coding gain: $G = \frac{2}{1 + 0.44} - 1 = 39\%$
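The coding gain arithmetic can be reproduced in a few lines. The sketch assumes that the 2 in the numerator is the throughput ratio gained by capping the coupling at 2 x CI instead of 4 x CI, and that area scales with bus width, so the area ratio is 1 + overhead. The helper name `coding_gain` is hypothetical.

```python
def coding_gain(throughput_ratio, overhead):
    """Coding gain G = (throughput ratio) / (area ratio) - 1.

    Assumes area grows in proportion to the number of wires,
    so the area ratio is 1 + overhead.
    """
    return throughput_ratio / (1.0 + overhead) - 1.0


gain_coded = coding_gain(2.0, 0.44)     # minimum-overhead encoding
gain_shielded = coding_gain(2.0, 1.00)  # shielding doubles the bus width
```

With these inputs the encoded bus comes out at roughly 0.39 and the shielded bus at exactly 0, consistent with the figures quoted on the slides.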

[Figure: overhead percentage (vertical axis, 0 to 0.5) plotted over a horizontal axis running from 0 to 100]

3C, 4C CODEC Implementation

• Implements a one-to-one map from V(n) to VG(k)
  • Look-Up Table, straightforward, can achieve minimum overhead (44%), but not practical
• Our implementation
  • 62.5% overhead (higher than minimum)
  • Modular and straightforward
    • Break bus into 4-bit groups
    • Encode each group independently (4 bit -> 5 bit)
    • Additional logic to handle across-the-boundary forbidden patterns
    • Ripple effect (eliminated by pipelining)


3C, 4C CODEC Implementation

[Figure: CODEC block diagram; bus bits grouped as b0 b1 b2 b3, b4 b5 b6 b7, b8 b9 b10 b11, b12 b13 b14 b15]

Input   Output
0000    00000
0001    00001
0010    00110
0011    00011
0100    01100
0101    00111
0110    01110
0111    01111
1000    11111
1001    11110
1010    11001
1011    11100
1100    10011
1101    11000
1110    10001
1111    10000
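The group encoding can be sketched as a lookup table. The 16 rows are taken to cover inputs 0000 through 1111 in order, which is an assumption for the row that maps to 11100. The sketch only encodes each 4-bit group independently and reports where a forbidden pattern appears across a group boundary. It does not implement the additional boundary logic, which is described in the paper rather than on the slides. All function names are illustrative.

```python
FORBIDDEN = ("010", "101")

# 4-bit input group -> 5-bit forbidden-free codeword, from the table.
ENCODE_4_TO_5 = {
    "0000": "00000", "0001": "00001", "0010": "00110", "0011": "00011",
    "0100": "01100", "0101": "00111", "0110": "01110", "0111": "01111",
    "1000": "11111", "1001": "11110", "1010": "11001", "1011": "11100",
    "1100": "10011", "1101": "11000", "1110": "10001", "1111": "10000",
}
DECODE_5_TO_4 = {code: data for data, code in ENCODE_4_TO_5.items()}


def has_forbidden(bits):
    """True if the bit string contains 010 or 101."""
    return any(p in bits for p in FORBIDDEN)


def encode_groups(data):
    """Encode a bit string whose length is a multiple of 4, group by group."""
    if len(data) % 4:
        raise ValueError("data length must be a multiple of 4")
    return [ENCODE_4_TO_5[data[i:i + 4]] for i in range(0, len(data), 4)]


def decode_groups(groups):
    """Invert encode_groups."""
    return "".join(DECODE_5_TO_4[g] for g in groups)


def boundary_violations(groups):
    """Indices i where groups[i] followed by groups[i + 1] forms 010 or 101.

    Only the last two bits of one group and the first two of the next
    can form a pattern that spans the boundary.
    """
    return [
        i for i in range(len(groups) - 1)
        if has_forbidden(groups[i][-2:] + groups[i + 1][:2])
    ]


coded = encode_groups("00010000")       # ['00001', '00000']
violations = boundary_violations(coded)  # the join ...01|00... contains 010
```

Every codeword is forbidden-free on its own, yet two adjacent codewords can still form a forbidden pattern at the join. That is why the modular scheme needs extra logic at group boundaries.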

Eliminating 4C sequences

• Less aggressive: eliminating 4C sequences only
  • Less overhead (33%): simpler implementation
• Simpler algorithm
  • Divide the bus into 3 bit groups
  • When 4C sequence occurs, complement group data
  • Insert group complement indicator
  • Special handling for across-the-boundary forbidden sequences (see paper for details)
• Examples:
  • 101 001 -> 010 010
  • 1010 0010 -> 1011 0100


Experimental Results

• Bus simulations
  • CODEC was not modeled
  • Spice, 0.1 µm model
  • Transmission line with inter-wire coupling
  • Quantify delay dependency on bus vector sequences
• CODEC implementation
  • Currently implemented 3C & 4C CODEC
  • Matching delay on CODEC outputs
  • 4C CODEC implementation planned in future

Bus Simulation Results

• Bus length 5mm, 10mm or 20mm
• Driver strength 30X, 60X and 120X of minimum

[Figure: DELAY comparison (1mm trace); curves for 0C, 1C, 2C, 3C and 4C sequences]

[Figure: DELAY comparison (2mm trace); curves for 0C, 1C, 2C, 3C and 4C sequences]

Trc_len  Buf_size  0C    1C   2C   3C    4C
10mm     30x       <100  200  350  550   750
10mm     60x       <100  100  250  400   500
10mm     120x      <100  120  170  300   350
20mm     30x       100   300  600  1000  1600
20mm     60x       100   250  400  600   900
20mm     120x      <100  150  300  550   750


CODEC Results

• Compare waveform with coding and w/o coding
• Random input sequence

[Figure: test setup; random sequence -> encoder -> driver -> receiver -> decoder -> recovered sequence]

• Encoder/decoder delay ~250ps
• Max data rate more than 2X compared to scheme with no encoding

CODEC Results

• Random sequence directly into bus buffer
  • 20mm trace
  • 45x buffer
  • >1ns delay variation
• Random sequence into 3C & 4C encoder
  • 20mm trace
  • 45x buffer
  • <500ps delay variation

[Figure: received data pattern w coding; traces Vin1 and Vseg1 to Vseg5]

[Figure: waveform w/o encoder; traces Vtx1 to Vtx5]


Experimental Results

Reshaped data after receivers

• w/o coding
  • edge jitter ~1000ps
• w coding
  • edge jitter <500ps

[Figure: delay variation w/o coding; traces Voo1 to Voo5]

[Figure: received data w coding; traces rcv1 to rcv5]

Conclusions

• Inter-wire capacitance increasingly significant in DSM VLSI interconnect
• Total capacitance is heavily dependent on bus data sequence
• With 44% overhead, we can eliminate 3C & 4C cross-talk
  • Compared to shielding, which has 100% overhead
• Implemented CODEC to eliminate 3C and 4C cross-talk sequences
• Proposed CODEC to eliminate 4C cross-talk sequences with 33% overhead
• Simulation results match our analysis