Upload
mahesh
View
234
Download
0
Embed Size (px)
Citation preview
8/9/2019 11 Clock Skew
1/35
Clock Skew 1
Clock Skew
8/9/2019 11 Clock Skew
2/35
Clock Skew 2
Clock Skew
Probably the largest source of design
failures in FPGA.
8/9/2019 11 Clock Skew
3/35
Clock Skew 3
Definitions
Sequentially-Adjacent Register Pair
Two registers with only combinational logic or
interconnect between them
Clock Skew
Given two sequentially-adjacent registers, Ri and Rj,and an equipotential clock distribution network, the
clock skew between these two registers is defined as
Tskew-i,j = Tci - Tcj
where Tci and Tcj are the clock delays from the clock
source to the registers Ri and Rj, respectively.
8/9/2019 11 Clock Skew
4/35Clock Skew 4
Implications of Definitions
Sequentially-Adjacent Register Pair
Skew is only meaningful between adjacent pairs of
registers, not between any pair of registers in a clockdomain.
- Clock skew is only important between the pairs (FF1, FF2) and (FF2, FF3).
- The skew between the pair (FF1, FF3) is meaningless.
- Calculations done quickly on sets is often unrealistic.
8/9/2019 11 Clock Skew
5/35Clock Skew 5
Implications of Definitions
Definition of skew: from the clock source
A circuit node that is common in time for both paths is
the origin of skew calculations.
- A is the common point in time for this circuit.- B and C are not common. Why? B1 and B2 have different transitors.
The loading on B and C are different. B and C can have different line
lengths. Parasitic capacitances can be different. Etc. Lines on a
schematic are not ideal wires.- B1 and B2 will not track 100%. This is shown later for aging.
8/9/2019 11 Clock Skew
6/35Clock Skew 6
A First Skew Calculation
Two cases: 1) A B is fast, A C slow
2) A B is slow, A C fast
Just show the case 1 calculation (hold time)
Skew = tplh B1min - tplh B2max
Is this correct?
Is this realistic?
8/9/2019 11 Clock Skew
7/35Clock Skew 7
A First Skew Calculation
Skew = tplh B1min - tplh B2max
Is this correct?
No, it is wrong. It assumes that the signal A
arrives at the inputs to B1 and B2 at the sametime. In an FPGA in particular, they will not,
making this calculation too optimistic.
8/9/2019 11 Clock Skew
8/35
Clock Skew 8
A First Skew Calculation
Skew = tplh B1min - tplh B2max
Is this realistic?
It is wrong if you use extreme value analysis without thinking, assuming that
this circuit is on one chip and would make thes calculation too conservative.
For example, temperature and voltage wont have drastic differences on the
same chip at the same time. Life and radiation effects can not be assumed to
track. Note that calculations must be done over all corners.
8/9/2019 11 Clock Skew
9/35
Clock Skew 9
A Zeroth Skew Calculation
For this simpler case, we have to calculate the time from A B (min) and from
A C (max) to compute the worst-case skew.
The input to B1, A, is common to both, so that is our common timing point.
The paths from the output of B1 to FF1:CLK and to FF2:CLK are different.
But tplh (B1) is in common. Legally, that can be factored out, eliminating the
unrealistic conservatism of having a transistor be both good and bad at the
same time. Practically, however, some tools lump the gate and routing delaystogether, so the calculation is needlessly conservative.
8/9/2019 11 Clock Skew
10/35
Clock Skew 10
Tools: Your Friend?
Some tools try and calculate skews for you. However, you must
understand the model that it is using and apply it wisely.
8/9/2019 11 Clock Skew
11/35
Clock Skew 11
Effects of Skew
Assume: Sequentially-Adjacent, Rising Edge-Triggered Flip-Flops
Routing delay bundled with gate delay
Case 1: Setup time
1 Clock Period
Ideal CLK
8/9/2019 11 Clock Skew
12/35
Clock Skew 12
Effects of Skew
Assume: Sequentially-Adjacent, Rising Edge-Triggered Flip-Flops
Case 1: setup time
With an ideal, skew-free clock, we have:
Add delays: FF1:CLKQ(max)
G1tP(max)
FF2:tSU
Margin = tperiod - total
8/9/2019 11 Clock Skew
13/35
Clock Skew 13
Effects of Skew
Add Worst-Case Skew for tSU
1 Clock Period
Ideal CLK
FF1:CLK
Available Time
Decreased by tSKEW
FF2:CLK
8/9/2019 11 Clock Skew
14/35
Clock Skew 14
Effects of Skew
Assume: Sequentially-Adjacent, Rising Edge-Triggered Flip-Flops
Routing delay bundled with gate delay
Case 2: Hold time
Ideal FF1:CLK
D
Ideal FF2:CLK
E
tH met
8/9/2019 11 Clock Skew
15/35
Clock Skew 15
Effects of Skew
Early FF1:CLK
D
Late FF2:CLK
E
Note: used min, best case for prop delays and max, worst-case for
clock path to FF2.
8/9/2019 11 Clock Skew
16/35
Clock Skew 16
Clock Skew, Our Friend
Data
Clock
Attempting to force the downstream flip-flops to be clocked
before the upstream flip-flops can change their data. This
improves hold time at the cost of setup time.
8/9/2019 11 Clock Skew
17/35
Clock Skew 17
Clock Skew, Our Friend
FF1: Earliest Data ChangeFF2: Latest Sample
CLKB
The use of opposite edge clocking makes this circuit skew-
tolerant, at the cost of setup time, typically not an issue for shift
registers. Additionally, the clock can be run at 1/2 speed, saving
power.
8/9/2019 11 Clock Skew
18/35
Clock Skew 18
Clock Skew, Our Friend
FF1: Earliest Data ChangeFF2: Latest Sample
PH1
PH2
The use of two-phase clocking makes this circuit skew-tolerant,at the cost of setup time.
8/9/2019 11 Clock Skew
19/35
Clock Skew 19
Clock Skew, Our Friend
L1: Earliest Data ChangeL2: Latest Sample
PH1
The use of two-phase, non-overlapping clocking makes thislatch-based circuit skew-tolerant.
PH2
8/9/2019 11 Clock Skew
20/35
Clock Skew 20
Clock Skew, Our Friend
For pipelined designs with unequal
combinational delays between latches or
registers, clock skew can provide a better fit.
Data
Clock
8/9/2019 11 Clock Skew
21/35
Clock Skew 21
Design Impacts of Clock Skew
Calculate skews carefully
setup time
hold time
Use low-skew clock resources when
practical Understand skew-tolerant circuit when it is
not practical to use low-skew clock
resources
Some ASICs require a minimum of 1 gate
between sequentially adjacent registers
8/9/2019 11 Clock Skew
22/35
Clock Skew 22
Clock Skew
D Q D Q D Q D Q
Normal Routing Resource
Shift register is given as an example. Also seen
in counters and other logic structures.
8/9/2019 11 Clock Skew
23/35
Clock Skew 23
Clock Skew
D Q D Q D Q D Q
D Q D Q D Q D Q
Clock trees are made to increase fanout.
Not placing buffers and flip-flops on the same row Can increase skew problem.
8/9/2019 11 Clock Skew
24/35
Clock Skew 24
Clock Skew - Timing Model
D Q D Q
TCQ TROUTE
TSKEW
TH
FF1 FF2
Hold time at FF2 is the concern.
Worst-case
Low VIH FF1 Hi VIH FF2
Fast TCQ, TRoute High T
SKEW TCQ + TROUTE + TH > TSKEW
8/9/2019 11 Clock Skew
25/35
Clock Skew 25
14
13
12
11
10
9
8
7
6
5
4
23 24 25 26 27 28 29 30 31 32 33 34 35
Local Clock: Physical Realization
The net CMDREG/CLK3 driven at location XY = (27, 5) uses an LVT.LVT data: column = 30, Y-span = (14, 6).
Net data: fanout = 13, Y-spread of inputs = (13, 5).
LVT
Note: Antifuse located
at each junction.
6
7
D
R
4 5
9
8/9/2019 11 Clock Skew
26/35
8/9/2019 11 Clock Skew
27/35
Clock Skew 27
Clock Skew - Timing Analysis
Most static timing analyzers give bounded numbers for min, max.
Just setting MAX or MIN does not account for variations as a resultof fabrication differences, anti-fuse resistance, changes as a result of
aging, etc, and will be too liberal.
A full MIN/MAX analysis is too conservative since elements near eachother on the same die cannot vary that widely. I.e., one part cant be at
4.5VDC, the other at 5.5VDC.
For each environmental condition, it is fair to hold temperature, voltage,fixed.
MIN/MAX will still be a bit conservative, since will range over all
manufacturing conditions, not limited to variation within a single die.
8/9/2019 11 Clock Skew
28/35
Clock Skew 28
Antifuse Resistance VariationONO Antifuse Resistance Distribution
Programming Current = 5mA[from Antifuse FPGAs, J. Greene, et. al.]
Resistance
300 400 500 600 700 800 900
%D
istribution
0
10
20
30
40
50
60
P D l D lt Lif
8/9/2019 11 Clock Skew
29/35
Clock Skew 29
Prop Delay Delta vs. Life
Change in delay time from a 1000 gate programmablegate after 1000 hours dynamic burn-in at 125 C and 5.75V.
P D l D lt Lif
8/9/2019 11 Clock Skew
30/35
Clock Skew 30
Prop Delay Delta vs. LifeRH1280 Change in Propagation Delay
After 1000 Hour Life TestTested at 4.5 Volts, 125C
Change in Propagation Delay (ns)-10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10
NumberofSa
mples
0
5
10
15
NOTES:
1. All delay values rounded to thenearest integer number of ns.
2. The average propagation delaywas 106.5ns before the life test.
P D l D lt Lif
8/9/2019 11 Clock Skew
31/35
Clock Skew 31
Prop Delay Delta vs. Life
RH1280 Change in Propagation DelayAfter 1000 Hour Life Test
Tested at 4.5 volts, 125C
Change in Propagation Delay (percent)-8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8
NumberofS
amples
0
5
10
15
20
25
Note: Over a long path, 16 modules + I/O, TP exceeding 100 ns.
Cl k Sk F VHDL
8/9/2019 11 Clock Skew
32/35
Clock Skew 32
Clock Skew - From VHDL
Coding ExampleLibrary IEEE;
Use IEEE.Std_Logic_1164.All;
Entity Skew Is
Port ( Clk : In Std_Logic;
D : In Std_Logic;
Q : Out Std_Logic );
End Skew;
Library IEEE;Use IEEE.Std_Logic_1164.All;
Architecture Skew of Skew Is
Signal ShiftReg : Std_Logic_Vector (31 DownTo 0);
Begin
P: Process ( Clk )
Begin
If Rising_Edge (Clk)
Then Q
8/9/2019 11 Clock Skew
33/35
Clock Skew 33
Clock Skew - From VHDL
Synthesized Results
Results will depend on coding, directives and attributes, synthesizer, andsynthesizer revision.
Here we see that the logic synthesizer generated a poor circuit.
8/9/2019 11 Clock Skew
34/35
Clock Skew 34
Clock Skew Correction
No PRESERVE
High-skew Clock
8/9/2019 11 Clock Skew
35/35
Clock Skew 35
Clock Skew - Chip-to-Chip
D Q
FF1
D Q
FF2
Analysis may show problems. Some architectures are
designed with 0 ns tH
; others incorporate delay elements
(configurable) on the data inputs to ensure reliable clocking.