Avshalom Elyada, Ran Ginosar Pipeline Synchronization 1
Pipeline Synchronization, Continued
This second part is based on the recent article
Bridging Clock Domains by Synchronizing the Mice in the Mousetrap (PATMOS, Sep. 2003)
by
Joep Kessels and Ad Peeters, Philips Research Laboratories, The Netherlands
together with
Suk-Jin Kim at KJIST, South Korea

Recall Seizovic’s Synchronization Pipeline
Seizovic, “Pipeline Synchronization,” Async 1994
Kessels, Peeters, Kim, “Bridging Clock Domains by Synchronizing the Mice in the Mousetrap,” PATMOS, 2003
• Ripple Buffer between two clock domains
– High throughput
– Embedded synchronization
– Spanning a long distance
– 2-phase
[Figure: chain of ME elements between the A clk and B clk domains, with REQ and ACK handshake signals; labels: half cycle, distance]

Which buffer to use?
• Ripple Buffer
– Stream data (isochronous)
  • Throughput important, latency not
  • Steady rate maintained on both sides
– Short distance (2-3 stages)
  • Pipe to improve throughput
– or Long distance (many stages)
  • Improve throughput and bridge distance

Which buffer to use?
• Pointer Buffer
– Block data
  • Chunk available at once
  • Rate not important
  • No sense to ripple every word in all pipe stages
  • Write few long bursts to SRAM and read on other side, with pointers
– But if long distance, need Ripple

An ME as a Synchronizer
• Outputs mutually exclusive
• Connect ~clk and signal ‘R’ to inputs
• ‘A’ synced output, other output unused
• Today we refer to ME with ~clk as WAIT4 component
[Figure: ME element with inputs R0, R1 and outputs A0, A1; WAIT4 symbol with clk, R, A; event graph over R, A+, R-, A-, Clk+, Clk-, with states Clk=1 and Clk=0]

WAIT4
• A is synced to clk
• Used in 4-phase, doesn’t sync A
• Used as building block for 2-phase sync

One Stage

“Mousetrap Cell” as FIFO Element
• 2-phase single-rail
• Any hi/lo signal toggle indicates change
• req≠ack, sender cell is full
• req=ack, data accepted by rcver, snder empty
• “Equal” gate implements “empty” when req=ack
• Cell empty: all 4 ctrl signals equal
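The full/empty rule above can be sketched in a few lines of Python. This is a behavioral illustration only, not the circuit from the article; the names `is_full`, `is_empty` and `toggle` are hypothetical and chosen here for readability.

```python
def is_full(req: bool, ack: bool) -> bool:
    """2-phase rule: the sender cell holds data while req differs from ack."""
    return req != ack


def is_empty(req: bool, ack: bool) -> bool:
    """Models the "Equal" gate: 'empty' is asserted when req equals ack."""
    return req == ack


def toggle(wire: bool) -> bool:
    """A 2-phase event is any toggle of the wire, in either direction."""
    return not wire


# One transfer: both wires start equal (empty), the sender toggles req,
# then the receiver toggles ack to accept the data.
req, ack = False, False
trace = [is_full(req, ack)]
req = toggle(req)              # sender offers data: req != ack
trace.append(is_full(req, ack))
ack = toggle(ack)              # receiver accepts: req == ack again
trace.append(is_full(req, ack))
print(trace)
```

The trace goes empty, full, empty, and the wires end at a different level than they started, which is the point of 2-phase signalling: the level carries no meaning, only the toggle does.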

MT Behavior
• Ignoring the ‘empty’ signal, MT is similar to a Muller Pipeline:
([Rreq=Rack * Wreq≠Rreq]; Rreq := Wreq)*
(rcving cell empty)*(sending cell full); capture data, send
([Wreq≠Rack * Wreq≠Rreq]; Rreq := Wreq)*
Wreq≠Rreq merely prevents idle operations
([Wreq≠Rack]; Rreq := Wreq)*
[Figure: C-element (c) with signals Wreq, Wack, Rreq, Rack]
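The step from the first guard to the last can be checked by brute force over the three binary signals. The sketch below is an illustration written for this purpose (the function names are hypothetical); it treats each guard as a Boolean predicate and compares them on all eight states.

```python
from itertools import product


def mt_guard(wreq: bool, rreq: bool, rack: bool) -> bool:
    """Mousetrap form: receiving cell empty AND sending cell full."""
    return (rreq == rack) and (wreq != rreq)


def muller_guard_no_idle(wreq: bool, rreq: bool, rack: bool) -> bool:
    """Muller form with the extra term that skips idle operations."""
    return (wreq != rack) and (wreq != rreq)


def muller_guard(wreq: bool, rreq: bool, rack: bool) -> bool:
    """Plain Muller form: fire whenever Wreq differs from Rack."""
    return wreq != rack


states = list(product([False, True], repeat=3))

# The first two guards agree on every state.
guards_equal = all(mt_guard(*s) == muller_guard_no_idle(*s) for s in states)

# Where the plain Muller guard fires but the MT guard does not,
# Wreq already equals Rreq, so "Rreq := Wreq" changes nothing.
idle_only = all(
    wreq == rreq
    for (wreq, rreq, rack) in states
    if muller_guard(wreq, rreq, rack) and not mt_guard(wreq, rreq, rack)
)
print(guards_equal, idle_only)
```

Both checks hold, which matches the remark that the Wreq≠Rreq term only removes idle assignments.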

Mousetrap vs. Muller
• Muller
– Need to match delay of req to comb. logic
– For 2-phase, need special Capture-Pass Latch
– When full, every other cell contains data
• Mousetrap
– ‘empty’: no need for CP Latch
– ‘empty’ does automatic delay-matching
– When full, all cells contain data
– No async elements (good for business)
[Figure: Muller pipeline (C-elements, latches, delay ‘del’ matched to comb. logic) next to Mousetrap pipeline (latches); signals req, ack]

• Rcver Ack to Snder does NOT indicate latch locked
• Latch locked T(EQ+HoldLatch) after Ack
• Timing constraint to ensure data not overrun
1) Snder full
2) Rcver Ack back & Rreq forward
3) Rcver stores data
4) Rcver gets Rack from outside
5) Rcver empties
[Figure: sender and receiver cells with delay labels EQ, EQ+HoldLatch, Latch]

Delay Asymmetries
• Delay of full/empty token
– Full: T(Latch), Empty: T(Latch+EQ)
– Phase-shift in handshake signals
– FIFO at full speed is less than ½ full
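The “less than ½ full” remark can be illustrated numerically. The sketch below rests on an assumption not spelled out on the slide: the usual steady-state argument for self-timed pipelines, in which a full token advances one stage per T(Latch), an empty token moves back one stage per T(Latch+EQ), and at peak throughput the fraction of full cells is the forward delay divided by the sum of both. The delay values are arbitrary, and `full_fraction` is a hypothetical name.

```python
def full_fraction(t_latch: float, t_eq: float) -> float:
    """Fraction of cells holding data at peak throughput (assumed model)."""
    t_full = t_latch              # Full:  T(Latch)
    t_empty = t_latch + t_eq      # Empty: T(Latch+EQ)
    return t_full / (t_full + t_empty)


# Arbitrary illustrative delays, in the same (unspecified) time unit.
frac = full_fraction(t_latch=1.0, t_eq=0.5)
print(frac)
```

Under this model the fraction is T(Latch) / (2·T(Latch) + T(EQ)), which stays below one half for any positive EQ delay and approaches one half only as T(EQ) goes to zero.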

Delay Asymmetries II
• Different inputs of a cell have different delay-to-out
– Connect slow EQ input to Ack to help timing, or
– …to Req to improve performance

Delay Asymmetries III
• Signals’ rising/falling edges have different transition delays
– Req precedes empty, empty precedes Req
– To avoid malfunction, ctrl-latch always slower than data-latch

UE4
• Parallel composition of two WAIT4 -> Up-Edge 4-phase detector
• Inv delay ensures 2nd WAIT4 closed before 1st opened
• Use a FF here instead?
– Doesn’t filter out the metastability

UE2
• Detect up & down edges for 2-phase
• Build an Edge 2-phase detector UE2
– ‘d’ifferent, ‘e’mpty
– ‘U’ even though it is up-and-down
– Note resemblance to MT ctrl logic

Pipeline Interfaces
• FIFO indicates ready:
– To receive new Wdat: Wrdy
– To send new valid Rdat: Rrdy
• Environment enables:
– Send of new valid Wdat: Wenb
– Receive of new Rdat: Renb
• Data transfer if both rdy and enb
– Transfer item every clock
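The transfer rule for both interfaces reduces to one condition per clock edge. The following sketch is a behavioral illustration with made-up signal traces; `transfers` and `count_transfers` are hypothetical helper names, and the same function applies to the write side (Wrdy, Wenb) and the read side (Rrdy, Renb).

```python
from typing import Sequence


def transfers(rdy: bool, enb: bool) -> bool:
    """A data item moves on a clock edge only if both rdy and enb are high."""
    return rdy and enb


def count_transfers(rdy: Sequence[bool], enb: Sequence[bool]) -> int:
    """Count clock edges on which a transfer takes place."""
    return sum(1 for r, e in zip(rdy, enb) if transfers(r, e))


# Made-up traces, one entry per Wclk edge.
wrdy = [True, True, False, True]
wenb = [True, False, True, True]
n_written = count_transfers(wrdy, wenb)

# With both signals held high, one item is transferred on every clock.
n_streaming = count_transfers([True] * 4, [True] * 4)
print(n_written, n_streaming)
```

In the made-up trace only the first and last edges see both signals high, so two items are written; with rdy and enb held high the count equals the number of clock edges.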

Read-Interface
• Renb enables Rclk at FF
– Z empty, Rrdy low, handshake signals equal
– Z becomes full, Rrdy hi, handshakes differ
– Upon next Rclk*Renb, FF makes handshakes equal again
Following Rclk*Renb, Z passes new Rdat
After T(Latch+EQ), X empties into Y
Handshaking continues … at next Rclk, state repeats itself

Write-Interface
• Wenb enables Wclk at data+ctrl FF
– ‘A’ full, handshake signals differ
– ‘A’ empty, Wack toggles
– Upon next Wclk*Wenb, ‘A’ receives new Wdat
1) C filled from B, ack from C waits at UE2 for Wclk
2) After Wclk, B gets ack, ‘A’ filled from outside
3) Handshaking continues … at next Wclk, state repeats itself

Integrated Synchronizing Circuit in MT Write Cell

Summary
• Pipeline Synchronization
– High throughput, embedded sync, long interconnect, 2-phase
• The Mousetrap Cell
• Synchronization components
– WAIT4, UE4, UE2
• Buffer Interfaces
– Write and Read sections
• MT with integrated sync circuit