Pipeline Synchronization, Continued

Avshalom Elyada, Ran Ginosar

This second part is based on the recent article "Bridging Clock Domains by Synchronizing the Mice in the Mousetrap" (PATMOS, Sep. 2003) by Joep Kessels and Ad Peeters, Philips Research Laboratories, The Netherlands, together with Suk-Jin Kim at KJIST, South Korea.



Page 2

Recall Seizovic’s Synchronization Pipeline

• Ripple buffer between two clock domains
  – High throughput
  – Embedded synchronization
  – Spanning a long distance
  – 2-phase

[Figure: chain of ME stages between the A clk and B clk domains, with REQ and ACK handshake signals; labelled "half cycle distance".]

Sources: Seizovic, “Pipeline Synchronization,” Async 1994. Kessels, Peeters, Kim, "Bridging Clock Domains by Synchronizing the Mice in the Mousetrap", PATMOS, 2003.

Page 3

Which buffer to use?

• Ripple Buffer
  – Stream data (isochronous)
    • Throughput important, latency not
    • Steady rate maintained on both sides
  – Short distance (2-3 stages)
    • Pipe to improve throughput
  – or long distance (many stages)
    • Improve throughput and bridge distance

Page 4

Which buffer to use?

• Pointer Buffer
  – Block data
    • Chunk available at once
    • Rate not important
    • No sense in rippling every word through all pipe stages
    • Write a few long bursts to SRAM and read on the other side, with pointers
  – But if long distance, need Ripple

Page 5

An ME as a Synchronizer

• Outputs are mutually exclusive
• Connect ~clk and signal ‘R’ to the inputs
• ‘A’ is the synced output; the other output is unused
• Today we refer to an ME with ~clk as the WAIT4 component

[Figure: ME element with inputs R0, R1 and outputs A0, A1; WAIT4 symbol with clk, R and A; state graph with labels R, A+, R-, A-, Clk+, Clk-, Clk=1, Clk=0.]

Page 6

WAIT4

• A is synced to clk
• Used in 4-phase, doesn’t sync A
• Used as building block for 2-phase sync
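The ME-with-~clk arrangement described for WAIT4 can be sketched behaviourally. This is only an illustrative model under stated assumptions: metastability and arbitration delay are ignored, a simultaneous request is resolved in favour of ~clk, and the class name `Wait4` and its `step` method are hypothetical names, not taken from the slides.

```python
class Wait4:
    """Behavioural sketch of an ME element whose two requests are ~clk and R.

    Assumptions (not from the slides): no metastability, no gate delay,
    and ~clk wins when both requests arrive in the same step.
    """

    def __init__(self):
        self.owner = None  # None, "nclk" or "R": which request holds the ME

    def step(self, clk, r):
        # Release the ME when the current owner withdraws its request.
        if self.owner == "R" and not r:
            self.owner = None
        elif self.owner == "nclk" and clk:
            self.owner = None
        # Grant the free ME to a pending request.
        if self.owner is None:
            if not clk:
                self.owner = "nclk"
            elif r:
                self.owner = "R"
        # Output A is the grant given to R; the ~clk grant is unused.
        return self.owner == "R"


w = Wait4()
inputs = [(0, 0), (0, 1), (1, 1), (0, 1), (0, 0), (0, 1)]  # (clk, R) per step
trace = [w.step(clk, r) for clk, r in inputs]
```

In this model A can only rise while clk is high, and once granted it stays high until R is withdrawn, even if clk falls in the meantime.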

Page 7

One Stage

Page 8

“Mousetrap Cell” as FIFO Element

• 2-phase single-rail
• Any hi/lo signal toggle indicates change
• req≠ack: sender cell is full
• req=ack: data accepted by receiver, sender empty
• “Equal” gate implements “empty” when req=ack
• Cell empty: all 4 ctrl signals equal
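The full/empty rule above can be illustrated with a few lines of Python. This is a minimal sketch of the 2-phase convention only, not of the circuit; the function names are hypothetical.

```python
def cell_full(req, ack):
    """Sender cell holds unacknowledged data when req differs from ack."""
    return req != ack


def cell_empty(req, ack):
    """What the "Equal" gate computes: empty when req equals ack."""
    return req == ack


def send(req):
    """2-phase signalling: any toggle of req announces a new item."""
    return not req


def acknowledge(req):
    """The receiver accepts the item by making ack equal to req."""
    return req


req, ack = False, False
history = [cell_empty(req, ack)]      # starts empty
req = send(req)
history.append(cell_full(req, ack))   # full after the rising toggle
ack = acknowledge(req)
history.append(cell_empty(req, ack))  # empty again
req = send(req)
history.append(cell_full(req, ack))   # full after the falling toggle
```

Both the rising and the falling toggle of req carry an item, which is what distinguishes 2-phase from 4-phase signalling.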

Page 9

MT Behavior

• Ignoring the ‘empty’ signal, MT is similar to a Muller pipeline:

  ([Rreq=Rack * Wreq≠Rreq]; Rreq := Wreq)*
      guard: (receiving cell empty)*(sending cell full); action: capture data, send

  ([Wreq≠Rack * Wreq≠Rreq]; Rreq := Wreq)*
      equivalent guard for binary signals; the term Wreq≠Rreq merely prevents idle operations

  ([Wreq≠Rack]; Rreq := Wreq)*

[Figure: element labelled c with signals Wreq, Wack, Rreq, Rack.]
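The step from one guard to the next can be checked exhaustively, since each handshake signal is binary. The sketch below uses hypothetical function names and simply enumerates all eight signal combinations.

```python
from itertools import product


def guard_full(wreq, rreq, rack):
    # Rreq=Rack * Wreq≠Rreq
    return rreq == rack and wreq != rreq


def guard_rewritten(wreq, rreq, rack):
    # Wreq≠Rack * Wreq≠Rreq
    return wreq != rack and wreq != rreq


def guard_reduced(wreq, rreq, rack):
    # Wreq≠Rack
    return wreq != rack


states = list(product([False, True], repeat=3))  # (wreq, rreq, rack)

# The first two guards agree in every state.
same_guard = all(guard_full(*s) == guard_rewritten(*s) for s in states)

# Where only the reduced guard fires, Wreq already equals Rreq,
# so the assignment Rreq := Wreq changes nothing (an idle operation).
only_idle_extra = all(
    s[0] == s[1]
    for s in states
    if guard_reduced(*s) and not guard_full(*s)
)
```

Both flags evaluate to true, which matches the remark that the dropped term merely prevents idle operations.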

Page 10

Mousetrap vs. Muller

• Muller
  – Need to match delay of req to comb. logic
  – For 2-phase, need special Capture-Pass (CP) latch
  – When full, every other cell contains data
• Mousetrap
  – ‘empty’: no need for CP latch
  – ‘empty’ does automatic delay-matching
  – When full, all cells contain data
  – No async elements (good for business)

[Figure: Muller pipeline (elements labelled c, latches, comb. logic, matched delay "del") and Mousetrap pipeline (latches), each with req and ack signals.]

Page 11

• Receiver's Ack to the sender does NOT indicate that the latch is locked
• Latch is locked T(EQ+HoldLatch) after Ack
• Timing constraint to ensure data is not overrun

Sequence shown in the figure:
1) Sender full
2) Receiver sends Ack back and Rreq forward
3) Receiver stores data
4) Receiver gets Rack from outside
5) Receiver empties

Delay labels in the figure: Latch, EQ+HoldLatch, EQ.
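The overrun constraint can be written as a simple margin check. This is a sketch under assumptions: T(EQ+HoldLatch) is treated as the sum of two delays, the sender's reaction time is a separate hypothetical parameter, and all numbers are made up for illustration (integer picoseconds).

```python
def overrun_margin(t_eq, t_hold_latch, t_new_data_after_ack):
    """Time between the receiver latch locking and new data arriving.

    The latch locks T(EQ+HoldLatch) after Ack, so new data from the sender
    must arrive later than that. A positive result means the constraint holds.
    All three parameters are hypothetical delays measured from the Ack.
    """
    t_latch_locked = t_eq + t_hold_latch
    return t_new_data_after_ack - t_latch_locked


safe = overrun_margin(t_eq=120, t_hold_latch=80, t_new_data_after_ack=260)
unsafe = overrun_margin(t_eq=120, t_hold_latch=80, t_new_data_after_ack=150)
```

A negative margin, as in the second call, would mean the sender can overwrite data that the receiver has not yet locked.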

Page 12

Delay Asymmetries

• Delay of full/empty token
  – Full: T(Latch), Empty: T(Latch+EQ)
  – Phase-shift in handshake signals
  – FIFO at full speed is less than ½ full
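One way to see why the FIFO runs less than half full is the usual steady-state pipeline model, which is an assumption here and not stated on the slide: with k items in N cells, throughput is k/(N·T_full) when limited by data and (N−k)/(N·T_empty) when limited by empty cells. The two meet at k/N = T_full/(T_full+T_empty). The delay values below are made up.

```python
from fractions import Fraction


def occupancy_at_full_speed(t_full, t_empty):
    """Fraction of cells holding data at peak throughput.

    Assumes the steady-state model above; t_full and t_empty are the
    per-stage delays of a full token and an empty token.
    """
    return Fraction(t_full, t_full + t_empty)


t_latch, t_eq = 100, 60               # hypothetical delays, same time unit
t_full = t_latch                      # Full: T(Latch)
t_empty = t_latch + t_eq              # Empty: T(Latch+EQ)
occupancy = occupancy_at_full_speed(t_full, t_empty)
```

Because T(Latch+EQ) exceeds T(Latch), the result is below one half for any positive EQ delay.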

Page 13

Delay Asymmetries II

• Different inputs of a cell have different delay-to-out
  – Connect slow EQ input to Ack to help timing, or
  – …to Req to improve performance

Page 14

Delay Asymmetries III

• Signals’ rising/falling edges have different transition delays
  – Req precedes empty, empty precedes Req
  – To avoid malfunction, ctrl-latch always slower than data-latch

Page 15

UE4

• Parallel composition of two WAIT4 -> Up-Edge 4-phase detector
• Inv delay ensures 2nd WAIT4 is closed before 1st is opened
• Use a FF here instead?
  – Doesn’t filter out the metastability

Page 16

UE2

• Detect up & down edges for 2-phase
• Build an Edge 2-phase detector UE2
  – ‘d’ifferent, ‘e’mpty
  – ‘U’ even though it is up-and-down
  – Note resemblance to MT ctrl logic

Page 17

Pipeline Interfaces

• FIFO indicates ready:
  – To receive new Wdat: Wrdy
  – To send new valid Rdat: Rrdy
• Environment enables:
  – Send of new valid Wdat: Wenb
  – Receive of new Rdat: Renb
• Data transfer if both rdy and enb
  – Transfer item every clock
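The rdy/enb rule can be shown with a per-clock trace. This is a minimal sketch with made-up signal values; the helper name `transfers` is hypothetical.

```python
def transfers(rdy, enb):
    """Per-clock flags: an item moves only when both rdy and enb are high."""
    return [r and e for r, e in zip(rdy, enb)]


# Hypothetical write-side trace, one entry per Wclk cycle.
wrdy = [True, True, False, True]   # FIFO ready to receive new Wdat
wenb = [True, False, True, True]   # environment offers new valid Wdat
moved = transfers(wrdy, wenb)
items_written = sum(moved)
```

When both signals stay high, the list is all true, which corresponds to transferring an item every clock.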

Page 18

Read-Interface

• Renb enables Rclk at FF
  – Z empty, Rrdy low, handshake signals equal
  – Z becomes full, Rrdy hi, handshakes differ
  – Upon next Rclk*Renb, FF makes handshakes equal again

Figure annotations:
• Following Rclk*Renb, Z passes new Rdat
• After T(Latch+EQ), X empties into Y
• Handshaking continues … at next Rclk, state repeats itself

Page 19

Write-Interface

• Wenb enables Wclk at data+ctrl FF
  – ‘A’ full, handshake signals differ
  – ‘A’ empty, Wack toggles
  – Upon next Wclk*Wenb, ‘A’ receives new Wdat

1) C filled from B, ack from C waits at UE2 for Wclk
2) After Wclk, B gets ack, ‘A’ filled from outside
3) Handshaking continues … at next Wclk, state repeats itself

Page 20

Integrated Synchronizing Circuit in MT Write Cell

Page 21

Summary

• Pipeline Synchronization
  – High throughput, embedded sync, long interconnect, 2-phase
• The Mousetrap Cell
• Synchronization components
  – WAIT4, UE4, UE2
• Buffer Interfaces
  – Write and Read sections
• MT with integrated sync circuit