Advanced Digital Design Asynchronous EDA by A. Steininger, J. Lechner and R. Najvirt Vienna University of Technology



© A. Steininger & J. Lechner & R. Najvirt / TU Vienna, Lecture "Advanced Digital Design"

Overview

- Synchronous-Asynchronous Direct Translation (SADT)
- Null Convention Logic
- Syntax-Directed Compilation (Balsa)
- Martin Synthesis (Caltech Asynchronous Synthesis Tools)


Synchronous-Asynchronous Direct Translation (SADT)

- Starting point: synchronous circuit description in a standard HDL
- Synthesis with conventional tools into a synchronous gate-level netlist
- Transformation of the synchronous netlist into an asynchronous netlist
- Technology mapping
- Place and route
- Timing verification


De-synchronization

- SADT approach
- Design style: bundled data
- Substitution of flip-flops by latches
- Substitution of the clock by local asynchronous controllers
- De-synchronized circuits ...
  - never halt (liveness)
  - perform the same computations as the synchronous circuit (flow equivalence)


De-synchronization: Conversion steps

1. Conversion of flip-flops to latches
   - each D flip-flop is separated into a master and a slave latch
2. Generation of delay elements for the request signals
   - matched to the length of the critical path of the combinational logic
3. Implementation and wiring of asynchronous latch controllers
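
As a rough illustration of step 2, the matched delay for one combinational block can be derived from the delays of its paths. The Python sketch below is not from the lecture: the function name `matched_delay` and the safety `margin` parameter are assumptions made only for this example.

```python
def matched_delay(path_delays_ns, margin=1.5):
    """Delay to insert on the request line of one combinational block.

    path_delays_ns: delays of all paths through the block (hypothetical input).
    margin: assumed safety factor, so that the request never overtakes the data.
    """
    if not path_delays_ns:
        raise ValueError("need at least one path delay")
    critical_path = max(path_delays_ns)  # the longest path determines the delay
    return critical_path * margin


# The critical path is 2.0 ns; with the assumed margin the request is delayed by 3.0 ns.
print(matched_delay([1.0, 2.0, 1.5]))
```

In a bundled-data design the request must arrive at the receiving latch controller only after the data is stable, which is why the longest path (plus some margin) sets the delay.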


De-synchronization: Circuit Architecture

[Figure: synchronous circuit and the corresponding de-synchronized circuit; Cortadella et al., 06]


De-synchronization: Asynchronous Controllers

- Controller for master/slave latches
- 4-phase protocol
- Different controller implementations with more or less concurrency are possible:
  - Non-overlapping
  - Semi-decoupled 4-phase
  - Fully-decoupled 4-phase
  - De-synchronization control
- More concurrency => faster pipeline
- More concurrency => larger controllers


De-synchronization: Flow Equivalence

Definition: two circuits are flow-equivalent if
- they have the same set of latches, and
- for each latch, the sequence of stored values is the same in both circuits

[Cortadella et al., 06]
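
The definition can be checked mechanically on recorded traces. The following Python sketch is an illustration only: the event format `(time, latch, value)` and the function names are assumptions, not part of the cited work.

```python
from collections import defaultdict


def per_latch_sequences(events):
    """Project (time, latch, value) store events onto each latch, in time order."""
    sequences = defaultdict(list)
    for _, latch, value in sorted(events, key=lambda e: e[0]):
        sequences[latch].append(value)
    return dict(sequences)


def flow_equivalent(events_a, events_b):
    """Same set of latches and, per latch, the same sequence of stored values."""
    return per_latch_sequences(events_a) == per_latch_sequences(events_b)


# Synchronous circuit: both latches store on every clock edge.
sync_events = [(0, "A", 1), (0, "B", 7), (10, "A", 2), (10, "B", 8)]
# De-synchronized circuit: different timing, same value sequence per latch.
desync_events = [(0, "A", 1), (3, "A", 2), (4, "B", 7), (9, "B", 8)]

print(flow_equivalent(sync_events, desync_events))  # True
```

Note that only the order of values per latch is compared. The relative timing between different latches is allowed to differ, which is exactly the freedom de-synchronization exploits.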


De-synchronization: Pros/Cons

Advantages
- Use of standard HDLs
- Use of industrial-strength synthesis tools
- Almost no re-education necessary for hardware designers
- Simple porting of legacy designs
- Negligible area overhead compared to the synchronous implementation

Disadvantages
- 1-to-1 mapping of synchronous circuits can lead to sub-optimal designs


Click Elements

- Published as an implementation style for data-driven compilation (Haste)
- Also useful for implementing asynchronous equivalents of synchronous circuits
- Uses flip-flops for storage
- Most elements are implementable with cells from a standard (synchronous) library
- An arbiter is still required (but not for SADT)


Null Convention Logic: Synthesis

- RTL synthesis
  - Transforms VHDL/Verilog into a 3NCL netlist
  - Netlist contains just AND and INV gates
  - Off-the-shelf synthesis tools
  - NULL values are treated as "don't care"
  - Logic optimizations
- Dual-rail expansion
  - 3NCL netlist to 2NCL netlist
  - DIMS implementation of the AND and INV gates
  - Produces a delay-insensitive circuit
  - Logic optimizations


Dual-Rail NAND

[Figure: DIMS implementation of a dual-rail NAND gate; Ligthart et al., 2000]
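
To make the DIMS idea concrete, here is a behavioural Python sketch of a dual-rail NAND. It follows the usual DIMS structure (one C-element per input combination, OR gates collecting the combinations for each output rail) and is not a transcription of the figure. The class names and the `(true_rail, false_rail)` tuple encoding are assumptions for this example.

```python
NULL, ZERO, ONE = (0, 0), (0, 1), (1, 0)  # assumed (true_rail, false_rail) encoding


class CElement:
    """Muller C-element: output follows the inputs when they agree, else holds."""

    def __init__(self):
        self.out = 0

    def update(self, a, b):
        if a == 1 and b == 1:
            self.out = 1
        elif a == 0 and b == 0:
            self.out = 0
        return self.out


class DimsNand:
    """One C-element per minterm (i, j) of the inputs; NAND is 0 only for (1, 1)."""

    def __init__(self):
        self.cells = {(i, j): CElement() for i in (0, 1) for j in (0, 1)}

    def update(self, a, b):
        m = {}
        for (i, j), cell in self.cells.items():
            rail_a = a[0] if i else a[1]  # true rail for value 1, false rail for 0
            rail_b = b[0] if j else b[1]
            m[(i, j)] = cell.update(rail_a, rail_b)
        out_true = m[(0, 0)] | m[(0, 1)] | m[(1, 0)]
        out_false = m[(1, 1)]
        return (out_true, out_false)


gate = DimsNand()
print(gate.update(ONE, NULL))   # (0, 0): output stays NULL until both inputs are valid
print(gate.update(ONE, ONE))    # (0, 1): NAND(1, 1) = 0
print(gate.update(NULL, ONE))   # (0, 1): output stays valid until both inputs are NULL
print(gate.update(NULL, NULL))  # (0, 0)
```

The hysteresis of the C-elements is what gives the gate its input-completeness in both directions: the output becomes valid only after all inputs are valid, and returns to NULL only after all inputs are NULL.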


Null Convention Logic: Technology Mapping

- The DIMS implementation is inefficient
- Technology mapping onto threshold gates
  - Circuit functionality is fully described by the set function of the DIMS implementation
  - DIMS smoothing: derive a Boolean network representing the set function
  - Threshold gates have a specific set function
  - Perform logic optimization and map the Boolean network to the available threshold gates


Dual-Rail NAND

[Figure: DIMS implementation and its set function; Ligthart et al., 2000]


Null Convention Logic: Threshold Gates

- Library of threshold gates by Theseus
- All unate functions with up to 4 inputs
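
The behaviour of a single threshold gate can be sketched in a few lines of Python. This is a simplified model covering only unweighted m-of-n gates. The class name `ThresholdGate` is an assumption for illustration and does not correspond to a cell name in the library.

```python
class ThresholdGate:
    """m-of-n threshold gate with hysteresis.

    Set function: the output goes to 1 when at least m inputs are 1.
    Reset: the output returns to 0 only when all inputs are 0.
    In every other case the gate holds its previous output.
    """

    def __init__(self, m, n):
        if not 1 <= m <= n:
            raise ValueError("threshold must satisfy 1 <= m <= n")
        self.m, self.n, self.out = m, n, 0

    def update(self, inputs):
        if len(inputs) != self.n:
            raise ValueError("wrong number of inputs")
        ones = sum(inputs)
        if ones >= self.m:
            self.out = 1
        elif ones == 0:
            self.out = 0
        return self.out


th23 = ThresholdGate(2, 3)
print(th23.update([1, 0, 0]))  # 0: threshold not reached
print(th23.update([1, 1, 0]))  # 1: two of three inputs asserted
print(th23.update([1, 0, 0]))  # 1: holds until all inputs are 0
print(th23.update([0, 0, 0]))  # 0
```

With m = n the gate behaves like an n-input C-element, and with m = 1 it behaves like an OR gate, which is how the DIMS building blocks fit into the same family.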


Syntax-Directed Compilation

- 1-to-1 mapping of language constructs to handshake circuit components
- Uses a library of highly optimized standard-cell components for simpler physical synthesis and verification
- Allows an experienced designer to easily envision the resulting circuit, but limits the optimization potential


Balsa: Handshake Circuits

- Approx. 40 handshake components
- Connected over channels
  - With an associated data path
  - Pure control channels (no data transferred)
  - Active ports initiate communication
  - Passive ports respond to requests
- Push channel: data flows from the active to the passive port
- Pull channel: data flows from the passive to the active port


Example: Handshake Components

- Fetch (): transfers data upon request
- Case (@): conditional control flow element

Source: [Balsa Manual]


Example: Modulo-10 Counter

    import [balsa.types.basic]

    type C_size is nibble
    constant max_count = 9

    procedure count10(sync aclk; output count: C_size) is
      variable count_reg : C_size
      variable tmp : C_size
    begin
      loop
        sync aclk ;
        if count_reg /= max_count then
          tmp := (count_reg + 1 as C_size)
        else
          tmp := 0
        end || count <- count_reg ;
        count_reg := tmp
      end -- loop
    end -- begin
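
For readers unfamiliar with Balsa, the intended behaviour of `count10` can be mimicked with a small Python model. This is only a behavioural sketch: it ignores handshaking and concurrency, and it assumes that `count_reg` starts at 0, which the Balsa source does not state.

```python
MAX_COUNT = 9


def count10(ticks, start=0):
    """Return the values sent on `count` for the given number of `aclk` syncs.

    start: assumed initial value of count_reg (not specified in the Balsa code).
    """
    count_reg = start
    outputs = []
    for _ in range(ticks):
        # In Balsa, computing tmp and sending count_reg happen in parallel (||);
        # both read the old count_reg, so their order does not matter here.
        tmp = count_reg + 1 if count_reg != MAX_COUNT else 0
        outputs.append(count_reg)
        count_reg = tmp
    return outputs


print(count10(12))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
```

The temporary variable `tmp` exists in the Balsa version because `count_reg` cannot be read and written in the same parallel step. The Python model keeps it only to mirror the structure.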


[Figure: modulo-10 counter; source: Balsa Manual]


Martin synthesis

- The so-called Martin synthesis process is the seminal work of the asynchronous design group around A. J. Martin at Caltech
- Design entry is CHP (Communicating Hardware Processes), the result is a PRS (production rule set)
- Performs several transformations with designer-modifiable intermediate steps


Process Decomposition

- First transformation
- Reduces processes with complex control structures to simple concurrent subprocesses
- Either syntax-directed (SDD) or data-driven (DDD)


Syntax Directed Decomposition

Rule: A process P containing a construct S can be replaced by two processes P1 and P2 and a new channel C. P1 is P with S replaced by the communication C, and P2 has the form *[[#C -> S; C]].

E.g.

    P:  *[A; *[B1 -> S1 [] B2 -> S2]; B]

    P1: *[A; C; B]
    P2: *[[#C & B1 -> S1
         [] #C & B2 -> S2
         [] #C & ~B1 & ~B2 -> C]]


Data Driven Decomposition

- More fine-grained than SDD
- At the end, clustering can be performed to merge subprocesses again for better performance
- First transformation: conversion to dynamic single assignment (DSA) form, in which each variable can be written only once in each main loop iteration, e.g.:

    *[A?a; X!a; B?a; Y!a]
  becomes
    *[A?a1; X!a1; B?a2; Y!a2]
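
The renaming in this example can be reproduced with a very small Python sketch. It handles only straight-line loop bodies made of receives and sends. Conditionals and assignments, which a real DSA conversion has to deal with, are left out. The tuple representation and the function names are assumptions for illustration.

```python
def to_dsa(actions):
    """Give every receive a fresh variable name; later sends use the newest name.

    actions: list of (kind, channel, variable) with kind "recv" or "send".
    """
    version = {}
    result = []
    for kind, channel, var in actions:
        if kind == "recv":
            version[var] = version.get(var, 0) + 1  # new write, new name
        n = version.get(var, 0)
        result.append((kind, channel, f"{var}{n}" if n else var))
    return result


def render(actions):
    """Print the loop body back in CHP-like notation."""
    symbol = {"recv": "?", "send": "!"}
    body = "; ".join(f"{ch}{symbol[kind]}{var}" for kind, ch, var in actions)
    return f"*[{body}]"


loop = [("recv", "A", "a"), ("send", "X", "a"), ("recv", "B", "a"), ("send", "Y", "a")]
print(render(loop))          # *[A?a; X!a; B?a; Y!a]
print(render(to_dsa(loop)))  # *[A?a1; X!a1; B?a2; Y!a2]
```

After the renaming, `a1` and `a2` no longer share storage, so the two halves of the loop body have no data dependency on each other and can be separated in the later projection step.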


Data Driven Decomposition (2)

- Second transformation is projection
- First, transformations to allow projection, e.g. variable duplication and channel addition:

    *[A?a; x := a, y := ~a; X!x, Y!y]
    *[A?a; a1 := a, a2 := a; x := a1, y := ~a2; X!x, Y!y]
    *[A?a; {Ax!a, Ax?a1}, {Ay!a, Ay?a2}; x := a1, y := ~a2; X!x, Y!y]

- Then projection onto some sets of assignments:

    Sets:        {A?, a, Ax!, Ay!}   {Ax?, a1, x, X!}   {Ay?, a2, y, Y!}
    Projection:  *[A?a; Ax!a, Ay!a],
                 *[Ax?a1; x := a1; X!x],
                 *[Ay?a2; y := ~a2; Y!y]


Handshake Expansion (HSE)

- Each communication channel is replaced by handshake signals, e.g.:

    *[…; C; …],
    *[#C -> …; C]

  is transformed to (4-phase handshake)

    *[…; r := 1; [a]; r := 0; [~a]; …],
    *[r -> …; a := 1; [~r]; a := 0]

- Reshuffling can then be used to increase concurrency/performance (different handshake controllers)
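
The two expanded processes can be run against each other in a small Python simulation. This is a sketch only: the processes are modelled as generators that yield while a wait `[...]` is not satisfied, the surrounding `…` parts are omitted, and the round-robin scheduler in `run` is an assumption that fixes one possible interleaving.

```python
def active(sig, trace):
    """*[r := 1; [a]; r := 0; [~a]]"""
    while True:
        sig["r"] = 1
        trace.append("r+")
        while not sig["a"]:  # [a]
            yield
        sig["r"] = 0
        trace.append("r-")
        while sig["a"]:      # [~a]
            yield


def passive(sig, trace):
    """*[[r]; a := 1; [~r]; a := 0]"""
    while True:
        while not sig["r"]:  # [r]
            yield
        sig["a"] = 1
        trace.append("a+")
        while sig["r"]:      # [~r]
            yield
        sig["a"] = 0
        trace.append("a-")


def run(rounds):
    """Alternate between the two processes and return the signal transitions."""
    sig = {"r": 0, "a": 0}
    trace = []
    processes = [active(sig, trace), passive(sig, trace)]
    for _ in range(rounds):
        for process in processes:
            next(process)
    return trace


print(run(2))  # ['r+', 'a+', 'r-', 'a-']
```

Each complete communication costs four transitions, two of which (`r-` and `a-`) only return the signals to their initial values. Reshuffling moves other actions into this return-to-zero phase.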


Production Rule Expansion (PRE)

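- Transforms HSE to PR in three steps:
  1. State variable insertion
  2. PR generation
  3. Symmetrisation
- Sequencing must be implemented explicitly

HSE:

    *[[Lr]; Rr := 1; [Ra]; Rr := 0; [~Ra]; La := 1; [~Lr]; La := 0]

Direct translation into production rules:

    Lr  -> Rr+
    Ra  -> Rr-
    ~Ra -> La+
    ~Lr -> La-

These rules do not enforce the sequencing: once Ra has risen, Lr is still high, so Rr+ and Rr- are enabled at the same time. A state variable is needed to tell the phases apart.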


After state variable insertion (state variable x) and PR generation:

    *[[Lr]; Rr := 1; [Ra]; Rr := 0; [~Ra]; La := 1; [~Lr]; La := 0]
  becomes
    *[[Lr]; Rr := 1; [Ra]; x := 1; [x]; Rr := 0; [~Ra]; La := 1; [~Lr]; x := 0; [~x]; La := 0]

    ~x & Lr -> Rr+
    Ra      -> x+
    x       -> Rr-
    x & ~Ra -> La+
    ~Lr     -> x-
    ~x      -> La-


After symmetrisation:

    ~x & Lr -> Rr+
    Ra      -> x+
    ~Lr | x -> Rr-
    x & ~Ra -> La+
    ~Lr     -> x-
    Ra | ~x -> La-
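
A quick way to check such a production rule set is to execute it. The Python sketch below fires the six rules after symmetrisation together with a simple 4-phase environment on both sides. The environment rules, the all-zero initial state and the "fire the first enabled rule" policy are assumptions made for this example.

```python
RULES = [
    # Production rules of the circuit: (guard, signal, new value)
    (lambda s: not s["x"] and s["Lr"], "Rr", 1),  # ~x & Lr -> Rr+
    (lambda s: s["Ra"], "x", 1),                  # Ra      -> x+
    (lambda s: not s["Lr"] or s["x"], "Rr", 0),   # ~Lr | x -> Rr-
    (lambda s: s["x"] and not s["Ra"], "La", 1),  # x & ~Ra -> La+
    (lambda s: not s["Lr"], "x", 0),              # ~Lr     -> x-
    (lambda s: s["Ra"] or not s["x"], "La", 0),   # Ra | ~x -> La-
    # Assumed environment: 4-phase handshakes on the left and right side
    (lambda s: not s["La"], "Lr", 1),
    (lambda s: s["La"], "Lr", 0),
    (lambda s: s["Rr"], "Ra", 1),
    (lambda s: not s["Rr"], "Ra", 0),
]


def simulate(max_events):
    """Fire enabled rules one at a time, starting with all signals at 0."""
    state = dict.fromkeys(["Lr", "Rr", "Ra", "La", "x"], 0)
    trace = []
    for _ in range(max_events):
        enabled = [(sig, val) for guard, sig, val in RULES
                   if guard(state) and state[sig] != val]
        if not enabled:
            break
        sig, val = enabled[0]
        state[sig] = val
        trace.append(sig + ("+" if val else "-"))
    return trace


print(simulate(10))
# ['Lr+', 'Rr+', 'Ra+', 'x+', 'Rr-', 'Ra-', 'La+', 'Lr-', 'x-', 'La-']
```

With this environment exactly one rule is enabled in every state, and the trace follows the order of the HSE with the state variable x, after which the system is back in its initial state.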


Summary

- Synchronous-Asynchronous Direct Translation
  - Synthesis with standard tools
  - Synchronous-asynchronous transformation
- Martin Synthesis
  - Process decomposition
  - Handshake expansion
  - Production rule expansion


References

- Jordi Cortadella, Alex Kondratyev, Luciano Lavagno, Christos P. Sotiriou. Desynchronization: Synthesis of Asynchronous Circuits From Synchronous Specifications. 2006
- Alain J. Martin. Programming in VLSI: From Communicating Processes to Self-timed VLSI Circuits. 1987
- Catherine G. Wong and Alain J. Martin. High-Level Synthesis of Asynchronous Systems by Data-Driven Decomposition. 2003
- Ad Peeters, Frank te Beest, Mark de Wit, Willem Mallon. Click Elements – An Implementation Style for Data-Driven Compilation. 2010