Direct synthesis of large-scale asynchronous controllers using
a Petri-net-based approach
Ivan Blunno Politecnico di Torino
Alex Bystrov Univ. Newcastle upon Tyne
Josep Carmona Univ. Politècnica de Catalunya
Jordi Cortadella Univ. Politècnica de Catalunya
Luciano Lavagno Università di Udine
Alex Yakovlev Univ. Newcastle upon Tyne
Outline
• Motivation
• Design flow
• Verilog HDL specification
• Petri nets and trace expressions
• Synthesis process
• Conclusion
Motivation
• Language-based design is a key enabler of the success of synchronous logic
• Use HDL as a single language for
  • specification
  • logic simulation and debugging
  • synthesis
  • post-layout simulation
• HDL must support multiple levels of abstraction
Motivation
• HDL generates large asynchronous controllers: need direct synthesis
• Guarantee an implementation
• Automatic exploration of the design space
• Benefit from existing structural methods for logic synthesis
• Benefit (at the design stage) from existing performance estimation approaches
Design flow
[Figure: design flow. Blocks: HDL specification; control/data splitting; STG (control); synthesizable HDL (data); synthesis (petrify); synthesis (Synopsys); HDL implementation; logic implementation; timing analysis (Synopsys); logic delays; delay insertion.]
Design flow
• What is available?
  • simulators (no synchronous assumption…)
  • logic synthesis (from BFSM, STG, …)
  • layout (almost like synchronous…)
• What is missing?
  • translator from HDL to synthesis specification model
  • translator from synthesis implementation model to HDL
Other approaches
• Special-purpose languages
  • pros: syntax and semantics can be tailored to asynchronous Models of Computation (STG, BFSM, process algebras)
  • cons: not familiar to designers, no standard tool support
• Examples
  • Tangram
  • Communicating Hardware Processes
  • Balsa
Our approach
• General-purpose language
  • pros: several tools available, broad user base
  • cons: syntax and semantics oriented to gates (not STGs, BFSMs or process algebras)
  • need to define a subset for synthesis (full language only good for simulation)
• Choice
  • VHDL
  • Verilog [Blunno & Lavagno, ASYNC’00]
Outline
• Motivation
• Design flow
→ Verilog HDL specification
• Petri nets and trace expressions
• Synthesis
• Conclusion
Asynchronous Verilog subset
• Module and signal declaration:
    module example(a, b, c, d);
    input a; input [7:0] b;
    output c, d;
    reg e, f; reg [11:0] g;
• Currently only a single module is supported
• An always loop surrounds the live behavior
• An initial block defines the initialization sequence
Asynchronous Verilog subset
• Transitions:
  • input signals: wait statement
    • wait(a); ... wait (!b);
  • output signals: assignment statement
    • c = a + b;
• Each statement generates a trace expression and a datapath fragment
Asynchronous Verilog subset
• Causality relations: Verilog statements
  • begin-end for sequencing
  • fork-join for concurrency
  • if-then-else for input choice
• Only a structured mix of sequencing, concurrency and choice can be specified
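The statement-to-operator mapping listed here can be sketched as a small recursive translation. This is a minimal Python sketch, not the authors' translator: the tuple-based mini-AST, the function name `to_te`, and the use of `+` as the choice symbol are all assumptions made for illustration, and branch conditions are dropped from the output.

```python
# Hypothetical mini-AST for the structured subset (illustrative names only):
#   ("wait", cond)            input transition
#   ("assign", text)          output transition
#   ("begin", [stmts])        sequencing
#   ("fork", [stmts])         concurrency
#   ("if", cond, then, else)  input choice

def to_te(stmt):
    """Return a trace-expression string for one structured statement."""
    kind = stmt[0]
    if kind == "wait":
        return f"wait({stmt[1]})"
    if kind == "assign":
        return stmt[1]
    if kind == "begin":
        return "(" + " ; ".join(to_te(s) for s in stmt[1]) + ")"
    if kind == "fork":
        return "(" + " || ".join(to_te(s) for s in stmt[1]) + ")"
    if kind == "if":
        # '+' is an assumed notation for choice; the condition is not rendered.
        return f"({to_te(stmt[2])} + {to_te(stmt[3])})"
    raise ValueError(f"unsupported statement kind: {kind}")

example = ("begin", [
    ("wait", "start"),
    ("fork", [("assign", "x=1"), ("assign", "y=1")]),
    ("wait", "!start"),
])
print(to_te(example))
```

Because every construct maps to exactly one operator, the output is always a well-structured expression, which is the restriction the last bullet states.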
Example: simple filter
always begin
  wait(start);
  R = SMP * 3;
  RES = SMP * 4;
  if (b7 == 1) RES = 0;
  else begin
    if (b6 == 1) RES = 1;
  end
  done = 1;
  wait(!start);
  done = 0;
end
Control-data partitioning
• Splitting of asynchronous control and synchronous data path
• Automated insertion of bundling delays
[Figure: CONTROL UNIT and DATA PATH blocks, with request, acknowledge and delay labels]
Outline
• Motivation
• Design flow
• Verilog HDL specification
→ Petri nets and trace expressions
• Synthesis
• Conclusion
ACID-WG 2000 Grenoble
Controller design flow
[Figure: controller design flow. Labels: HDL; syntax-directed translation; Petri net; reductions; PNTE; transformations; synthesis; circuit.]
Design flow
[Figure: PNTE design flow. Labels: PNTE; transformations; cost estimation; area estimation (Boolean equations); performance estimation (critical cycles); structural synthesis.]
PNTE
• Free-choice Petri net
• Transitions are trace expressions
• Trace expressions represent well-structured event relations
  – Causality
  – Concurrency
  – Choice
Trace expressions (TE)
TE e
TE ; TE
TE || TE
TE [choice] TE
TE
trace expressions are a subset of CCS agent expressions [Milner 80]
Trace expressions: example
( a || ( b ; c ) ) || ( d [choice] e )
[Figure: syntax tree of the expression, with operator nodes above the leaves a, b, c, d, e]
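To make the meaning of the three operators concrete, the sketch below enumerates the complete traces of the example expression, reading its last operator as choice. It is an illustration under assumptions: the nested-tuple encoding and the tags `"seq"`, `"par"`, `"alt"` are hypothetical, and interleaving semantics is assumed for `||`.

```python
def interleave(u, v):
    """All interleavings of two event tuples."""
    if not u:
        return {v}
    if not v:
        return {u}
    return ({u[:1] + w for w in interleave(u[1:], v)} |
            {v[:1] + w for w in interleave(u, v[1:])})

def traces(te):
    """Set of complete traces of a trace expression (leaf = event name)."""
    if isinstance(te, str):
        return {(te,)}
    op, left, right = te
    lt, rt = traces(left), traces(right)
    if op == "seq":   # causality: left completes before right starts
        return {u + v for u in lt for v in rt}
    if op == "par":   # concurrency: any interleaving
        return set().union(*(interleave(u, v) for u in lt for v in rt))
    if op == "alt":   # choice: exactly one branch is taken
        return lt | rt
    raise ValueError(op)

example = ("par", ("par", "a", ("seq", "b", "c")), ("alt", "d", "e"))
print(len(traces(example)))  # 24
```

The count grows quickly with concurrency, which is one reason to manipulate the expression tree rather than its traces.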
From PN to PNTE
• Reductions to simplify the net structure
• Computing concurrency relations takes
  – O(n^2) in trace expressions
  – O(n^3) in free-choice systems [Kovalyov & Esparza]
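One way to see the quadratic bound for trace expressions: two events are concurrent exactly when the operator at which their branches meet is a parallel one, so each pair is examined once. A minimal sketch, assuming a nested-tuple encoding with hypothetical tags `"seq"`, `"par"`, `"alt"` and distinct event names at the leaves:

```python
def concurrent_pairs(te):
    """Return (events, pairs): the leaves of te and its concurrent event pairs."""
    if isinstance(te, str):
        return [te], set()
    op, left, right = te
    left_events, left_pairs = concurrent_pairs(left)
    right_events, right_pairs = concurrent_pairs(right)
    pairs = left_pairs | right_pairs
    if op == "par":
        # Every event on the left is concurrent with every event on the right.
        pairs |= {frozenset((x, y)) for x in left_events for y in right_events}
    return left_events + right_events, pairs

# Illustrative expression: a ; ((b || f) ; (c ; d))
te = ("seq", "a", ("seq", ("par", "b", "f"), ("seq", "c", "d")))
events, pairs = concurrent_pairs(te)
print(sorted(sorted(p) for p in pairs))
```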
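Reductions
[Figure: two transitions TE1 and TE2 in sequence are merged into one transition TE1 ; TE2]
Reductions
[Figure: two transitions TE1 and TE2 in parallel are merged into one transition TE1 || TE2]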
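The sequential reduction can be sketched on a toy net representation. This is a hedged illustration, not the tool's algorithm: the dictionary layout, the function name `fuse_series`, and the side condition used (the connecting place has no other producer or consumer) are assumptions, and the initial marking is ignored.

```python
def fuse_series(transitions):
    """Fuse one pair t1 -> p -> t2 into a single transition labelled (l1 ; l2).

    transitions: {name: (label, pre_places, post_places)}, places as frozensets.
    Returns a new dict, or the input unchanged if no pair qualifies.
    """
    for n1, (l1, pre1, post1) in transitions.items():
        for n2, (l2, pre2, post2) in transitions.items():
            if n1 != n2 and len(post1) == 1 and post1 == pre2:
                p = next(iter(post1))
                # The connecting place must not touch any other transition.
                others = [n for n, (_, pre, post) in transitions.items()
                          if n not in (n1, n2) and (p in pre or p in post)]
                if not others:
                    fused = dict(transitions)
                    del fused[n1], fused[n2]
                    fused[n1 + n2] = (f"({l1} ; {l2})", pre1, post2)
                    return fused
    return transitions

net = {
    "t1": ("d", frozenset({"p0"}), frozenset({"p1"})),
    "t2": ("a", frozenset({"p1"}), frozenset({"p2"})),
}
print(fuse_series(net))
```

Applying the rule repeatedly shrinks the net while the labels grow into trace expressions.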
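Example
[Figure: a Petri net with transitions a, b, c, d, e, f, g, h is reduced to a PNTE whose transitions carry the trace expressions d ; a ; ( b || f ), c, and g ; h ; e]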
Outline
• Motivation
• Design flow
• Verilog HDL specification
• Petri nets and trace expressions
→ Synthesis
• Conclusion
Exploration of the design space
• Kit of transformations at the Petri net level
  – Concurrency reduction
  – Increase of concurrency
  – Event hiding
• Fast cost estimation
  – Area (Boolean equations)
  – Performance (critical cycles)
Transformations at the net level: concurrency reduction
[Figure: net with events a, f, b, c, d] f and b are concurrent.
[Figure: the same net after the transformation] f and b are ordered.
Transformations at the net level: concurrency reduction in TE
[Figure: the net and its TE syntax tree, with ; and || nodes over the leaves a, b, f, c, d]
Concurrency in TE: b and f have a common parallel ancestor.
Concurrency reduction: replace the parallelizer by a sequencer.
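Replacing the parallelizer by a sequencer is a local rewrite of the expression tree. The sketch below is an assumption-laden illustration: nested tuples with hypothetical tags `"seq"` and `"par"`, distinct event names, and a rewrite that sequences the two whole branches (which orders more than the two named events if the branches contain several).

```python
def leaves(te):
    """Set of event names below a node."""
    return {te} if isinstance(te, str) else leaves(te[1]) | leaves(te[2])

def sequence_events(te, first, second):
    """Turn the '||' node separating `first` and `second` into a ';' node."""
    if isinstance(te, str):
        return te
    op, left, right = te
    if op == "par":
        if first in leaves(left) and second in leaves(right):
            return ("seq", left, right)
        if first in leaves(right) and second in leaves(left):
            return ("seq", right, left)
    return (op,
            sequence_events(left, first, second),
            sequence_events(right, first, second))

# Illustrative expression: a ; ((b || f) ; (c ; d))
te = ("seq", "a", ("seq", ("par", "b", "f"), ("seq", "c", "d")))
print(sequence_events(te, "f", "b"))
```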
Transformations at the net level: increase of concurrency
[Figure: net with events a, f, b, c, d] c is ordered with f and b.
[Figure: the same net after the transformation] c, f and b are concurrent.
Transformations at the net level: increase of concurrency in TE
[Figure: the net and its TE syntax tree before and after the transformation]
Increase of concurrency: reorganize the subtree.
Transformations at the net level: event hiding
[Figure: net with events a, f, b, c, d] b is selected for hiding.
[Figure: net with events a, f, c, d] b is hidden.
Transformations at the net level: event hiding in TE
[Figure: the net and its TE syntax tree before and after the transformation]
Event hiding: delete the corresponding leaf and simplify the tree structure.
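Deleting a leaf and simplifying the tree fits in a few lines. A minimal sketch, assuming nested tuples with hypothetical operator tags and distinct event names; it collapses any operator left with a single operand and does not treat choice nodes specially, which a real implementation would have to consider.

```python
def hide(te, event):
    """Return te with `event` removed, or None if nothing is left."""
    if isinstance(te, str):
        return None if te == event else te
    op, left, right = te
    new_left, new_right = hide(left, event), hide(right, event)
    if new_left is None:      # operator lost its left operand: collapse it
        return new_right
    if new_right is None:     # operator lost its right operand: collapse it
        return new_left
    return (op, new_left, new_right)

# Illustrative expression: a ; ((b || f) ; (c ; d))
te = ("seq", "a", ("seq", ("par", "b", "f"), ("seq", "c", "d")))
print(hide(te, "b"))
```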
Synthesis of control logic
For large-scale controllers:
• Direct translation from Petri Net (or STG-h/s-refined) specifications
• Logic synthesis from fully refined STGs with pseudo-one-hot encoding, structural techniques and STG-level optimisations
Why direct translation?
• Logic synthesis has problems with state space explosion, repetitive and regular structures (log-based encoding approach)
• Direct translation has linear complexity but can be area inefficient (inherent one-hot encoding)
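The area remark can be made concrete by counting state-holding elements. A rough sketch under stated assumptions: direct translation is taken to use one cell per place, the log-based alternative is taken to be a minimum-length binary code, and the comparison only makes sense for a purely sequential controller in which each place is one state. Function names and the size 80 are illustrative.

```python
def one_hot_cells(places):
    """Direct translation: one state-holding cell per place (linear in net size)."""
    return places

def binary_code_bits(states):
    """Minimum number of state bits for a log-based encoding of `states` states."""
    return max(1, (states - 1).bit_length())

print(one_hot_cells(80), binary_code_bits(80))  # 80 7
```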
What about performance?
Shifter example
(x:=y;y:=a)* [Bystrov et al., 6th UK Async Forum, ’99]
Control logic option                                  Speed (ns)
Refined STG directly synthesized by Petrify           5.4
Circuit decomposition with two D-elements             4.2
Circuit decomposition and Petrify re-synthesis        3.3
Re-synthesis with relative timing                     1.7
Direct Translation of Petri Nets
• Previous work dates back to the 70s
• Synthesis into event-based (2-phase) circuits (similar to micropipeline control)
  – S. Patil, F. Furtek (MIT)
• Synthesis into level-based (4-phase) circuits (similar to synthesis from one-hot encoded FSMs)
  – R. David (’69, translation of FSM graphs to CUSA cells)
  – L. Hollaar (’82, translation from parallel flowcharts)
  – V. Varshavsky et al. (’90, ’96, translation from PN into an interconnection of David Cells)
David’s original approach
[Figure: fragment of a flow graph (states a, b, c, d; conditions x1, x’1, x2, x’2) and the CUSA for storing state b (signals ya, yb, yc)]
Hollaar’s approach
[Figure, three animation steps: fragment of a flow-chart (K, L, M, N) and the corresponding one-hot circuit cell (A, B), with signal values shown at successive steps]
Varshavsky’s approach
[Figure, three animation steps: places p1 and p2 and their David cells, with the "Controlled Operation" and "To Operation" labels and signal values shown as the token moves from p1 to p2]
Translation in brief
This method has been used for designing the control of a token ring adaptor [Yakovlev et al., Async. Design Methods, 1995].
The control comprised about 80 David Cells with 50 controlled handshakes.
Direct translation examples
In this work we tried direct translation:
• From an STG-refined specification (VME bus controller)
  – Worse than logic synthesis
• From a largish abstract specification with a high degree of repetition (mod-6 counter)
  – Considerable gain over logic synthesis
• From a small concurrent specification with dense coding space (“butterfly” circuit)
  – Similar or better than logic synthesis
Example 1: VME bus controller
INPUTS: DSr, DSw, LDTACK
OUTPUTS: D, LDS, DTACK
[Figure: STG of the VME bus controller (places p0 to p3) and the David cell circuit derived from it (places p1, p2, p4, pr1 to pr4, pw1 to pw4)]
Result of direct translation (DC unoptimised).
VME bus controller
[Figure: David cell circuit of the VME bus controller (places p1, p2, pr1, pr2, pw1, pw2)]
After DC-optimisation (in the style of Varshavsky et al., WODES’96)
David Cell library
[Figure: library cells DC1 to DC7]
“Data path” control logic
Example of interface with a handshake control (DTACK, DSr/DSw):
[Figure: 3-wire handshake between DTACK and DSr/DSw, with events DTACK+/1 (r), DTACK+/2 (w), DTACK-, DSr+, DSr-, DSw+, DSw-]
Ex 2: “Flat” mod-6 counter
TE-like specification: ((p?;q!)^5 ; p?;c!)*
Petri net (5-safe):
[Figure: Petri net with transitions p?, q!, c! and arc weights 5]
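The specification reads: answer each of five consecutive p? inputs with q!, answer the sixth with c!, then repeat. A minimal behavioural sketch of that reading (the function name `mod6_counter` is hypothetical, and this models only the event order, not the circuit):

```python
def mod6_counter(n_inputs):
    """Return the trace produced for n_inputs consecutive p? events."""
    trace = []
    for i in range(n_inputs):
        trace.append("p?")
        # Every sixth input is answered with c!, all others with q!.
        trace.append("c!" if i % 6 == 5 else "q!")
    return trace

print(" ".join(mod6_counter(6)))
```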
“Flat” mod-6 counter
Refined (by hand) and optimised (by Petrify) Petri net:
[Figure: refined Petri net]
“Flat” mod-6 counter
Result of direct translation (optimised by hand):
[Figure: David cell circuit]
David Cells and timed circuits
[Figure: David cell, (a) speed-independent, (b) with relative timing]
“Flat” mod-6 counter
[Figure: counter circuit, (a) speed-independent, (b) with relative timing]
“Butterfly” circuit
Initial specification:
[Figure: STG with transitions a+, a-, b+, b- and a dummy transition]
STG after CSC resolution:
[Figure: STG with transitions a+, a-, b+, b-, x+, x-, y+, y-, z+, z-]
“Butterfly” circuit
Speed-independent logic synthesis solution:
[Figure: circuit with outputs a, b and internal signals x, y, z]
“Butterfly” circuit
Speed-independent DC-circuit:
[Figure: David cell circuit generating a+, a-, b+, b-]
“Butterfly” circuit
DC-circuit with aggressive relative timing:
[Figure: David cell circuit with signals p, pn, pa1, pa1n, pa2, pa2n, pb1, pb1n, pb2, pb2n, ta1, tb1 and outputs a, an, b, bn]
Comparison with logic synthesis
Example                                     Logic synthesis   DC-translation
VME-bus (overall operation cycle)           6 ns              11 ns
Mod-6 counter (p->q/c, worst case cycle)    >5 ns             1.6 ns
Butterfly (with RT, operation cycle)        2 ns              1.8 ns
DC control with Relative Timing
[Figure: chain of three David cells (DC) controlling operations op1 and op2]
David Cell type                   Token shift time
Speed-independent                 1.2 ns
Mild RT (fast bkwd reset)         0.8 ns
Aggressive RT (fast fwd set)      0.4 ns
Synthesis
• Encoding based on a David-cell approach
• Transformations to improve area and performance
• Structural methods to derive a circuit [Pastor et al., Transactions on CAD, Nov’98]
Synthesis
[Figure: STG with transitions x+, z+, z-, y-, x-, y+ and places p1 to p7, annotated with cubes over the signal values]
Next-state function of signal y?
y = x + z
Synthesis example: VME bus
[Figure: VME bus controller connected to the bus (DSr, DSw, DTACK), the device (LDS, LDTACK) and the data transceiver (D); read cycle waveforms for DSr, LDS, LDTACK, D, DTACK]
Synthesis example: VME bus
READ CYCLE SPECIFICATION
[Figure: STG of the read cycle over DSr, LDS, LDTACK, D, DTACK]
PETRIFY (optimizing performance)
[Figure: the same STG with transitions csc0+ and csc0- inserted]
Synthesis example: VME bus
[Figure: read cycle STG and its encoding with place signals p1 to p11]
[Figure: the encoded STG reduced to the original signals plus p9]
Cost estimation
• Heuristics:
  – AREA: { # literals in each Excitation Region }
  – PERFORMANCE: length of the critical cycle in the net
• Exploration of the design space guided by cost estimations
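The performance heuristic can be illustrated on a small marked graph. The sketch assumes the usual timed marked graph measure, in which a cycle's cost is its total delay divided by the number of tokens it carries and the critical cycle is the one with the largest ratio; the slides only say "length of the critical cycle", so the ratio, the edge delays and the function name are assumptions. Simple cycles are enumerated by brute force, which is only reasonable for tiny examples.

```python
def critical_cycle(edges):
    """edges: {(u, v): (delay, tokens)}. Return (cycle_time, cycle) of the slowest cycle."""
    succ = {}
    for (u, v) in edges:
        succ.setdefault(u, []).append(v)
    best = (0.0, None)

    def dfs(start, node, path, delay, tokens):
        nonlocal best
        for nxt in succ.get(node, []):
            d, t = edges[(node, nxt)]
            if nxt == start:
                if tokens + t > 0:                      # skip token-free cycles
                    ratio = (delay + d) / (tokens + t)
                    if ratio > best[0]:
                        best = (ratio, path + [start])
            elif nxt > start and nxt not in path:       # each cycle found once
                dfs(start, nxt, path + [nxt], delay + d, tokens + t)

    for s in sorted(succ):
        dfs(s, s, [s], 0, 0)
    return best

# Hypothetical delays and initial tokens on the arcs between events a, b, c.
edges = {("a", "b"): (2, 0), ("b", "c"): (3, 0), ("c", "a"): (1, 1), ("b", "a"): (1, 1)}
print(critical_cycle(edges))
```

A transformation that shortens or removes the winning cycle improves the estimate, which is how such a measure can guide the exploration.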
Performance estimation: critical cycles
[Figure: net with events a to k, shown twice to illustrate its marked-graph decomposition]
Marked-Graph Decomposition
Conclusions
• Fully automated design flow
  – From HDLs (control / data splitting)
  – Existing tools for data-path synthesis
  – Direct synthesis guarantees implementation (HDL to Petri net, Petri-net-based encoding)
  – Synthesis of large controllers by efficient spec models (Free-choice Petri nets + trace expressions)
  – Exploration of the design space (optimization) by property-preserving transformations
  – Logic synthesis by structural methods