Lava Mary Sheeran, Koen Claessen Chalmers University of Technology Satnam Singh, Xilinx

Lava

Mary Sheeran, Koen Claessen

Chalmers University of Technology

Satnam Singh, Xilinx

Lava

Not so much a hardware description language

More a style of circuit description

Emphasises connection patterns

Think of Lego

Behaviour and Structure

[Figure: block f feeding block g]

f ->- g

Parallel Connection Patterns

f -|- g

[Figure: blocks f and g side by side]

map f

[Figure: four copies of f side by side]
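The two connection patterns above can be mimicked with ordinary Haskell functions. The sketch below is not the Lava library: it assumes circuits are plain functions, picks operator precedences arbitrarily, and uses the hypothetical example names `incThenDouble` and `sideBySide`.

```haskell
-- Plain-function model of the connection patterns (an assumption, not the Lava API).
infixl 5 ->-
infixr 6 -|-

-- Serial composition: the output of f feeds g.
(->-) :: (a -> b) -> (b -> c) -> a -> c
(f ->- g) x = g (f x)

-- Parallel composition: f and g work side by side on a pair.
(-|-) :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
(f -|- g) (x, y) = (f x, g y)

-- Example: increment, then double.
incThenDouble :: Int -> Int
incThenDouble = (+ 1) ->- (* 2)

-- Example: two unrelated blocks side by side.
sideBySide :: (Int, Bool) -> (Int, Bool)
sideBySide = (+ 1) -|- not
```

In this model `map f` is simply the Prelude `map`: one copy of `f` per list element.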

Four Sided Tiles

Column

Full Adder in Xilinx Lava

fa (cin, (a,b)) = (sum, cout)
  where
    part_sum = xor (a, b)
    sum = xorcy (part_sum, cin)
    cout = muxcy (part_sum, (a, cin))

[Figure: fa cell with inputs a, b, cin and outputs cout, sum]
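To see why this wiring computes a full adder, the cell can be modelled on plain `Bool` values and compared against integer addition. This is a sketch: the Boolean behaviour given to `xorcy` and `muxcy` is an assumption about the primitives, and `faSpec` and `faMatchesSpec` are hypothetical helper names.

```haskell
-- Boolean stand-ins for the carry-chain primitives (assumed behaviour).
xorcy :: (Bool, Bool) -> Bool
xorcy (li, ci) = li /= ci

-- Carry multiplexer: pass di when the select is low, ci when it is high.
muxcy :: (Bool, (Bool, Bool)) -> Bool
muxcy (s, (di, ci)) = if s then ci else di

fa :: (Bool, (Bool, Bool)) -> (Bool, Bool)
fa (cin, (a, b)) = (s, cout)
  where
    partSum = a /= b
    s       = xorcy (partSum, cin)
    cout    = muxcy (partSum, (a, cin))

-- Reference: count the high bits and split the count into (sum, carry).
faSpec :: (Bool, (Bool, Bool)) -> (Bool, Bool)
faSpec (cin, (a, b)) = (odd total, total >= 2)
  where total = length (filter id [cin, a, b])

-- Compare the cell with the reference on all eight input combinations.
faMatchesSpec :: Bool
faMatchesSpec = and [fa i == faSpec i | i <- inputs]
  where
    bits   = [False, True]
    inputs = [(c, (a, b)) | c <- bits, a <- bits, b <- bits]
```

When `a` and `b` agree the carry out is `a`; when they differ it is `cin`, which is exactly what the multiplexer selects.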

Generic Adder

[Figure: column of fa cells]

adder = col fa

Top Level

adder16Circuit
  = do a <- inputVec "a" (bit_vector 15 downto 0)
       b <- inputVec "b" (bit_vector 15 downto 0)
       (s, carry) <- adder1 (a, b)
       sum <- outputVec "sum" (s++[carry]) (bit_vector 16 downto 0)

? circuit2VHDL "add16" adder16Circuit
? circuit2EDIF "add16" adder16Circuit
? circuit2Verilog "add16" adder16Circuit

114 Lines of VHDL

library ieee ;
use ieee.std_logic_1164.all ;
entity add16 is
  port(a : in std_logic_vector (15 downto 0) ;
       b : in std_logic_vector (15 downto 0) ;
       c : out std_logic_vector (16 downto 0)
      ) ;
end entity add16 ;

library ieee, unisim ;
use ieee.std_logic_1164.all ;
use unisim.vcomponents.all ;
architecture lava of add16 is
  signal lava : std_logic_vector (0 to 80) ;
begin
  ...
  lut2_48 : lut2 generic map (init => "0110")
            port map (i0 => lava(5), i1 => lava(21), o => lava(48)) ;
  xorcy_49 : xorcy port map (li => lava(48), ci => lava(47), o => lava(49)) ;
  muxcy_50 : muxcy port map (di => lava(5), ci => lava(47), s => lava(48), o => lava(50)) ;
  lut2_51 : lut2 generic map (init => "0110")
            port map (i0 => lava(6), i1 => lava(22), o => lava(51)) ;
  xorcy_52 : xorcy port map (li => lava(51), ci => lava(50), o => lava(52)) ;
  muxcy_53 : muxcy port map (di => lava(6), ci => lava(50), s => lava(51), o => lava(53)) ;
  lut2_54 : lut2 generic map (init => "0110")
            port map (i0 => lava(7), i1 => lava(23), o => lava(54)) ;
  ...

EDIF

...
(edif add16
  (edifVersion 2 0 0)
  (edifLevel 0)
  (keywordMap (keywordLevel 0))
  (status
    (written
      (timeStamp 2000 11 19 15 39 43)
      (program "Lava" (Version "2000.14"))
      (dataOrigin "Xilinx-Lava")
      (author "Xilinx Inc.")
    )
  )
...
  (instance lut2_78
    (viewRef prim (cellRef lut2 (libraryRef lava_virtex_lib)) )
    (property INIT (string "6"))
    (property RLOC (string "R-7C0.S1"))
  )
…
  (net lava_bit38
    (joined
      (portRef o (instanceRef muxcy_38))
      (portRef ci (instanceRef muxcy_41))
      (portRef ci (instanceRef xorcy_40))
    )
  )

Xilinx FPGA Implementation

• 16-bit implementation on a XCV300 FPGA

• Vertical layout required to exploit fast carry chain

• No need to specify coordinates in HDL code

16-bit Adder Layout

Four adder trees

No Layout Information

Full Adder in Chalmers Lava

fa (cin, (a,b)) = (sum, cout)
  where
    part_sum = xor2 (a, b)
    sum = xor2 (part_sum, cin)
    cout = mux (part_sum, (a, cin))

[Figure: fa cell with inputs a, b, cin and outputs cout, sum]

import Lava

fa (cin,(a,b)) = (sum,cout)
  where
    part_sum = xor2(a,b)
    sum = xor2(part_sum,cin)
    cout = mux(part_sum,(a,cin))

Main> simulate fa (high,(high,high))
(high,high)

import Lava
import Patterns

fa (cin,(a,b)) = (sum,cout)
  where
    part_sum = xor2(a,b)
    sum = xor2(part_sum,cin)
    cout = mux(part_sum,(a,cin))

add = row fa

Main> simulate add (low,[(low,high),(high,low),(low,high)])
([high,high,high],low)
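The `row` pattern threads the carry through one cell per bit position. The sketch below models it on plain `Bool` values rather than Lava signals; the recursive definition of `row` is an assumption about how the pattern behaves, the least-significant-bit-first ordering is assumed, and `toInt` is a hypothetical helper.

```haskell
-- row threads a carry from left to right through one cell per list element.
row :: ((c, a) -> (b, c)) -> (c, [a]) -> ([b], c)
row _    (carryIn, [])     = ([], carryIn)
row cell (carryIn, x : xs) = (y : ys, carryOut)
  where
    (y, carry)     = cell (carryIn, x)
    (ys, carryOut) = row cell (carry, xs)

-- Full adder on Booleans: (sum, carry out).
fa :: (Bool, (Bool, Bool)) -> (Bool, Bool)
fa (cin, (a, b)) = (partSum /= cin, if partSum then cin else a)
  where partSum = a /= b

add :: (Bool, [(Bool, Bool)]) -> ([Bool], Bool)
add = row fa

-- Hypothetical helper: read a least-significant-bit-first list as a number.
toInt :: [Bool] -> Int
toInt = foldr (\b acc -> fromEnum b + 2 * acc) 0
```

With `False` for low and `True` for high, `add (False, [(False,True),(True,False),(False,True)])` gives `([True,True,True], False)`, matching the simulation above.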

import Lava
import Arithmetic

fa (cin,(a,b)) = (sum,cout)
  where …

checkFullAdd ins = ok
  where
    out1 = fa ins
    out2 = fullAdd ins -- Lava built-in fullAdder
    ok = out1 <==> out2

Main> vis checkFullAdd
Vis: ... (t=0.7) Valid.

.model circuit
.inputs i0
.inputs i1
.inputs i2
.outputs good
.table -> low
0
.latch low initt
.reset initt
1
.table -> w2
1
.table i0 -> w9
0 0
1 1
.table i1 -> w10
0 0
1 1
.table w9 w10 -> w8
0 0 0
0 1 1
1 0 1
1 1 0

…..

.table initt w2 w1_x -> w1
1 - - =w2
0 - - =w1_x
.table w1 -> good
0 0
1 1
.end

In file Verify/circuit.mv

import Lava
import Arithmetic

fa (cin,(a,b)) = (sum,cout)
  where …

checkFullAdd ins = ok
  where
    out1 = fa ins
    out2 = fullAdd ins -- Lava built-in fullAdder
    ok = out1 <==> out2

Main> satzoo checkFullAdd
Satzoo: ... (t=0.1) Valid.

c Generated by Lava2000
c
c i0 : 6
c i1 : 7
c i2 : 8
p cnf 21 57
-5 6 7 0
-5 -6 -7 0
5 -6 7 0
5 -7 6 0
-4 5 8 0
-4 -5 -8 0
4 -5 8 0
4 -8 5 0

…..

11 -12 -21 0
-11 12 0
-11 21 0
1 -2 -11 0
-1 2 0
-1 11 0
-1 0

In file Verify/circuit.cnf

Parsing DIMACS
Solving (randomize) 10000/38 (0.00 %).
Computing static variable order.
restarts : 0
conflicts : 10
learnt_clauses : 5
forgotten_clauses : 3
decisions : 10
propagations : 79
inspect_binary : 49
inspect_normal : 194
inspect_learnt : 2
CPU time : 0 s

UNSATISFIABLE

real 0.1
user 0.0
sys 0.0

In file Verify/circuit.cnf.out
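The link between UNSATISFIABLE and Valid is that the clause set describes the circuit together with the negated property, so no satisfying assignment means no counterexample. The brute-force checker below is only an illustration of that idea on DIMACS-style clauses; it is not how Satzoo works, and `xorClauses` is a hypothetical name for the first four clauses of the file, which appear to constrain variable 5 to be the exclusive-or of variables 6 and 7.

```haskell
import Data.List (nub)

type Clause = [Int]   -- DIMACS style: 3 is variable 3, -3 its negation

-- All variables mentioned in a clause set.
variables :: [Clause] -> [Int]
variables = nub . map abs . concat

-- Every assignment of False/True to the given variables.
assignments :: [Int] -> [[(Int, Bool)]]
assignments = mapM (\v -> [(v, False), (v, True)])

-- A clause holds if at least one of its literals is true.
satisfies :: [(Int, Bool)] -> Clause -> Bool
satisfies env = any litTrue
  where litTrue l = lookup (abs l) env == Just (l > 0)

-- Exhaustive search; fine for toy inputs, exponential in general.
satisfiable :: [Clause] -> Bool
satisfiable cs = any (\env -> all (satisfies env) cs) (assignments (variables cs))

-- First four clauses of the generated file.
xorClauses :: [Clause]
xorClauses = [[-5, 6, 7], [-5, -6, -7], [5, -6, 7], [5, -7, 6]]
```

Adding unit clauses that force variables 5, 6 and 7 all high makes the set unsatisfiable, since an exclusive-or of two high inputs cannot be high.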

Equivalence Checking

[Figure: circuits F and G with their outputs compared: Equal?]

View as property checking

[Figure: circuits F and G with their outputs compared: Equal?]

Synchronous Observer

• Only one language (so easier to use)

• Safety properties

• Used in verification of control programs

[Figure: circuit F observed by Prop, which outputs ok]

Different styles

deland (a,b) = c
  where
    newa = delay low a
    newb = delay low b
    c = and2(newa,newb)

deland1 = (delay low -|- delay low) ->- and2

deland2 = delay (low,low) ->- and2

deland3 = delay zero ->- and2

Simulating sequential circuits

Main> simulateSeq deland [(low,low),(high,low),(high,high),(low,low)]
[low,low,low,high]

Main> simulateSeq deland2 [(low,low),(high,low),(high,high),(low,low)]
[low,low,low,high]
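The simulation results can be reproduced with a small stream model in which a signal is the list of its values, one per clock cycle. This is a sketch under that assumption; the definitions of `delay`, `and2` and `simulateSeq` below are stand-ins written for the example, not the Lava implementations.

```haskell
-- A signal is modelled as the list of its values, one per clock cycle.
-- delay emits its initial value first, then the input one cycle late.
delay :: a -> [a] -> [a]
delay initVal xs = initVal : xs

and2 :: ([Bool], [Bool]) -> [Bool]
and2 (xs, ys) = zipWith (&&) xs ys

-- Delay both inputs by one cycle, then AND them.
deland :: ([Bool], [Bool]) -> [Bool]
deland (xs, ys) = and2 (delay False xs, delay False ys)

-- Feed a list of per-cycle input pairs, keep one output per input cycle.
simulateSeq :: (([a], [b]) -> [c]) -> [(a, b)] -> [c]
simulateSeq circ ins = take (length ins) (circ (unzip ins))
```

Reading `False` as low and `True` as high, the pair (high,high) arrives in the third cycle, so the output goes high only in the fourth.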

Checking equivalence

anddel = and2 ->- delay low

prop_Equivalent circ1 circ2 a = ok
  where
    out1 = circ1 a
    out2 = circ2 a
    ok = out1 <==> out2

Main> vis (prop_Equivalent deland anddel)
Vis: ... (t=0.2) Valid.

Main> smv (prop_Equivalent deland3 anddel)
Smv: ... (t=0.4) Valid.

In Verify/circuit.smv

-- Generated by Lava2000

MODULE main
VAR w1 : boolean;
VAR w2 : boolean;
VAR w3 : boolean;
VAR w4 : boolean;
VAR w5 : boolean;
VAR w6 : boolean;
VAR i0 : boolean;
VAR w7 : boolean;
VAR w8 : boolean;
VAR i1 : boolean;
VAR w9 : boolean;
VAR w10 : boolean;

DEFINE w5 := 0;
DEFINE w6 := i0;
ASSIGN init(w4) := w5;
ASSIGN next(w4) := w6;
DEFINE w8 := i1;
ASSIGN init(w7) := w5;
ASSIGN next(w7) := w8;
DEFINE w3 := w4 & w7;
DEFINE w10 := w6 & w8;
ASSIGN init(w9) := w5;
ASSIGN next(w9) := w10;
DEFINE w2 := !(w3 <-> w9);
DEFINE w1 := !(w2);
SPEC AG w1

Many delays

delayN 0 init = id
delayN n init = delay init ->- delayN (n-1) init

dAnd n = delayN n (low,low) ->- and2
andD n = and2 ->- delayN n low

Main> smv (prop_Equivalent (dAnd 10) (andD 10))
Smv: ... (t=0.9) Valid.

Main> smv (prop_Equivalent (dAnd 15) (andD 15))
Smv: ... (t=2:30.3) Valid.

Same verification for 20 takes more than 35 minutes
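For small sizes, the same equivalence can be checked by simulating every input sequence up to a fixed length. The sketch below uses a list model of signals and is a bounded test, not the proof the model checker gives; `agreeOn` is a hypothetical helper, and delaying a pair is modelled as delaying each component separately.

```haskell
import Control.Monad (replicateM)

delay :: a -> [a] -> [a]
delay initVal xs = initVal : xs

-- delayN n init = delay init ->- delayN (n-1) init, on lists.
delayN :: Int -> a -> [a] -> [a]
delayN 0 _       = id
delayN n initVal = delayN (n - 1) initVal . delay initVal

-- Delay both inputs n cycles, then AND.
dAnd :: Int -> [(Bool, Bool)] -> [Bool]
dAnd n ins = zipWith (&&) (delayN n False xs) (delayN n False ys)
  where (xs, ys) = unzip ins

-- AND first, then delay the result n cycles.
andD :: Int -> [(Bool, Bool)] -> [Bool]
andD n ins = delayN n False (zipWith (&&) xs ys)
  where (xs, ys) = unzip ins

-- Compare the two circuits on every input sequence of the given length.
agreeOn :: Int -> Int -> Bool
agreeOn n len = all (\ins -> dAnd n ins == andD n ins) allInputs
  where
    bits      = [False, True]
    allInputs = replicateM len [(a, b) | a <- bits, b <- bits]
```

The number of sequences grows as 4 to the power of the length, so this brute-force check also becomes impractical quickly, which echoes the sharp growth in verification time reported above.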

Note

Could be viewed as Lustre (or similar) embedded in Haskell

Generic circuits and connection patterns easy to describe (the power of Haskell)

Verify FIXED SIZE circuits (squeezing the problem down into an easy enough one)

Working on lists

[Figure: F on one half of the list, G on the other half]

parl F G = halveList ->- (F -|- G) ->- append

two f

[Figure: two copies of f]

two (two f)

[Figure: four copies of f]

Many twos

twoN 0 circ = circ

twoN n circ = two (twoN (n-1) circ)
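These list patterns can be tried out with plain Haskell lists. The sketch assumes that `halveList` splits a list in the middle and that `two f` applies `f` to each half; both definitions are written for this example rather than taken from the Lava sources.

```haskell
-- Split a list into its first and second halves.
halveList :: [a] -> ([a], [a])
halveList xs = splitAt (length xs `div` 2) xs

-- parl f g: f on the first half, g on the second, results appended.
parl :: ([a] -> [b]) -> ([a] -> [b]) -> [a] -> [b]
parl f g xs = f ys ++ g zs
  where (ys, zs) = halveList xs

-- two f: the same circuit on both halves (assumed definition).
two :: ([a] -> [b]) -> [a] -> [b]
two f = parl f f

-- twoN n circ: 2^n copies of circ, each on an equal slice of the list.
twoN :: Int -> ([a] -> [b]) -> [a] -> [b]
twoN 0 circ = circ
twoN n circ = two (twoN (n - 1) circ)
```

For example, `twoN 2 reverse [1..8]` reverses each of the four adjacent pairs and gives `[2,1,4,3,6,5,8,7]`.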

Interleave

[Figure: two interleaved copies of f]

ilv f = unriffle ->- two f ->- riffle

Many interleaves

ilv (ilv (ilv C))

Many interleaves

ilvN 0 circ = circ

ilvN n circ = ilv (ilvN (n-1) circ)
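Interleaving is easiest to follow on concrete lists. In the sketch below, `unriffle` is assumed to gather the even-indexed elements followed by the odd-indexed ones, and `riffle` to undo that on even-length lists; both are written for this example.

```haskell
-- Even-indexed elements first, then odd-indexed ones.
unriffle :: [a] -> [a]
unriffle xs = evens xs ++ odds xs
  where
    evens (y : ys) = y : odds ys
    evens []       = []
    odds (_ : ys)  = evens ys
    odds []        = []

-- Inverse of unriffle on even-length lists: interleave the two halves.
riffle :: [a] -> [a]
riffle xs = concat (zipWith (\y z -> [y, z]) ys zs)
  where (ys, zs) = splitAt (length xs `div` 2) xs

-- Apply f to each half of the list.
two :: ([a] -> [b]) -> [a] -> [b]
two f xs = f ys ++ f zs
  where (ys, zs) = splitAt (length xs `div` 2) xs

-- ilv f = unriffle ->- two f ->- riffle
ilv :: ([a] -> [a]) -> [a] -> [a]
ilv f = riffle . two f . unriffle

ilvN :: Int -> ([a] -> [a]) -> [a] -> [a]
ilvN 0 circ = circ
ilvN n circ = ilv (ilvN (n - 1) circ)
```

With a two-element circuit, `ilvN n` pairs up elements that are 2^n positions apart: `ilvN 2 reverse [0..7]` swaps each element with the one four places away and gives `[4,5,6,7,0,1,2,3]`.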

Wiring

id2

swap

Butterfly

bfly circ

bfly circ

Defining Butterfly

bfly 0 circ = id

bfly n circ = ilvN (n-1) circ ->- two (bfly (n-1) circ)

Butterfly Layout on an FPGA

Bitonic merger

compInt [x,y] = [imin(x,y), imax(x,y)]

compBit [x,y] = [and2(x,y), or2(x,y)]

Main> simulate (bfly 3 compInt) [1,3,5,7,8,6,4,2]

[1,2,3,4,5,6,7,8]

Main> simulate (bfly 2 compBit) [low,high,low,high]

[low,high,low,high]

Describe connection pattern

Then plug in variety of components, including bit-serial etc. (see work of Vuillemin)

Sorters and mergers verified using 0-1 principle and SAT-solver (see Knuth vol. 3)

Used higher order functions and polymorphism
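The points above can be illustrated with a list model of the butterfly: one polymorphic comparator serves both the integer and the bit version, and the 0-1 principle reduces checking the merger to Boolean inputs. The sketch enumerates those inputs directly instead of calling a SAT solver; `comp`, `zeroOneInputs` and `mergerOk` are hypothetical names, and the input shape (an ascending half followed by a descending half) is an assumption taken from the integer example.

```haskell
import Data.List (sort)

-- Apply f to each half of the list.
two :: ([a] -> [a]) -> [a] -> [a]
two f xs = f ys ++ f zs
  where (ys, zs) = splitAt (length xs `div` 2) xs

-- ilv f: run f on the even-indexed and on the odd-indexed elements.
ilv :: ([a] -> [a]) -> [a] -> [a]
ilv f xs = concat (zipWith (\e o -> [e, o]) (f evens) (f odds))
  where
    indexed = zip xs [0 :: Int ..]
    evens   = [x | (x, i) <- indexed, even i]
    odds    = [x | (x, i) <- indexed, odd i]

ilvN :: Int -> ([a] -> [a]) -> [a] -> [a]
ilvN 0 circ = circ
ilvN n circ = ilv (ilvN (n - 1) circ)

-- bfly n circ = ilvN (n-1) circ ->- two (bfly (n-1) circ)
bfly :: Int -> ([a] -> [a]) -> [a] -> [a]
bfly 0 _    = id
bfly n circ = two (bfly (n - 1) circ) . ilvN (n - 1) circ

-- Two-input comparator for any ordered type; on Bool it is [and, or].
comp :: Ord a => [a] -> [a]
comp [x, y] = [min x y, max x y]
comp xs     = xs

-- 0-1 inputs: an ascending half followed by a descending half.
zeroOneInputs :: Int -> [[Bool]]
zeroOneInputs half =
  [ up ++ down
  | k <- [0 .. half], j <- [0 .. half]
  , let up   = replicate k False ++ replicate (half - k) True
  , let down = replicate j True ++ replicate (half - j) False ]

-- Does the size-2^n merger sort every such 0-1 input?
mergerOk :: Int -> Bool
mergerOk n = all (\xs -> bfly n comp xs == sort xs) (zeroOneInputs (2 ^ (n - 1)))
```

Here `bfly 3 comp [1,3,5,7,8,6,4,2]` yields `[1,2,3,4,5,6,7,8]`, and `mergerOk 3` covers the 25 Boolean inputs of that shape for eight wires.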


Fast Adder

carries

Reduction tree for multiplier

multiply (as,bs) = p1:ss
  where
    ([p1]:[p2,p3]:ps) = prods_by_weight (as,bs)
    is = redArray ps
    ss = binaryAdder ([p2,p3]:is)

redArray = addEmpty ->- row compress ->- first

compress

[Figure: compress; labels: f-cell, other cases, h1-cell, insCh-cell, carries]

possible f-cell

fullAdd

halfAdd cells are similar. This choice gives the multiplier with the shortest wires. Not great!

But we just need to vary these!

insC

insS

fullAdd

Dadda

excellent multiplier, but famous for incomprehensibility, irregularity

fullAdd

Regular reduction tree (Eriksson et al, CE)

Nowhere near as good as Dadda, but inspired this work

fullAdd

fullAdd

Idea: Harden the wiring during circuit generation using clever circuits. Shadow values estimate delay through wires and cells.

fullAdd iddown

cleverInsert

cleverInsert

cleverInsert = row cswap ->- apr

forms necessary wiring based on context (delays on shadow wires)

redArrayD hAdd fAdd iddown hds fds d ps
  = ((redArrayW (combine hAdd (halfAddDelI hds))
                (combine fAdd (fullAddDelI fds))
                (combine iddown (iddownDelI d))
                cleverInsert cleverInsert)
     ->- unmark) (mark ps)

combine c a = unzipp ->- par c a ->- zipp
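One way to read `combine` is that every wire carries a pair of a real value and a shadow delay estimate, with the cell `c` acting on the values and the analysis `a` acting on the shadows. The sketch below models this with plain lists; `unzipp`, `zipp` and `par` are stand-ins written for the example, and `halfAddL` and `uniformDelay` (with its delay figure) are hypothetical.

```haskell
-- Plain-list stand-ins for the combinators used in combine (assumptions).
unzipp :: [(a, b)] -> ([a], [b])
unzipp = unzip

zipp :: ([a], [b]) -> [(a, b)]
zipp (xs, ys) = zip xs ys

par :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
par f g (x, y) = (f x, g y)

-- combine c a = unzipp ->- par c a ->- zipp, written with (.)
combine :: ([a] -> [c]) -> ([b] -> [d]) -> [(a, b)] -> [(c, d)]
combine c a = zipp . par c a . unzipp

-- A half adder on a two-element list: [x, y] -> [sum, carry].
halfAddL :: [Bool] -> [Bool]
halfAddL [x, y] = [x /= y, x && y]
halfAddL xs     = xs

-- Hypothetical delay model: every output is ready d units after the
-- latest input. Assumes as many outputs as inputs, and a non-empty list.
uniformDelay :: Int -> [Int] -> [Int]
uniformDelay d ts = map (const (maximum ts + d)) ts
```

For instance, `combine halfAddL (uniformDelay 2) [(True,0),(True,5)]` gives `[(False,7),(True,7)]`: the values are added while the shadow side records that both outputs settle 2 units after the later input.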

Structure of circuit description (generator) remains unchanged

User explores designs by writing small analysis functions and playing

Main> daddatest 16
[[0,20],[8,45],[40,50],[45,70],[65,75],[70,90],[85,95],[90,100],[95,115],[110,120],[115,120],[115,125],[120,125],[120,140],[135,145],[140,145],[140,145],[140,140],[135,135],[130,130],[125,125],[120,120],[115,115],[110,110],[105,105],[100,90],[85,85],[80,45],[0,40]]

Main> simulate (compareW (daddatestW 16) (daddatest 16)) []
[[0,0],[2,0],[2,0],[2,2],[4,2],[4,4],[6,6],[8,6],[8,8],[10,8],[10,12],[14,12],[14,12],[14,16],[18,16],[18,16],[18,16],[18,18],[20,20],[22,22],[24,22],[24,8],[10,10],[12,12],[14,12],[14,6],[8,6],[8,0],[0,0]]

Main> satzoo (prop_mult multTDMW 8)
Satzoo: ... (t=3:08.8) Valid.

Result

Simple parameterised description of fast adaptive multiplier. Promises to perform well.

Like TDM, except that wire length, and not only gate delay, is taken into account when choosing which connections to make. The description is very similar to that of the basic multiplier.

All done inside Lava. Next step, go the whole hog.

Current uses of Lava

All at low level of design

Teaching and research in formal verification at Chalmers

FPGA cores at Xilinx

Our research on Wired builds upon Lava

Summary

Circuit generators are short and sweet

Formal verification of fixed size circuits

Clever circuits a good idiom

Simple wire and gate delay modelling components can guide synthesis (could be made fancier, collaboration needed)

Design exploration uses Haskell as scripting language (an unexpected bonus)

Need links to lower level tools

Concentration on datapaths