Lava
Not so much a hardware description language
More a style of circuit description
Emphasises connection patterns
Think of Lego
Full Adder in Xilinx Lava
fa (cin, (a,b)) = (sum, cout)
  where
    part_sum = xor (a, b)
    sum      = xorcy (part_sum, cin)
    cout     = muxcy (part_sum, (a, cin))
[Figure: block fa with inputs a, b, cin and outputs cout, sum]
Top Level
adder16Circuit = do
  a <- inputVec "a" (bit_vector 15 downto 0)
  b <- inputVec "b" (bit_vector 15 downto 0)
  (s, carry) <- adder1 (a, b)
  sum <- outputVec "sum" (s++[carry]) (bit_vector 16 downto 0)
? circuit2VHDL "add16" adder16Circuit
? circuit2EDIF "add16" adder16Circuit
? circuit2Verilog "add16" adder16Circuit
114 Lines of VHDL
library ieee ;
use ieee.std_logic_1164.all ;
entity add16 is
  port(a : in std_logic_vector (15 downto 0) ;
       b : in std_logic_vector (15 downto 0) ;
       c : out std_logic_vector (16 downto 0)
      ) ;
end entity add16 ;

library ieee, unisim ;
use ieee.std_logic_1164.all ;
use unisim.vcomponents.all ;
architecture lava of add16 is
  signal lava : std_logic_vector (0 to 80) ;
begin
  ...
  lut2_48 : lut2 generic map (init => "0110")
                 port map (i0 => lava(5), i1 => lava(21), o => lava(48)) ;
  xorcy_49 : xorcy port map (li => lava(48), ci => lava(47), o => lava(49)) ;
  muxcy_50 : muxcy port map (di => lava(5), ci => lava(47), s => lava(48), o => lava(50)) ;
  lut2_51 : lut2 generic map (init => "0110")
                 port map (i0 => lava(6), i1 => lava(22), o => lava(51)) ;
  xorcy_52 : xorcy port map (li => lava(51), ci => lava(50), o => lava(52)) ;
  muxcy_53 : muxcy port map (di => lava(6), ci => lava(50), s => lava(51), o => lava(53)) ;
  lut2_54 : lut2 generic map (init => "0110")
                 port map (i0 => lava(7), i1 => lava(23), o => lava(54)) ;
  ...
EDIF
...
(edif add16
  (edifVersion 2 0 0)
  (edifLevel 0)
  (keywordMap (keywordLevel 0))
  (status
    (written
      (timeStamp 2000 11 19 15 39 43)
      (program "Lava" (Version "2000.14"))
      (dataOrigin "Xilinx-Lava")
      (author "Xilinx Inc.")
    )
  )
...
  (instance lut2_78
    (viewRef prim (cellRef lut2 (libraryRef lava_virtex_lib)) )
    (property INIT (string "6"))
    (property RLOC (string "R-7C0.S1"))
  )
…
  (net lava_bit38
    (joined
      (portRef o (instanceRef muxcy_38))
      (portRef ci (instanceRef muxcy_41))
      (portRef ci (instanceRef xorcy_40))
    )
  )
Xilinx FPGA Implementation
• 16-bit implementation on a XCV300 FPGA
• Vertical layout required to exploit fast carry chain
• No need to specify coordinates in HDL code
Full Adder in Chalmers Lava
fa (cin, (a,b)) = (sum, cout)
  where
    part_sum = xor2 (a, b)
    sum      = xor2 (part_sum, cin)
    cout     = mux (part_sum, (a, cin))
[Figure: block fa with inputs a, b, cin and outputs cout, sum]
import Lava
fa (cin,(a,b)) = (sum,cout)
  where
    part_sum = xor2(a,b)
    sum      = xor2(part_sum,cin)
    cout     = mux(part_sum,(a,cin))

Main> simulate fa (high,(high,high))
(high,high)
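The full adder above can be checked without Lava by modelling each bit as a Bool. The following is only a sketch: it assumes that mux returns its first data input when the selector is low, and the name faCorrect is introduced here for illustration. It compares the circuit against integer arithmetic for all eight input combinations.

```haskell
-- Plain-Haskell model of the slide's full adder; Bool stands in for a Lava bit.
xor2 :: (Bool, Bool) -> Bool
xor2 (a, b) = a /= b

-- Assumed convention: a low selector picks the first data input.
mux :: (Bool, (Bool, Bool)) -> Bool
mux (sel, (x, y)) = if sel then y else x

fa :: (Bool, (Bool, Bool)) -> (Bool, Bool)
fa (cin, (a, b)) = (s, cout)
  where
    partSum = xor2 (a, b)
    s       = xor2 (partSum, cin)
    cout    = mux (partSum, (a, cin))

-- Exhaustive check (hypothetical helper): a + b + cin == sum + 2 * cout.
faCorrect :: Bool
faCorrect = and
  [ fromEnum a + fromEnum b + fromEnum cin == fromEnum s + 2 * fromEnum cout
  | cin <- [False, True], a <- [False, True], b <- [False, True]
  , let (s, cout) = fa (cin, (a, b)) ]
```

Evaluating fa (True,(True,True)) in this model gives (True,True), matching the simulation result on the slide.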
import Lava
import Patterns

fa (cin,(a,b)) = (sum,cout)
  where
    part_sum = xor2(a,b)
    sum      = xor2(part_sum,cin)
    cout     = mux(part_sum,(a,cin))

add = row fa

Main> simulate add (low,[(low,high),(high,low),(low,high)])
([high,high,high],low)
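To make the row connection pattern concrete, here is a minimal plain-Haskell sketch of it, with Bool standing in for a bit. The definition of row is an assumption modelled on how the slide uses it (carry threaded left to right, least significant bit first); toBits, fromBits and addInts are hypothetical helpers added only to compare against integer addition.

```haskell
-- Full adder in the slide's shape: (carry in, (a, b)) -> (sum, carry out).
fa :: (Bool, (Bool, Bool)) -> (Bool, Bool)
fa (cin, (a, b)) = (s, cout)
  where
    p    = a /= b
    s    = p /= cin
    cout = if p then cin else a

-- Assumed model of the row pattern: thread a carry through a list of cells.
row :: ((c, a) -> (b, c)) -> (c, [a]) -> ([b], c)
row _ (c, [])     = ([], c)
row f (c, x : xs) = (y : ys, cOut)
  where
    (y, c')    = f (c, x)
    (ys, cOut) = row f (c', xs)

add :: (Bool, [(Bool, Bool)]) -> ([Bool], Bool)
add = row fa

-- Hypothetical helpers: little-endian bit vectors of width n.
toBits :: Int -> Int -> [Bool]
toBits n x = [ odd (x `div` 2 ^ i) | i <- [0 .. n - 1] ]

fromBits :: [Bool] -> Int
fromBits = foldr (\b acc -> fromEnum b + 2 * acc) 0

-- Add two n-bit numbers with the ripple-carry adder, carry in low.
addInts :: Int -> Int -> Int -> Int
addInts n x y = fromBits (s ++ [c])
  where (s, c) = add (False, zip (toBits n x) (toBits n y))
```

Because row takes the cell as a parameter, the same pattern works for any cell with a matching carry type, which is the point of describing connection patterns separately from components.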
import Lava
import Arithmetic

fa (cin,(a,b)) = (sum,cout)
  where …

checkFullAdd ins = ok
  where
    out1 = fa ins
    out2 = fullAdd ins   -- Lava built-in fullAdder
    ok   = out1 <==> out2

Main> vis checkFullAdd
Vis: ... (t=0.7) Valid.
.model circuit
.inputs i0
.inputs i1
.inputs i2
.outputs good
.table -> low
0
.latch low initt
.reset initt
1
.table -> w2
1
.table i0 -> w9
0 0
1 1
.table i1 -> w10
0 0
1 1
.table w9 w10 -> w8
0 0 0
0 1 1
1 0 1
1 1 0
…..
.table initt w2 w1_x -> w1
1 - - =w2
0 - - =w1_x
.table w1 -> good
0 0
1 1
.end
In file Verify/circuit.mv
import Lava
import Arithmetic

fa (cin,(a,b)) = (sum,cout)
  where …

checkFullAdd ins = ok
  where
    out1 = fa ins
    out2 = fullAdd ins   -- Lava built-in fullAdder
    ok   = out1 <==> out2

Main> satzoo checkFullAdd
Vis: ... (t=0.1) Valid.
c Generated by Lava2000
c
c i0 : 6
c i1 : 7
c i2 : 8
p cnf 21 57
-5 6 7 0
-5 -6 -7 0
5 -6 7 0
5 -7 6 0
-4 5 8 0
-4 -5 -8 0
4 -5 8 0
4 -8 5 0
…..
11 -12 -21 0
-11 12 0
-11 21 0
1 -2 -11 0
-1 2 0
-1 11 0
-1 0
In file Verify/circuit.cnf

In file Verify/circuit.cnf.out
Parsing DIMACS
Solving (randomize) 10000/38 (0.00 %).
Computing static variable order.
restarts : 0
conflicts : 10
learnt_clauses : 5
forgotten_clauses : 3
decisions : 10
propagations : 79
inspect_binary : 49
inspect_normal : 194
inspect_learnt : 2
CPU time : 0 s

UNSATISFIABLE

real 0.1
user 0.0
sys 0.0
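The first four clauses in the CNF file have the shape of the standard Tseitin encoding of an XOR gate, with variable 5 as the output and 6 and 7 (inputs i0 and i1) as its arguments. The sketch below reproduces those clauses and checks by brute force that they hold exactly when the output equals the XOR of the inputs. The function names are hypothetical, and this is not Lava's own generator code.

```haskell
type Lit = Int        -- DIMACS literal: positive or negated variable number
type Clause = [Lit]

-- Tseitin clauses stating o <-> (a xor b), in the order seen in the file.
xorClauses :: Lit -> Lit -> Lit -> [Clause]
xorClauses o a b = [[-o, a, b], [-o, -a, -b], [o, -a, b], [o, -b, a]]

-- Render a clause as a DIMACS line, terminated by 0.
dimacs :: Clause -> String
dimacs c = unwords (map show (c ++ [0]))

-- A clause holds under an assignment if at least one literal is true.
satisfies :: (Int -> Bool) -> Clause -> Bool
satisfies env = any (\l -> if l > 0 then env l else not (env (negate l)))

-- Brute-force check: all clauses hold exactly when o == (a xor b).
encodingIsXor :: Bool
encodingIsXor = and
  [ all (satisfies env) (xorClauses 5 6 7) == (o == (a /= b))
  | o <- [False, True], a <- [False, True], b <- [False, True]
  , let env v = case v of
          5 -> o
          6 -> a
          _ -> b ]
```

Here map dimacs (xorClauses 5 6 7) yields the lines "-5 6 7 0", "-5 -6 -7 0", "5 -6 7 0" and "5 -7 6 0". The solver reporting UNSATISFIABLE for the whole file corresponds to the property being reported as Valid.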
Synchronous Observer
• Only one language (so easier to use)
• Safety properties
• Used in verification of control programs
[Figure: circuit F and property Prop combined into one observer with a single output ok]
Different styles
deland (a,b) = c
  where
    newa = delay low a
    newb = delay low b
    c    = and2(newa,newb)
deland1 = (delay low -|- delay low) ->- and2
deland2 = delay (low,low) ->- and2
deland3 = delay zero ->- and2
Simulating sequential circuits
Main> simulateSeq deland [(low,low),(high,low),(high,high),(low,low)]
[low,low,low,high]

Main> simulateSeq deland2 [(low,low),(high,low),(high,high),(low,low)]
[low,low,low,high]
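The relationship between the named-wire style and the combinator style can be sketched in plain Haskell by modelling a signal as a list of values, one per clock cycle. Everything below is an assumed model, not Lava's implementation: the Sig type, the list-based delay, and this simulateSeq are illustrative stand-ins that mimic the behaviour shown on the slides.

```haskell
-- A signal is modelled as the list of its values over successive clock cycles.
type Sig a = [a]

-- Serial composition: feed the output of f into g.
(->-) :: (a -> b) -> (b -> c) -> a -> c
f ->- g = g . f

-- Parallel composition: f on the first component, g on the second.
(-|-) :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
(f -|- g) (a, b) = (f a, g b)

-- Unit delay with an initial value.
delay :: a -> Sig a -> Sig a
delay i xs = i : xs

and2 :: (Sig Bool, Sig Bool) -> Sig Bool
and2 (xs, ys) = zipWith (&&) xs ys

-- Named-wire style, as in deland on the slide.
deland :: (Sig Bool, Sig Bool) -> Sig Bool
deland (a, b) = c
  where
    newa = delay False a
    newb = delay False b
    c    = and2 (newa, newb)

-- Combinator style, as in deland1 on the slide.
deland1 :: (Sig Bool, Sig Bool) -> Sig Bool
deland1 = (delay False -|- delay False) ->- and2

-- Run a two-input circuit on one input pair per cycle.
simulateSeq :: ((Sig Bool, Sig Bool) -> Sig Bool) -> [(Bool, Bool)] -> [Bool]
simulateSeq circ ins = take (length ins) (circ (unzip ins))
```

In this model, simulateSeq deland [(False,False),(True,False),(True,True),(False,False)] evaluates to [False,False,False,True], and deland1 gives the same result, which mirrors the slide's output with low and high written as False and True.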
Checking equivalence
anddel = and2 ->- delay low

prop_Equivalent circ1 circ2 a = ok
  where
    out1 = circ1 a
    out2 = circ2 a
    ok   = out1 <==> out2

Main> vis (prop_Equivalent deland anddel)
Vis: ... (t=0.2) Valid.

Main> smv (prop_Equivalent deland3 anddel)
Smv: ... (t=0.4) Valid.
In Verify/circuit.smv
-- Generated by Lava2000
MODULE main
VAR w1 : boolean;
VAR w2 : boolean;
VAR w3 : boolean;
VAR w4 : boolean;
VAR w5 : boolean;
VAR w6 : boolean;
VAR i0 : boolean;
VAR w7 : boolean;
VAR w8 : boolean;
VAR i1 : boolean;
VAR w9 : boolean;
VAR w10 : boolean;

DEFINE w5 := 0;
DEFINE w6 := i0;
ASSIGN init(w4) := w5;
ASSIGN next(w4) := w6;
DEFINE w8 := i1;
ASSIGN init(w7) := w5;
ASSIGN next(w7) := w8;
DEFINE w3 := w4 & w7;
DEFINE w10 := w6 & w8;
ASSIGN init(w9) := w5;
ASSIGN next(w9) := w10;
DEFINE w2 := !(w3 <-> w9);
DEFINE w1 := !(w2);
SPEC AG w1
Many delays
delayN 0 init = id
delayN n init = delay init ->- delayN (n-1) init

dAnd n = delayN n (low,low) ->- and2
andD n = and2 ->- delayN n low

Main> smv (prop_Equivalent (dAnd 10) (andD 10))
Smv: ... (t=0.9) Valid.

Main> smv (prop_Equivalent (dAnd 15) (andD 15))
Smv: ... (t=2:30.3) Valid.
Same verification for 20 takes more than 35 minutes
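For intuition about what is being compared, the two circuits can be modelled on lists and tested on every input sequence up to a small length. This is a sketch only: it is bounded testing in plain Haskell, not the unbounded model checking that SMV performs, and the list-based delay, allInputs and equivUpTo are hypothetical names introduced here.

```haskell
-- Unit delay on a list-modelled signal (one value per clock cycle).
delay :: a -> [a] -> [a]
delay i xs = i : xs

-- n unit delays in series, following the slide's recursive definition.
delayN :: Int -> a -> [a] -> [a]
delayN 0 _ = id
delayN n i = delayN (n - 1) i . delay i

-- Delay both inputs n cycles, then AND; versus AND, then delay n cycles.
dAnd, andD :: Int -> ([Bool], [Bool]) -> [Bool]
dAnd n (xs, ys) = zipWith (&&) (delayN n False xs) (delayN n False ys)
andD n (xs, ys) = delayN n False (zipWith (&&) xs ys)

-- All input sequences of length k (4^k of them).
allInputs :: Int -> [[(Bool, Bool)]]
allInputs k =
  mapM (const [ (a, b) | a <- [False, True], b <- [False, True] ]) [1 .. k]

-- Bounded equivalence: same first k outputs on every length-k input sequence.
equivUpTo :: Int
          -> (([Bool], [Bool]) -> [Bool])
          -> (([Bool], [Bool]) -> [Bool])
          -> Bool
equivUpTo k c1 c2 = and [ run c1 ins == run c2 ins | ins <- allInputs k ]
  where run c ins = take k (c (unzip ins))
```

In this model equivUpTo 5 (dAnd 3) (andD 3) holds. Note that dAnd n contains 2n delay elements and andD n contains n, so the combined system that a model checker explores has 3n state bits.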
Note
Could be viewed as Lustre (or similar) embedded in Haskell
Generic circuits and connection patterns easy to describe (the power of Haskell)
Verify FIXED SIZE circuits (squeezing the problem down into an easy enough one)
Bitonic merger
compInt [x,y] = [imin(x,y), imax(x,y)]
compBit [x,y] = [and2(x,y), or2(x,y)]
Main> simulate (bfly 3 compInt) [1,3,5,7,8,6,4,2]
[1,2,3,4,5,6,7,8]
Main> simulate (bfly 2 compBit) [low,high,low,high]
[low,high,low,high]
Describe connection pattern
Then plug in variety of components, including bit-serial etc. (see work of Vuillemin)
Sorters and mergers verified using 0-1 principle and SAT-solver (see Knuth vol. 3)
Used higher order functions and polymorphism
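The slides do not show how bfly is defined. The sketch below uses one plausible recursive definition (split into even- and odd-indexed elements, recurse on both, interleave, then compare adjacent pairs) that reproduces both simulation results above. It then applies the 0-1 principle to the merger: only inputs made of an ascending 0-1 half followed by a descending 0-1 half need to be tried. The helpers unriffle, riffle, evens, zeroOneInputs and mergesAll01 are assumptions introduced for illustration.

```haskell
import Data.List (sort)

-- Split a list into its even-indexed and odd-indexed elements.
unriffle :: [a] -> ([a], [a])
unriffle (x : y : rest) = (x : xs, y : ys) where (xs, ys) = unriffle rest
unriffle xs             = (xs, [])

-- Interleave two lists; inverse of unriffle for even-length input.
riffle :: ([a], [a]) -> [a]
riffle (x : xs, y : ys) = x : y : riffle (xs, ys)
riffle (xs, _)          = xs

-- Apply a two-element component to each adjacent pair.
evens :: ([a] -> [a]) -> [a] -> [a]
evens f (x : y : rest) = f [x, y] ++ evens f rest
evens _ xs             = xs

-- Assumed butterfly pattern on 2^n inputs, parameterised by the component.
bfly :: Int -> ([a] -> [a]) -> [a] -> [a]
bfly 0 _    = id
bfly n comp = evens comp . riffle . both (bfly (n - 1) comp) . unriffle
  where both f (xs, ys) = (f xs, f ys)

compInt :: [Int] -> [Int]
compInt [x, y] = [min x y, max x y]
compInt xs     = xs

compBit :: [Bool] -> [Bool]
compBit [x, y] = [x && y, x || y]
compBit xs     = xs

-- 0-1 inputs for a merger: ascending half followed by descending half.
zeroOneInputs :: Int -> [[Bool]]
zeroOneInputs half =
  [ asc ++ desc
  | i <- [0 .. half], j <- [0 .. half]
  , let asc  = replicate i False ++ replicate (half - i) True
  , let desc = replicate j True ++ replicate (half - j) False ]

-- The merger on 2^n inputs sorts every such 0-1 input.
mergesAll01 :: Int -> Bool
mergesAll01 n =
  all (\xs -> bfly n compBit xs == sort xs) (zeroOneInputs (2 ^ (n - 1)))
```

The same bfly is used with compInt and compBit, so the connection pattern is written once and the component is plugged in. For fixed n the check is a finite Boolean problem, which is the kind of question that can be handed to a SAT solver instead of being enumerated.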
multiply (as,bs) = p1:ss
  where
    ([p1]:[p2,p3]:ps) = prods_by_weight (as,bs)
    is = redArray ps
    ss = binaryAdder ([p2,p3]:is)
redArray = addEmpty ->- row compress ->- first
Regular reduction tree (Eriksson et al, CE)
Nowhere near as good as Dadda, but inspired this work
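The pattern ([p1]:[p2,p3]:ps) suggests that prods_by_weight groups the partial products a_i AND b_j into columns of equal weight i + j, so the first column holds one product and the second holds two. The following sketch shows that grouping in plain Haskell. The name prodsByWeight, the ordering inside each column, and the helpers toBits and columnValue are assumptions, since the slides do not show the Lava definition.

```haskell
-- Group partial products x_i && y_j by weight i + j (least significant first).
prodsByWeight :: [Bool] -> [Bool] -> [[Bool]]
prodsByWeight xs ys =
  [ [ x && y | (i, x) <- zip [0 ..] xs, (j, y) <- zip [0 ..] ys, i + j == k ]
  | k <- [0 .. length xs + length ys - 2] ]

-- Numeric value of the columns: each high bit in column k counts 2^k.
columnValue :: [[Bool]] -> Int
columnValue cols =
  sum [ 2 ^ k * length (filter id col) | (k, col) <- zip [0 :: Int ..] cols ]

-- Hypothetical helper: little-endian bit vector of width n.
toBits :: Int -> Int -> [Bool]
toBits n x = [ odd (x `div` 2 ^ i) | i <- [0 .. n - 1] ]
```

For two 4-bit operands the column heights are 1, 2, 3, 4, 3, 2, 1. A reduction array then has the job of compressing every column to at most two bits so that a final binary adder can finish the sum.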
[Figure: reduction tree built from fullAdd cells]
Idea: Harden the wiring during circuit generation using clever circuits. Shadow values estimate delay through wires and cells.
[Figure: fullAdd and iddown cells connected through cleverInsert blocks]
redArrayD hAdd fAdd iddown hds fds d ps =
  ((redArrayW (combine hAdd (halfAddDelI hds))
              (combine fAdd (fullAddDelI fds))
              (combine iddown (iddownDelI d))
              cleverInsert cleverInsert)
   ->- unmark) (mark ps)
combine c a = unzipp ->- par c a ->- zipp
Structure of circuit description (generator) remains unchanged
User explores designs by writing small analysis functions and playing
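The idea behind combine can be illustrated with a small plain-Haskell sketch in which every wire carries a pair of a logic value and a shadow value, here an estimated arrival time. The delay figures (20 for sum, 10 for carry) and the names halfAddDel and halfAddD are made up for illustration and are not taken from the slides.

```haskell
-- Run a circuit on the values and an analysis on the shadow values, side by side.
-- Mirrors the slide's shape: unzip, apply c and a in parallel, zip again.
combine :: ([a] -> [b]) -> ([d] -> [e]) -> [(a, d)] -> [(b, e)]
combine c a xs = zip (c vs) (a ds)
  where (vs, ds) = unzip xs

-- Logic of a half adder: [sum, carry].
halfAdd :: [Bool] -> [Bool]
halfAdd [x, y] = [x /= y, x && y]
halfAdd xs     = xs

-- Hypothetical delay model: outputs are ready once both inputs have arrived,
-- plus a per-output gate delay.
halfAddDel :: (Int, Int) -> [Int] -> [Int]
halfAddDel (dSum, dCarry) [tx, ty] = [t + dSum, t + dCarry]
  where t = max tx ty
halfAddDel _ ts = ts

-- Half adder whose wires carry (value, arrival time) pairs.
halfAddD :: [(Bool, Int)] -> [(Bool, Int)]
halfAddD = combine halfAdd (halfAddDel (20, 10))
```

With these assumed delays, halfAddD [(True,0),(True,5)] evaluates to [(False,25),(True,15)]. Because the logic and the delay model are separate arguments, the generator itself stays unchanged when the delay model is swapped for another analysis.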
Main> daddatest 16
[[0,20],[8,45],[40,50],[45,70],[65,75],[70,90],[85,95],[90,100],[95,115],[110,120],[115,120],[115,125],[120,125],[120,140],[135,145],[140,145],[140,145],[140,140],[135,135],[130,130],[125,125],[120,120],[115,115],[110,110],[105,105],[100,90],[85,85],[80,45],[0,40]]

Main> simulate (compareW (daddatestW 16) (daddatest 16)) []
[[0,0],[2,0],[2,0],[2,2],[4,2],[4,4],[6,6],[8,6],[8,8],[10,8],[10,12],[14,12],[14,12],[14,16],[18,16],[18,16],[18,16],[18,18],[20,20],[22,22],[24,22],[24,8],[10,10],[12,12],[14,12],[14,6],[8,6],[8,0],[0,0]]

Main> satzoo (prop_mult multTDMW 8)
Satzoo: ... (t=3:08.8) Valid.
Result
Simple parameterised description of fast adaptive multiplier. Promises to perform well.
Like TDM, except that wire length, and not only gate delay, is taken into account in choosing which connections to make. The description is very similar to that of the basic multiplier.
All done inside Lava. Next step, go the whole hog.
Current uses of Lava
All at low level of design
Teaching and research in formal verification at Chalmers
FPGA cores at Xilinx
Our research on Wired builds upon Lava
Summary
Circuit generators are short and sweet
Formal verification of fixed size circuits
Clever circuits a good idiom
Simple wire and gate delay modelling components can guide synthesis (could be made fancier, collaboration needed)
Design exploration uses Haskell as scripting language (an unexpected bonus)
Need links to lower level tools
Concentration on datapaths