Synthesis of Synchronous Assertions with Guarded Atomic Actions MIT && Bluespec, Inc. guys Presented by: Suman Karumuri Andy Bartholomew

DESCRIPTION

Bluespec is a language based on Haskell for designing hardware that is compiled to Verilog.

Page 1: bluespec talk

Synthesis of Synchronous Assertions with Guarded Atomic Actions

MIT && Bluespec, Inc. guys

Presented by: Suman Karumuri, Andy Bartholomew

Page 2: bluespec talk

Life Cycle of a chip

Page 3: bluespec talk

Quarks to Parallel Universes …

• PFET/NFET
• Transistor (2 FETs)
• NAND / OR / NOT gates.
• Circuits
• Modules
• Integrated Circuits (IC)
• ASICs / Chip

Page 4: bluespec talk

Birth of a chip

• Requirements

• Design

• Coding

• Testing and simulation.

• Formal verification.

• Synthesis

25 % time

75 % time

Page 5: bluespec talk

Birth of a chip

• Requirements
• Design
• Coding
  – HDL, RTL.
  – Verilog HDL
  – VHDL
    • System (transistor level)
    • Behavioral (expressions)
    • Structural (functions)
    • OO (Regular Languages)
  – SystemC
  – Lava, Bluespec (High level languages).
• Testing and simulation
• Formal verification.
• Synthesis

Page 6: bluespec talk

Birth of a chip

• Requirements
• Design
• Coding
• Testing and simulation
  – Software
  – Hardware
• Formal verification
  – Model Checking
  – Proving programs
• Synthesis

3% of test space

This paper

Turing Award 2008

Page 7: bluespec talk

Birth of a chip

• Requirements
• Design
• Coding
• Testing and simulation
• Formal verification
• Synthesis (Burn the design on FPGA)
  – Chip Area
  – Power consumption
  – Minimal number of transistors
  – Speed

Page 8: bluespec talk

Bluespec

Page 9: bluespec talk

Motivation

• SystemC lessons
  – Single assignment.
  – No state.
  – No destructive assignment.
  – Chaining of states.
  – Weak type system.
• Lava lessons.
  – Haskell (functional, monads, polymorphic type inference).
  – Modules.
• Bluespec
  – Full-fledged language instead of Haskell modules.
  – “Behavioral model is atomic actions with guards on state”.
• Data flow model
  – OO for reuse.

Page 10: bluespec talk

Bluespec -> Chip

[Diagram: Extended Haskell → TRS → Verilog or C → RTL synthesis. Annotations: "Concurrency and atomicity", "Correct Programs".]

TRS: Term Rewriting System
RTL: Register transfer language.

Page 11: bluespec talk

Bluespec Language

• Extended Haskell + Bit Vectors (Data types)
• No clocks.
• Modules for OO.
  – Rules
  – Methods
  – Scheduler
• Data Flow language.
  – Guarded atomic actions.

Page 12: bluespec talk

Rules

• Atomic Expressions.
• Execute when the guard is true.
• Run for 1 clock cycle.
• Local to a module (private methods).
• Can call methods.

rule sync_cache(state == Synchronize);          // guard
  case (cache[index]) matches
    tagged Valid {.tag, .data, .isDirty}:
      if (isDirty) begin
        writeToMemory({index, tag}, data);      // method call
      end
    default: noAction;
  endcase
endrule
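
The rule semantics above can be modelled in a few lines of Python. This is only a sketch under assumptions: `Rule`, `step`, `write_back` and the toy cache state are hypothetical names, not Bluespec APIs, and the model fires at most one enabled rule per step (the one-rule-at-a-time reference semantics), not what the generated hardware does.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]

@dataclass
class Rule:
    name: str
    guard: Callable[[State], bool]     # when may the rule fire?
    action: Callable[[State], State]   # returns the complete next state

def step(state: State, rules: List[Rule]) -> State:
    """Fire the first rule whose guard holds; its action is applied atomically."""
    for rule in rules:
        if rule.guard(state):
            return rule.action(dict(state))  # the action works on a copy
    return state  # no rule enabled: the state is unchanged

# Toy analogue of sync_cache: write back a dirty line while in Synchronize.
# Moving to "Ready" afterwards is an assumption made so the example terminates.
def write_back(s: State) -> State:
    if s["dirty"]:
        s["memory"] = s["line"]
        s["dirty"] = False
    s["mode"] = "Ready"
    return s

sync_cache = Rule(
    name="sync_cache",
    guard=lambda s: s["mode"] == "Synchronize",
    action=write_back,
)

before = {"mode": "Synchronize", "dirty": True, "line": 42, "memory": 0}
after = step(before, [sync_cache])
```

Because the action receives a copy, `before` is left untouched: either the whole update becomes visible or none of it does, which is the atomicity the slide refers to.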

Page 13: bluespec talk

Methods

• Set of commands invoked by a rule or other methods.
• Comparable to public methods in C++.
• A method is an Action, a Value, or an ActionValue.

method Action get_data(Address addr) if (state == Ready);   // another way of adding guards
  Index i = get_index(addr);
  case (cache[i]) matches
    tagged Valid {.tag, .data, .isDirty}:
      if (tag == get_tag(addr))
        sendToProc(addr, data);   // hit
      else                        // conflict miss
        getFromMemory(addr);
  endcase
endmethod

Page 14: bluespec talk

Modules

• Consists of Interfaces, Rules and Method implementation.
• Enables Reuse.

interface CacheController;
  method Action get_data(Address addr);
  method Action write_data(Address addr, Value v);
  method Action sync();
  method Action flush();
endinterface

Page 15: bluespec talk

Summary: Bluespec

All state (e.g., Registers, FIFOs, RAMs, ...) is explicit.

Behavior is expressed in terms of guarded atomic actions on the state:
  Rule: condition → action

Rules can manipulate state in other modules only via their interfaces.

[Figure: a module and its interface]

Page 16: bluespec talk

Scheduler

• Generates a static schedule by looking at guard conditions on rules and methods.

• Ensures atomicity.

• Runs non-conflicting rules concurrently.

• Rules are scheduled locally.

• Methods are scheduled globally.

Page 17: bluespec talk

Compiler model

[Diagram: Modules (current state) → Rules 1..n (guard, action) → Scheduler (inputs “CAN_FIRE” 1..n, outputs “WILL_FIRE” 1..n) → Muxing for each state element → Modules (next state)]

Compiler generates a scheduler to pick a non-conflicting subset of “ready” rules
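
One way to picture the CAN_FIRE to WILL_FIRE step is a greedy pick in a fixed priority order. The sketch below is an illustration only: the function `schedule`, the rule names, the hand-written conflict set and the static priority list are all assumptions, whereas the real compiler derives conflicts from the rules themselves.

```python
from typing import Dict, FrozenSet, List, Set

def schedule(can_fire: Dict[str, bool],
             priority: List[str],
             conflicts: Set[FrozenSet[str]]) -> Dict[str, bool]:
    """A rule WILL_FIRE if it CAN_FIRE and does not conflict with any
    higher-priority rule that has already been picked."""
    will_fire: Dict[str, bool] = {}
    chosen: List[str] = []
    for rule in priority:
        ok = can_fire.get(rule, False) and all(
            frozenset((rule, other)) not in conflicts for other in chosen
        )
        will_fire[rule] = ok
        if ok:
            chosen.append(rule)
    return will_fire

# Hypothetical rules: enq and deq update the same state, tick is independent.
conflicts = {frozenset(("enq", "deq"))}
result = schedule(
    can_fire={"enq": True, "deq": True, "tick": True},
    priority=["enq", "deq", "tick"],
    conflicts=conflicts,
)
```

Here `deq` is ready but loses to `enq`, while `tick` fires in the same cycle because it conflicts with neither.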

Page 18: bluespec talk

SVA + Bluespec = BSV

Page 19: bluespec talk

SystemVerilog Assertions (SVA)

• A temporal logic.

• Validate behavior of a design.

• Uses: test benches, formal verifiers, simulation.

• Sequences, Properties.

Page 20: bluespec talk

Sequence

• Simple Sequence

sequence seq;
  (x ##1 y)
  or
  (x ##1 y ##1 z);
endsequence

Page 21: bluespec talk

Sequence

• Simple Sequence

sequence seq;
  (x ##1 y)
  or
  (x ##1 y ##1 z);
endsequence

True on CC1.

Page 22: bluespec talk

Sequence

• Simple Sequence

sequence seq;
  (x ##1 y)
  or
  (x ##1 y ##1 z);
endsequence

True on CC2.
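
To make the clock-by-clock reading concrete, here is a small Python model of this sequence evaluated over a recorded trace. It is a sketch, not SVA tooling: `chain`, `seq` and the trace layout are hypothetical, and cycles are numbered from 0 rather than CC1.

```python
from typing import Dict, List, Set

Trace = List[Dict[str, bool]]  # one dict of sampled signals per clock cycle

def chain(trace: Trace, start: int, signals: List[str]) -> Set[int]:
    """End cycle of `s0 ##1 s1 ##1 ...` started at `start`, or empty if no match."""
    for offset, name in enumerate(signals):
        t = start + offset
        if t >= len(trace) or not trace[t][name]:
            return set()
    return {start + len(signals) - 1}

def seq(trace: Trace, start: int) -> Set[int]:
    """(x ##1 y) or (x ##1 y ##1 z): union of the end points of both branches."""
    return chain(trace, start, ["x", "y"]) | chain(trace, start, ["x", "y", "z"])

trace = [
    {"x": True,  "y": False, "z": False},  # cycle 0
    {"x": False, "y": True,  "z": False},  # cycle 1
    {"x": False, "y": False, "z": True},   # cycle 2
]
ends = seq(trace, 0)
```

On this trace the attempt started at cycle 0 matches twice: the short branch ends at cycle 1 and the long branch at cycle 2.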

Page 23: bluespec talk

Complex Sequence

sequence reqack;
  req && data_in == 0        // first clock cycle
  ##1 data_in > 0 [*3:5]     // starting second clock cycle, for the next 3-5 clock cycles
  ##1 ack && data_in == 0;   // finally data_in is low when we get an ack
endsequence
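
The bounded repetition `[*3:5]` can be read as "try each allowed length". The Python sketch below spells that out for `reqack`; `Sample` and `reqack_ends` are hypothetical names and the trace is invented for illustration.

```python
from typing import List, NamedTuple, Set

class Sample(NamedTuple):
    req: bool
    ack: bool
    data_in: int

def reqack_ends(trace: List[Sample], start: int) -> Set[int]:
    """Cycles at which a match of reqack that began at `start` can end."""
    ends: Set[int] = set()
    if start >= len(trace):
        return ends
    first = trace[start]
    if not (first.req and first.data_in == 0):
        return ends
    for n in range(3, 6):                  # [*3:5]: 3, 4 or 5 repetitions
        last = start + n + 1               # cycle of the closing ack
        if last >= len(trace):
            break
        body = trace[start + 1:start + 1 + n]
        if all(s.data_in > 0 for s in body):
            if trace[last].ack and trace[last].data_in == 0:
                ends.add(last)
    return ends

trace = (
    [Sample(True, False, 0)]               # request, data low
    + [Sample(False, False, 7)] * 4        # four cycles of data
    + [Sample(False, True, 0)]             # ack, data low again
    + [Sample(False, False, 0)]            # idle
)
matches = reqack_ends(trace, 0)
```

With four data cycles only the `n = 4` case matches, so the sequence ends at cycle 5.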

Page 24: bluespec talk

Properties

• Made up of sequences.
• Implication operator |->.
• sequence |-> property

property goodbuffer;
  (req ##1 data_in > 0)
  |-> !fifo_in.full;
endproperty

• sequence |=> property
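
The two implication operators differ only in when the consequent is sampled: `|->` checks it in the cycle where the antecedent ends, `|=>` one cycle later. A minimal Python sketch of that difference for `goodbuffer` follows; the helper names, the trace, and the choice to treat a too-short trace as a pass are assumptions.

```python
from typing import Any, Dict, List, Optional

Trace = List[Dict[str, Any]]

def antecedent_end(trace: Trace, t: int) -> Optional[int]:
    """End cycle of `req ##1 data_in > 0` started at t, or None if it does not match."""
    if t + 1 < len(trace) and trace[t]["req"] and trace[t + 1]["data_in"] > 0:
        return t + 1
    return None

def holds(trace: Trace, t: int, overlapped: bool) -> bool:
    """Evaluate `antecedent |-> !full` (overlapped) or `antecedent |=> !full`."""
    end = antecedent_end(trace, t)
    if end is None:
        return True                  # no match: the property holds vacuously
    check = end if overlapped else end + 1
    if check >= len(trace):
        return True                  # trace too short to decide; treated as a pass here
    return not trace[check]["full"]

trace = [
    {"req": True,  "data_in": 0, "full": False},
    {"req": False, "data_in": 5, "full": False},
    {"req": False, "data_in": 0, "full": True},
]
overlapping = holds(trace, 0, overlapped=True)
non_overlapping = holds(trace, 0, overlapped=False)
```

The same trace passes with `|->` (the FIFO is not full at cycle 1) and fails with `|=>` (it is full at cycle 2).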

Page 25: bluespec talk

Assertions

• Properties are checked via assertions.

always assert property (goodbuffer);

Page 26: bluespec talk

Bluespec SystemVerilog (BSV)

Page 27: bluespec talk

Challenges

• SVA model is clocked.

• Bluespec model is not.

• Some schedules will not be valid; designer intervention required.
  – Achieved through scheduler configuration.

Page 28: bluespec talk

Compiling assertions

• Sequences and properties are compiled into FSMs.

• An assertion is turned into a module.

• Assertions are run as rules.

• We can use the same Bluespec compilation techniques as before.

Page 29: bluespec talk

Compiling sequences

• x ##1 y

[FSM: x → y → end]

Page 30: bluespec talk

Assertions in hardware

• Properties can run across multiple clock cycles.

• In software, we just spawn a concurrent thread to check the assertion.

• You can’t do that in hardware.

• Instead we create multiple copies of the same FSM along the length of the sequence.

Page 31: bluespec talk

Assertions in hardware

• always assert x ##1 y

[Figure: two copies of the FSM x → y → end, combined with "or"; inputs x and y, shown at t=0]

Page 32: bluespec talk

Assertions in hardware

• always assert x ##1 y

[Figure: two copies of the FSM x → y → end, combined with "or"; inputs x and y, shown at t=1]

Page 33: bluespec talk

Assertions in hardware

• always assert x ##1 y

[Figure: two copies of the FSM x → y → end, combined with "or"; inputs x and y, shown at t=2]
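
The idea of keeping several copies of the sequence FSM in flight can be sketched in Python as follows. This is a model under assumptions, not the paper's construction: `check_always` is a hypothetical name, a new attempt is started on every cycle, and each copy is represented only by its position in the sequence.

```python
from typing import Dict, List

SEQUENCE = ["x", "y"]   # x ##1 y: one expected signal per clock cycle

def check_always(trace: List[Dict[str, bool]]) -> List[int]:
    """Return the cycles at which some attempt of `always assert x ##1 y` fails."""
    active: List[int] = []       # position in SEQUENCE of every in-flight copy
    failures = set()
    for cycle, sample in enumerate(trace):
        active.append(0)         # a fresh copy starts on every clock cycle
        survivors: List[int] = []
        for position in active:
            if not sample[SEQUENCE[position]]:
                failures.add(cycle)             # this copy rejects
            elif position + 1 < len(SEQUENCE):
                survivors.append(position + 1)  # advance to the next state
            # otherwise the copy reached `end` and retires successfully
        active = survivors
        assert len(active) < len(SEQUENCE)      # the number of copies stays bounded
    return sorted(failures)

trace = [
    {"x": True, "y": False},   # t=0: first copy sees x
    {"x": True, "y": True},    # t=1: first copy sees y, second copy sees x
    {"x": True, "y": False},   # t=2: second copy expects y and fails
]
failed_at = check_always(trace)
```

The inner assertion documents why this is feasible in hardware: at most one copy can sit at each position, so a sequence of length n never needs more than n copies.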

Page 34: bluespec talk

Composing sequences

• Simple booleans can be generalized into a sequence module.

Page 35: bluespec talk

Other combinations

Page 36: bluespec talk
Page 37: bluespec talk
Page 38: bluespec talk
Page 39: bluespec talk

General model of an assertion

Page 40: bluespec talk

Coverage

• A bunch of productions in SVA are not covered in BSV

• Recursion
  – Would require solving the halting problem to generate FSMs.
  – Can be used when recursion depth can be statically determined.

• Disable iff and other properties.

Page 41: bluespec talk

Case study

Page 42: bluespec talk

functional assertion

• “On a write request only one cache-way is written”

property goodWriteRequest;
  write_request |=>
    if (cache_tag_resp.next_evict_way0)
      isWrite(way0_req) && !isWrite(way1_req)
    else
      isWrite(way1_req) && !isWrite(way0_req);
endproperty

Page 43: bluespec talk

Performance assertion

“When a CPU request is made a cache memory read is made in the same cycle. For read requests, either main memory is read or result returned in next cycle.”

property cpu_read_perf;
  read_request |->
    isRead(way0_req) && isRead(way1_req) && isRead(tag_req)
    ##1 isRead(c2memory_req) || isRead(c2p_data);
endproperty

Page 44: bluespec talk

Statistic-gathering assertion!

• You couldn’t do this before!
• Counting read hits

property count_read_hits;
  read_request |=> isValid(c2p_data);
endproperty

always assert property (count_read_hits)
  read_hits <= read_hits + 1;
else
  read_misses <= read_misses + 1;
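
What the pass and fail actions compute can be checked against a trace with a short Python model. This is a sketch: the function and trace layout are hypothetical, and it counts only cycles in which `read_request` was actually seen, which is an assumption about how vacuous passes are treated.

```python
from typing import List, Tuple

def count_read_hits(trace: List[Tuple[bool, bool]]) -> Tuple[int, int]:
    """trace[t] = (read_request, c2p_data_valid). Returns (read_hits, read_misses)."""
    hits = misses = 0
    for t in range(len(trace) - 1):
        read_request, _ = trace[t]
        if not read_request:
            continue                 # no request: neither a hit nor a miss (assumption)
        _, valid_next = trace[t + 1]
        if valid_next:
            hits += 1                # pass action
        else:
            misses += 1              # fail (else) action
    return hits, misses

trace = [(True, False), (False, True), (True, False), (False, False)]
read_hits, read_misses = count_read_hits(trace)
```

The first request is answered with valid data in the next cycle (a hit), the second is not (a miss).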

Page 45: bluespec talk

Advantages

• Code Reuse. High level semantics.

• High-level programming constructs from Bluespec + the temporal logic à la SVA.

• More tests. Hardware simulation is a lot faster (1000x) than software simulation.

• Dynamic testing.

• Statistics gathering.

Page 46: bluespec talk

Misgivings

• Ad-hoc design (from a theoretical view point)
• Guards may reduce concurrency.
• Correct concurrent behavior can’t be guaranteed.
• No public docs.
• Tweaking scheduler for clocked model can be problematic.
• Subset of SVA is supported.

Page 47: bluespec talk

Extensions

• BSV could be extended to assertions checked at specific times instead of always.
• Further coverage of SVA
• Constraint-guided scheduler.

Page 48: bluespec talk

Compiling Guards

Before compilation

rule r1 (fifo1)
  … do r1
  … call r2

rule r2 (fifo2)
  … do r2

After Compilation

rule r3 (fifo1 and fifo2)
  … do r1
  … do r2

Page 49: bluespec talk

Better model

rule r1
  (if fifo1)
    … do r1
  (if fifo2)
    … do r2

Now another rule can use fifo1 while fifo2 is being used by r1.
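
The difference between the compiled form and the proposed model comes down to where the guards are evaluated. The following Python sketch contrasts the two; both function names are hypothetical and the booleans stand in for "fifo1 is usable" and "fifo2 is usable".

```python
from typing import Dict

def lifted_guard(fifo1_ready: bool, fifo2_ready: bool) -> Dict[str, bool]:
    """Compiled form: one rule guarded by the conjunction of both guards."""
    fire = fifo1_ready and fifo2_ready
    return {"do_r1": fire, "do_r2": fire}

def conditional_guard(fifo1_ready: bool, fifo2_ready: bool) -> Dict[str, bool]:
    """Proposed model: each part of the rule checks only its own guard."""
    return {"do_r1": fifo1_ready, "do_r2": fifo2_ready}

stalled = lifted_guard(True, False)
partial = conditional_guard(True, False)
```

With `fifo2` unavailable the lifted rule does nothing at all, while the conditional version still performs the `fifo1` part. The price is that the two parts no longer fire as one atomic unit.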

Page 50: bluespec talk

Guards

• No correctness guarantees.

• Reduced concurrency.

Solution:

Transactions in Bluespec.

Page 51: bluespec talk

Questions?

Page 52: bluespec talk

BSV code compilation

function Vector#(64, Complex) ifft (Vector#(64, Complex) in_data);
  // Declare vectors
  Vector#(4, Vector#(64, Complex)) stage_data;

  stage_data[0] = in_data;
  for (Integer stage = 0; stage < 3; stage = stage + 1)
    stage_data[stage+1] = stage_f(stage, stage_data[stage]);
  return (stage_data[3]);
endfunction

stage_f can be inlined now. But the number of transistors has tripled.

stage_data[1] = stage_f(0, stage_data[0]);
stage_data[2] = stage_f(1, stage_data[1]);
stage_data[3] = stage_f(2, stage_data[2]);
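
The unrolling step is mechanical, which a short Python sketch can show. Both functions are illustrative stand-ins: `unroll` only prints the statements a static loop elaborates to, and `ifft_shape` mirrors the dataflow with a caller-supplied `stage_f`, not a real IFFT stage.

```python
from typing import Callable, List

def unroll(stages: int) -> List[str]:
    """Emit the straight-line statements that the static loop elaborates to."""
    return [
        f"stage_data[{s + 1}] = stage_f({s}, stage_data[{s}]);"
        for s in range(stages)
    ]

def ifft_shape(in_data: List[complex],
               stage_f: Callable[[int, List[complex]], List[complex]]) -> List[complex]:
    """Same dataflow as the BSV function; stage_f is a caller-supplied stand-in."""
    stage_data = [in_data]
    for stage in range(3):
        stage_data.append(stage_f(stage, stage_data[stage]))
    return stage_data[3]

statements = unroll(3)
shifted = ifft_shape([1, 2], lambda s, v: [x + s for x in v])
```

Each unrolled statement instantiates its own copy of `stage_f`, which is why the slide notes that the hardware cost triples.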

Page 53: bluespec talk

Folding

[Figure: blocks f and g, before and after folding]

Reuse a block over multiple cycles.

We expect:

• Throughput to decrease: less parallelism.
• Area to decrease: reusing a block.

Speeding up the clock to compensate gives a hyper-linear increase in energy.

Page 54: bluespec talk

802.11a Transmitter Synthesis results (Only the IFFT block is changing)

IFFT Design                Area (mm^2)   Throughput Latency (CLKs/sym)   Min. Freq Required
Pipelined                  5.25          04                              1.0 MHz
Combinational              4.91          04                              1.0 MHz
Folded (16 Bfly-4s)        3.97          04                              1.0 MHz
Super-Folded (8 Bfly-4s)   3.69          06                              1.5 MHz
SF (4 Bfly-4s)             2.45          12                              3.0 MHz
SF (2 Bfly-4s)             1.84          24                              6.0 MHz
SF (1 Bfly4)               1.52          48                              12 MHz

TSMC 0.18 micron; numbers reported are before place and route.

The same source code

All these designs were done in less than 24 hours!
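
The "Min. Freq Required" column follows from the cycles-per-symbol column once a symbol period is fixed. The sketch below assumes the 4 microsecond OFDM symbol duration of 802.11a, which is not stated on the slide; `min_freq_mhz` is a hypothetical helper.

```python
SYMBOL_PERIOD_US = 4.0   # 802.11a OFDM symbol duration; an assumption, not from the slide

def min_freq_mhz(clks_per_symbol: int) -> float:
    """Cycles per symbol divided by microseconds per symbol gives MHz."""
    return clks_per_symbol / SYMBOL_PERIOD_US

# (CLKs/sym, Min. Freq in MHz) pairs copied from the table above.
table = {4: 1.0, 6: 1.5, 12: 3.0, 24: 6.0, 48: 12.0}
consistent = all(min_freq_mhz(clks) == mhz for clks, mhz in table.items())
```

Under that assumption every row of the table is reproduced, which is the folding trade-off in numbers: halving the butterflies doubles the cycles per symbol and therefore the clock needed to keep up.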