Coverage-Driven Verification of HDL IP Cores

Metadata of the chapter that will be visualized in OnlineFirst

Book Title: Solutions on Embedded Systems
Series Title: 7818
Chapter Title: Coverage-Driven Verification of HDL IP Cores
Copyright Year: 2011
Copyright Holder: Springer Science+Business Media B.V.

Corresponding Author: S. Saponara, Department of Information Engineering, Universita di Pisa, Pisa, Italy
Author: F. Vitullo, Pisa, Italy
Author: E. Petri, Consorzio Pisa Ricerche - Electronic Systems and Microelectronics Division, Pisa, Italy
Author: L. Fanucci, Pisa, Italy
Author: M. Coppola, AST Grenoble Lab, STMicroelectronics, Grenoble, France
Author: R. Locatelli, AST Grenoble Lab, STMicroelectronics, Grenoble, France

Abstract This chapter addresses the problem of functional verification of IP cores to be integrated in complex embedded systems. After analyzing the limits of methods based on HDL testbenches or formal verification, a pseudo-random coverage-driven approach is presented (verification environment design guidelines together with a final coverage report summary) and applied to a novel Router IP core design, a key component of Network-on-Chip communication infrastructure in embedded systems.

Keywords (separated by '-') Network-on-Chip - Coverage-driven functional verification - Intellectual Property cores - Multi-Processor System-on-Chip


Chapter 8
Coverage-Driven Verification of HDL IP Cores

Case Study of a Router for Network-on-Chip Communication in Embedded Systems

S. Saponara, F. Vitullo, E. Petri, L. Fanucci, M. Coppola and R. Locatelli

8.1 Introduction

Progress in nanometric CMOS technologies and design methodologies has fostered the development of complex digital designs by allowing an increasing number of IP cores to be integrated on a single chip. This trend has made one typical issue of the embedded and digital systems design flow more and more critical: the exhaustive functional verification of complex systems. As system complexity grows, so does the verification effort, which has two crucial points: (i) generating proper test scenarios that stress the key features of the DUT (Design Under Test) and (ii) determining how many different tests are needed to reach enough coverage (code and functional) to assert that the DUT is bug free with respect to the foreseen utilization scenarios. Code coverage is about ensuring that all parts of the RTL netlist (statements, expressions, branches, finite state machine states, block instantiations) have been stimulated by the test vectors, and its automatic measurement is already supported by recent EDA tools for HDL IP simulation; functional verification, by contrast, requires a major verification engineering effort [1–5]. Indeed, functional verification is about (i) capturing the functional behaviour of the DUT from the specification document and (ii) validating that the DUT behaviour is consistent with its specification for all possible working scenarios (input traffic and internal IP state). Note that code and

S. Saponara (corresponding author) · F. Vitullo · L. Fanucci
Department of Information Engineering, Universita di Pisa, Pisa, Italy

E. Petri
Consorzio Pisa Ricerche - Electronic Systems and Microelectronics Division, Pisa, Italy

M. Coppola · R. Locatelli
AST Grenoble Lab, STMicroelectronics, Grenoble, France


M. Conti et al. (eds.), Solutions on Embedded Systems, Lecture Notes in Electrical Engineering 81, DOI: 10.1007/978-94-007-0638-5_8, © Springer Science+Business Media B.V. 2011


functional verification, on which this work is focused, is the first step of the whole testing flow, which entails further steps: IP core emulation on prototyping platforms (typically based on FPGA), back-end gate-level functional/timing verification and, finally, validation and characterization of the implemented integrated circuit. Failure mode analysis and fault-robustness verification [6, 7] (particularly important for designs conceived for harsh environment applications) are orthogonal methods that are not discussed in this work. Today, most of the development time and cost of a new IP core is spent on verification rather than on HDL design. To reduce development time and design cost, the reusability, configurability and scalability of the functional verification environment play a crucial role, also for the subsequent verification steps.

Traditional verification techniques based on directed testbenches (where the test traffic is typically hand-written in HDL as a sequence of input vectors, and the output vectors are calculated a priori and then matched with the ones monitored from the DUT) or on formal demonstrations are inefficient when dealing with complex designs made up of multiple heterogeneous IP cores [1–5]. Directed testbenches are applied to the DUT by means of simulation. They offer a poor level of automation, since most test traffic scenarios are hand-written; even when higher-level programming languages (such as C++ or SystemC) are used to abstract the set of significant stimuli, the problem of checking (i.e. capturing the DUT outputs and establishing whether they are correct or not) remains to be solved. With directed testbenches the output checking tends to be simpler, since the user knows what to expect from the DUT; still, this approach is time consuming and does not scale to complex designs.

Formal verification techniques, on the other hand, are not based on simulation; instead, the verification engineer tries to extract deterministic laws and relationships internal to the DUT from its HDL description. Then, under some hypotheses (typically represented by a set of stimuli) and with the help of an analysis tool, the formal verification approach tries to prove one or more theorems (on those stimuli, the DUT always behaves in the desired manner). This approach does not need simulations (though symbolic simulation, which combines formal techniques with standard simulation, has recently been introduced [1]) and is general enough to also treat corner cases. However, formal verification has proved too complex for medium or large designs because the set of properties that the verification engineer needs to formally demonstrate is huge; the well-known state explosion problem limits model checking, and the cost of theorem proving is prohibitive because of the amount of skilled manual guidance it requires. The limitations of these verification techniques are the lack of reusability and the excessive amount of time, compared to Time-To-Market needs, required to provide a good confidence level that the design is bug free.

To solve the above issues, a coverage-driven methodology for functional verification based on pseudo-random simulations is discussed in Sect. 8.2 and applied to the case study of a Router IP core for Spidergon NoC communication. Section 8.3 briefly describes the NoC approach and presents our novel Spidergon STNoC. Section 8.4 first describes the functionalities of the new Spidergon


STNoC Router IP and its implementation results in 65 nm CMOS technology and then discusses how the verification methodology was customized and applied to it. Section 8.5 comments on the achieved results and draws some conclusions.

8.2 Reusable Methodology for the Functional Verification of Platforms

Though many hybrid verification techniques have been explored in an attempt to combine the strengths of different approaches, an emerging approach for functional verification is constrained random (or pseudo-random) simulation. It is already partially supported by tools and languages such as SystemVerilog or the aspect-oriented [8–10] programming language e (recently formalized in the IEEE 1647 standard) with Specman by Cadence. The Specman tool is the one used in the case study of this work. The basic idea we exploited for functional verification is to build an eVC (Verification Component in the e language) starting from the DUT specifications and ending up with software able to perform the following tasks: generating user-defined traffic patterns to be driven into the DUT; monitoring the DUT outputs and checking them according to the rules programmed in the eVC; parsing the collected outputs into a functional coverage scheme to let the user understand whether all possible cases have been stressed. The last task is very important and enables coverage-driven verification: the user continues developing tests and running simulations until there are no holes left in the defined functional coverage plan. Therefore, the design of pattern generators, addressed in the literature also for the target NoC case study [11, 12], is just one of the steps of a complete verification methodology and on its own is not enough to achieve full functional coverage. Figure 8.1 shows the conceptual organization we followed to design an eVC around the DUT. Note that,

Fig. 8.1 Activities in a functional verification environment


starting from the DUT specifications, not only the eVC architecture but also a test plan and a coverage plan have to be defined.

The philosophy underlying eVCs differs significantly from traditional verification methodologies. Rather than using thousands of directed tests, eVCs employ automatic generation and a coverage-driven methodology. Using automated scenario generation, eVCs can typically achieve higher coverage percentages of the protocol than hand-written tests. For instance, in our work 100% code coverage and functional coverage were achieved by enhancing the test base with a few directed tests, derived after coverage analysis, which exercised the remaining corner cases. To this aim, besides the eVC development, some HDL code and scripts were also developed and HDL probes were defined to increase the observability and controllability of the netlist. An eVC for a given IP core or a given protocol can be thought of as a final product that does not need to be rewritten from project to project, allowing for significant reusability. By following proper coding guidelines and design best practices, the eVC of an IP core can be easily extended, reused and integrated as part of bigger eVCs for more complex designs when moving from module-level to system-level verification. This kind of approach to functional verification overcomes the limitations of traditional verification techniques and improves time-to-market. Figure 8.2 shows the main eVC blocks that have to be designed to properly stimulate and check the DUT. The env block represents an instance of the entire verification environment. The Config unit inside it is the user front-end for the configuration of the environment's attributes and behavior. For each port of the interface, the eVC typically implements an agent, instantiating it in the environment. Agents can emulate the behavior of a legal

Fig. 8.2 Building blocks of an eVC architecture. Annotations in the figure:

• Agents are the main modules of an eVC.
• The Signal map allows the agent to bind to the DUT's RTL (to drive and read signals).
• Adapt the agent to different verification and/or DUT needs.
• Monitor capabilities are common to Passive and Active agents.
• Coverage definitions.
• Checker module: what the DUT must/must not do.
• A Sequence is a parameterized set of stimuli to be sent to the DUT: a bus transaction or burst, an instruction to a CPU, an error injection, […]
• The Bus Functional Model actually drives the stimuli to the DUT's RTL.
• Sequences may be composed to easily build complex scenarios.


device, and they have standard construction and functionality; they are usually bound to a subset of the DUT port map and are built on top of building blocks that implement agent-specific functionality:

• Config: a group of fields that allow configuration of the agent's attributes and behavior.
• Signal Map: a unit that contains external ports for each of the HW signals that the agent must access as it interacts with the DUT.
• Sequence Driver: a unit instance that serves as a coordinator for running user-defined tests; traffic patterns are implemented as sequences (seq).
• BFM (Bus Functional Model): a unit instance that interacts with the DUT and both drives and samples the DUT signals.
• Monitor: a unit instance that passively monitors (samples) the DUT signals and supplies interpretation of the monitored activity to the other components of the agent. Monitors can emit events when they notice interesting things happening in the DUT or on the DUT interface. They can also check for correct behavior or collect coverage data.
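To make the agent decomposition more concrete, the following is a minimal sketch in Python rather than in the e language used in this work. All class names, fields and the 8-bit flit values are hypothetical and chosen only for illustration; the sketch shows how a sequence driver, a BFM and a monitor can cooperate around a shared signal map, and how a passive agent simply omits the driving part.

```python
import random
from dataclasses import dataclass


@dataclass
class Config:
    """Agent attributes (all fields are illustrative assumptions)."""
    active: bool = True   # a passive agent only monitors
    max_len: int = 4      # longest sequence, in flits
    seed: int = 1         # seed of the pseudo-random generator


class SignalMap:
    """Stands in for the DUT signals the agent is bound to."""
    def __init__(self):
        self.valid = 0
        self.data = 0


class Monitor:
    """Passively samples the signal map and notifies listeners."""
    def __init__(self, signals):
        self.signals = signals
        self.listeners = []   # callbacks, e.g. checkers or coverage
        self.seen = []

    def sample(self):
        if self.signals.valid:
            self.seen.append(self.signals.data)
            for callback in self.listeners:
                callback(self.signals.data)


class BFM:
    """Drives one flit per cycle onto the signal map."""
    def __init__(self, signals):
        self.signals = signals

    def drive(self, flit):
        self.signals.valid = 1
        self.signals.data = flit

    def idle(self):
        self.signals.valid = 0


class SequenceDriver:
    """Produces pseudo-random sequences within the configured limits."""
    def __init__(self, cfg):
        self.cfg = cfg
        self.rng = random.Random(cfg.seed)

    def random_sequence(self):
        length = self.rng.randint(1, self.cfg.max_len)
        return [self.rng.randrange(256) for _ in range(length)]


class Agent:
    def __init__(self, cfg):
        self.cfg = cfg
        self.signals = SignalMap()
        self.monitor = Monitor(self.signals)
        if cfg.active:                      # passive agents get no BFM
            self.bfm = BFM(self.signals)
            self.driver = SequenceDriver(cfg)

    def run_sequence(self):
        """Drive one random sequence, sampling the signals every cycle."""
        seq = self.driver.random_sequence()
        for flit in seq:
            self.bfm.drive(flit)
            self.monitor.sample()
        self.bfm.idle()
        self.monitor.sample()               # idle cycle: nothing recorded
        return seq
```

In this toy setting, `Agent(Config(seed=7)).run_sequence()` returns the driven flits, and the agent's `monitor.seen` list holds the same values, since the monitor observes exactly what the BFM placed on the signal map.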

8.3 Spidergon NoC Communication Design

The Network-on-Chip (NoC) is an emerging design paradigm for building scalable packet-switched communication infrastructures connecting hundreds of IP cores. Its use in Multi-Processor Systems-on-Chip (MPSoCs), which strongly depend on the on-chip communication architecture, will overcome the scalability limitations of traditional solutions such as point-to-point communication and centralized on-chip buses [13–16]. In fact, as the number of components grows, traditional interconnect systems can degrade the global system performance in terms of area, power or throughput [15]. Beyond that, the floor-planning of long communication wires in the presence of so many IP cores is also very problematic, because bad wire routing lowers circuit timing performance. NoCs adopt some of the networking ISO-OSI abstraction layers (physical, data-link, network, transport) to decouple the design of the IP cores from the physical implementation of the interconnect infrastructure, thus speeding up the design flow and increasing scalability and reusability (see Fig. 8.3). A NoC is a distributed network consisting of a few main building blocks, the most important being the Network Interface (NI) and the Router (R). The way these blocks are interconnected determines the NoC topology (e.g. 2D mesh and Ring have been proposed in the past), which is responsible for packet switching efficiency between the start and the end point of a communication and must be designed to avoid traffic congestion. The Quality of Service (QoS) applied to the routing policy of packets is also a key element.

The NoC infrastructure we designed, called Spidergon STNoC, is based on the vertex-symmetric topology shown in Fig. 8.3b (note that the Spidergon topology includes the Ring one). In our design, Routers are in charge of delivering packets towards the destination IP while providing buffering and QoS services by means of




traffic arbitration policies and link scheduling on packets crossing the network. All issues related to security, error management and clock frequency, bus size and protocol conversions are managed at the network boundaries by the NIs. Indeed, the NIs are the peripheral building blocks of the NoC: they decouple computation from communication and provide protocol abstraction by encoding in the packet header all the data needed to guarantee successful end-to-end data delivery between cores (transport layer) and all the QoS information needed by the router at the network layer. Figure 8.4 shows the format of the STNoC packet, whose header and payload data are physically split into header and payload flits; they are all routed through the same path across the network.

Such an architecture avoids many of the problems affecting older communication systems. Since it is a distributed network, data packets move from one building block to another without ever being carried by long physical wires, which are in effect split and distributed along the network. Each NI collects traffic from the core it is connected to, independently from the others, and then converts such traffic into packets, sending them to the network of routers, where they move along

Fig. 8.4 Spidergon STNoC packet format

Fig. 8.3 a Internet ISO-OSI layers and mapping onto NoC, b Spidergon STNoC architecture (figure labels: Spidergon Platform, R, NI, IP)


different paths. This allows parallel communication flows, which were not possible with shared on-chip bus lines. The Spidergon STNoC is also suitable for implementing a globally-asynchronous-locally-synchronous (GALS) communication paradigm [17, 18]. First of all, this means that long global clock wiring is not needed; furthermore, each part of the SoC can be fed by a different clock source. It is the task of the distributed network to synchronize one clock domain with another by means of special modules in its building blocks. In this scenario the distributed network decouples the working frequencies of the plugged IP cores/sub-systems. Different kinds of links (synchronous, mesochronous and asynchronous) have been designed for the Spidergon STNoC [17]. The problems avoided by the NoC do not come for free: the price to pay is the complexity of the network. It is therefore important that, in the whole computing system, the NoC contributes only a small percentage of the overall area and power consumption. Thus the Spidergon STNoC has been conceived to be scalable, depending on the surrounding computing infrastructure, in terms of topology, number of building blocks and implemented services. The Spidergon STNoC has also been designed to provide compatibility with established bus standards widely adopted in IP cores such as RISC processors, microcontrollers and DSPs (e.g. STBus and AXI NIs have been designed). This feature ensures a smooth transition from old bus-centric systems to MPSoCs with the novel NoC interconnect.

8.4 Verification Environment for the Spidergon STNoC Router IP

The functional verification of the Spidergon STNoC building blocks has been carried out by building a coverage-driven simulation environment. In this case study, the DUT is a Spidergon STNoC Router IP core, but the same approach has been followed for NIs and links. In this Section, the functional features of the designed Router and its implementation results in submicron CMOS technology are described first, and then the design of the verification environment is presented.

The Router architecture has been defined according to a parametric and modular approach using the VHDL language. The Spidergon router (see Fig. 8.5) can be connected through two unidirectional links with three other routers in the directions Right (R), Left (L) and Across (A), plus a fourth connection to the local Network Interface, used as the network entry/exit point. The physical link consists of two unidirectional data channels, Downstream (DS) and Upstream (US), with the relevant handshake signals to realize a credit-based hop-by-hop flow control (val and credit in Fig. 8.5). The router adopts wormhole packet-switching, where a packet is subdivided into flits and all of them follow the same path reserved for the header. The routing algorithm is deterministic, so that the same path is always chosen between a source and a destination node, even if multiple paths exist. This choice avoids costly flit reordering at packet reception. The idea is to move along the ring, in the proper direction, to reach nodes which are near the source node,




using the Across link as first or last hop to jump to a part of the network that is too far away. The router uses simple source-based routing: the entire path is encoded in the packet header, so each router only has to extract the forwarding information, without any computation or look-up table. The routing scheme, along with a proper QoS scheduling policy, is free of starvation issues. The router also avoids deadlock by deploying Virtual Networks (VNs in Figs. 8.5 and 8.6, also

Fig. 8.5 S-STNoC router ports breakdown and Downstream (DS) and Upstream (US) channels with the relevant handshake signals

Fig. 8.6 Environment adopted for the Router functional verification




called Virtual Channels, VCs). VNs provide logical links over the same shared physical channels by establishing a number of independently allocated flit buffers in the corresponding transmitter/receiver nodes. Currently the request and response logical paths are implemented on top of two disjoint VNs, which share the physical link bandwidth and maximize wire efficiency. The parametric number of VNs supported by the router can enable advanced routing schemes or independent QoS traffic classes for real-time and low-latency flows. The credit-based flow control works on a per-flit basis. Flits can be sent in the US direction only if there are enough credits, i.e. the DS interface of the receiving component has enough free locations in its input buffer to store the incoming flits. Output Queues on US ports can be instantiated for enhanced performance, avoiding head-of-line blocking. Queues are shared among input flows to limit costly time/space speed-up factors, and they have a bypass feature to reduce the minimum router crossing latency under low traffic conditions. The architecture also supports not instantiating the Output Queue for low-cost implementations, when performance or traffic types do not require output buffering. It is optionally possible to instantiate a separate Output Queue for each input port directed to that output. This configuration increases global network performance when a lot of traffic is concentrated towards the considered output. The applied QoS mechanism is the Fair Bandwidth Allocation (FBA). It allows for a flexible, scalable and low-cost management of the allocation of the available bandwidth. The requested bandwidth value is programmed at the injection point (Network Interface) and, unlike in other NoC architectures, is not explicitly linked to the path of a data flow through the router. It avoids complexity inside the router by providing all necessary information in the network header and limiting the router behavior to a simple two-step arbitration. When all data flows have the same bandwidth reservation, the arbitration algorithm becomes one of the following: Round Robin (RR), Least Recently Used (LRU) or fixed priority schemes, configurable by the user.

The router has been implemented for different configurations in different (90 nm, 65 nm and 45 nm) STMicroelectronics CMOS standard-cell technologies, always achieving an optimal trade-off between performance and complexity. As an example, in 65 nm 1.1 V standard-cell CMOS technology a full Router configuration with all 4 ports enabled (Spidergon topology), each with 2 VNs, a size of 72 bits on VN1 (request path) and 64 bits on VN2 (response path), input buffers (IB) and output queues (OQ) able to store 4 and 5 flits respectively, LRU arbitration and FBA management, has a circuit complexity of roughly 70 Kgates (including the flip-flop implementation of the IB and OQ memory resources). It achieves a clock frequency of 500 MHz, i.e. at least 32 Gbps data transfer for US and DS channels, with a low-leakage library version ensuring a static power consumption of less than 100 μW. By using a standard-cell library version optimized for high speed, with the same IP configuration and CMOS technology node, clock frequencies up to 1 GHz are met, i.e. up to 64 Gbps data transfer per channel, but with an increased static power of 1 mW. Obviously, changing the Router configuration leads to different results: as an example, a basic Router with 3 ports (Ring topology without the Across link), 36-bit size for the flits, 1 VN,




no OQs instantiated, has a circuit complexity lower than 9 Kgates and achieves a clock frequency up to 1 GHz, i.e. at least 36 Gbps data rate, with a static power consumption of roughly 200 μW (the static power is 10 μW if targeting a 500 MHz frequency).

Proper component operation should be assessed for every Router configuration. However, only a subset of all possible configurations has actually been used for assembling platforms to synthesize. For such Router configurations a full regression set of test simulations has been carried out to check correct component operation when stressed with several different traffic scenarios. The DUT of the simulation was the single Router block. Figure 8.6 shows a sample Router DUT surrounded by the corresponding verification environment; in this example, a 4-port Router with 2 VNs and both US/DS directions on each port is considered. Each Router port has its own BFM units which drive and monitor traffic (master agents are connected to DS ports and slave agents are connected to US ports). All BFM units are connected to both the monitor unit and the checker unit. The former is in charge of protocol checking and data coverage, the latter implements a scoreboard for checking correct routing and other traffic properties. At the interface level, the following categories of checks were implemented:

• Routing: (i) each transmitted packet exits the router once and only once; (ii) each transmitted packet is output from the correct port (according to the header information); (iii) flits within a packet are kept in the correct order and are not interleaved with flits from other packets.
• Credit-based protocol: (i) when a flit is read from the Input Buffer, a credit is sent back by the router; (ii) the valid signal is high on an output port when a significant flit is transmitted.
• Network Layer Header: FBA bit management (the FBA bits of a packet are correctly updated when it exits the router).
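As an illustration of how the routing checks (i)-(iii) and a basic credit rule can be expressed in a scoreboard, the following Python sketch may help. It is not the e code of the Router eVC: the class names, the packet identifiers, the port labels and the choice of tracking a single flit stream per output port (virtual networks are ignored) are all simplifying assumptions made here.

```python
class Scoreboard:
    """Checks routing rules (i)-(iii) on the flits observed at the outputs."""

    def __init__(self):
        self.expected = {}   # pkt_id -> (expected port, expected flits)
        self.open = {}       # port -> (pkt_id in flight, flits seen so far)
        self.errors = []

    def packet_in(self, pkt_id, out_port, flits):
        """Record a packet injected into the DUT and where it should exit."""
        self.expected[pkt_id] = (out_port, list(flits))

    def flit_out(self, port, pkt_id, flit, last):
        """Check one flit observed on an output port."""
        if pkt_id not in self.expected:
            # rule (i): a packet must not exit twice or appear from nowhere
            self.errors.append(f"pkt {pkt_id}: unknown or already delivered")
            return
        exp_port, exp_flits = self.expected[pkt_id]
        if port != exp_port:
            # rule (ii): wrong output port
            self.errors.append(f"pkt {pkt_id}: left on {port}, expected {exp_port}")
        cur_id, seen = self.open.setdefault(port, (pkt_id, []))
        if cur_id != pkt_id:
            # rule (iii): flits of two packets interleaved on one port
            self.errors.append(f"port {port}: pkt {pkt_id} interleaved with {cur_id}")
            return
        seen.append(flit)
        if last:
            if seen != exp_flits:
                # rule (iii): flit order (and content) must be preserved
                self.errors.append(f"pkt {pkt_id}: flits reordered or corrupted")
            del self.open[port]
            del self.expected[pkt_id]

    def end_of_test(self):
        """Return all errors, including packets that never left (rule (i))."""
        lost = [f"pkt {p}: never left the router" for p in self.expected]
        return self.errors + lost


class CreditCounter:
    """Sender-side view of credit-based flow control on one link."""

    def __init__(self, buffer_depth):
        self.depth = buffer_depth
        self.credits = buffer_depth

    def on_valid(self):
        """A flit is sent; legal only if a credit is available."""
        if self.credits == 0:
            return False
        self.credits -= 1
        return True

    def on_credit(self):
        """A credit comes back; illegal if it exceeds the buffer depth."""
        if self.credits == self.depth:
            return False
        self.credits += 1
        return True
```

A test would call `packet_in` for every packet driven by a master agent and `flit_out` for every flit sampled by a slave-side monitor; an empty list returned by `end_of_test` means that none of the three routing rules was violated in that run.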

The different BFMs may be configured to generate different kinds of traffic scenarios so as to reach the desired functional coverage. For example, by properly configuring a test file, it is possible to generate packets that are sourced from 3 ports and all have the same destination port; this is useful for stressing arbiters and output queues as well as for hitting some corner-case coverage points. The developed environment discovered a number of bugs that could not be found with hand-written HDL testbenches.

To achieve full code and functional coverage by exercising some corner cases, for some Router configurations a deeper level of checking has been implemented by means of internal probes (i.e. by monitoring internal DUT signals). Indeed, while some DUT functionalities, such as data integrity from one port to another, can easily be checked and covered without knowledge of timing, the check rules for other DUT functionalities depend on the timing of what is happening on the various ports. As an example, the buffer status (empty/full) depends on the rate at which packets are injected into the DUT, besides the destination of those packets; likewise, the correct behavior of an arbitration algorithm depends on the timing with which the different packets accessing the same resource are served by the Router.




Therefore, to achieve 100% functional verification, the basic coverage-driven approach should be enhanced either through the design of a software golden model able to predict timing-dependent properties (a very time-consuming approach) or alternatively, as we have done in this work, by using internal probes to rapidly obtain detailed information about the operation of internal router blocks and state machines. An internal probe monitors a hardware signal within the router: for example, monitoring the inputs of an arbiter block makes it possible to gather extensive coverage information about arbitration scenarios and about the successful application of a specific arbitration algorithm. The drawback of this approach is that using probes requires a deep knowledge of the Router VHDL implementation and is a hard-to-reuse solution. Figure 8.7 shows the UML diagram of the probes portion of the Router eVC. Thanks to these additional eVC units, some internal arbitrations and QoS mechanisms have been verified, such as the LRU arbitration of the

Fig. 8.7 UML diagram of the probes portion of the Router eVC (units shown: router_internals_u, link_scheduler_u, vn_layer_u, output_queue_u, arbiter_type_u, input_buffer_u; router_internals_u holds a list of link_scheduler_u and a list of vn_layer_u instances, and each vn_layer_u lists output_queues[1..4], arbiters[1..4] and input_buffers[1..4])




link scheduler; furthermore, this allowed for the implementation of additional coverage points related to arbiters and internal buffers.

Note that many checks must be performed independently for each port, so the eVC architecture needs to be highly modular and configurable. All the checks and coverage points may be selectively enabled to meet various user requirements. For example, if there is no interest in testing queue utilization, simulations can be sped up by disabling the internal probes portion of the eVC (which works with cycle-level accuracy), leaving only the transaction-level architecture. It is also possible to disable a single check; for example, by disabling the routing check the Router eVC can be used as a traffic monitor on a single US or DS bus. To conclude the verification flow, several coverage points have been defined to describe all possible traffic scenarios that can stress the Router; they may be conceptually grouped into the three main categories reported in Table 8.1: traffic flow, queue utilization and arbitration mechanisms. Some coverage points (such as the ones in queue utilization) are available only as part of the internal probes eVC extension, because some information can easily be retrieved only by accessing the micro-architecture of some blocks.
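The idea of coverage groups that can be selectively enabled, together with the "run until no holes are left" loop, can be sketched as follows. This is a Python illustration under stated assumptions, not the coverage plan of Table 8.1: the bin definitions, the port labels and the function names are hypothetical, and the queue occupancy is drawn at random here only as a stand-in for the values that internal probes would report.

```python
import random

PORTS = ("R", "L", "A", "NI")          # illustrative port labels
QUEUE_STATES = ("empty", "partial", "full")


class CoverageGroup:
    """A named set of bins; a bin with zero hits is a coverage hole."""

    def __init__(self, name, bins, enabled=True):
        self.name = name
        self.enabled = enabled
        self.hits = {b: 0 for b in bins}

    def sample(self, value):
        if self.enabled and value in self.hits:
            self.hits[value] += 1

    def holes(self):
        if not self.enabled:
            return []                  # disabled groups never block closure
        return [b for b, n in self.hits.items() if n == 0]


def build_plan(use_probes=True):
    """Build a toy plan; the probe-based group can be switched off."""
    pairs = [(s, d) for s in PORTS for d in PORTS if s != d]
    return {
        "traffic_flow": CoverageGroup("traffic_flow", pairs),
        "queue_utilization": CoverageGroup(
            "queue_utilization", QUEUE_STATES, enabled=use_probes),
    }


def run_until_covered(plan, seed=0, max_packets=10_000):
    """Generate random traffic until no enabled group has holes.

    Returns the number of packets needed, or None if the budget ran out.
    """
    rng = random.Random(seed)
    for n in range(1, max_packets + 1):
        src, dst = rng.sample(PORTS, 2)          # distinct source and destination
        plan["traffic_flow"].sample((src, dst))
        # Placeholder for a probe reading of an output queue.
        plan["queue_utilization"].sample(rng.choice(QUEUE_STATES))
        if not any(group.holes() for group in plan.values()):
            return n
    return None
```

Calling `run_until_covered(build_plan())` stops as soon as every enabled bin has been hit at least once; with `build_plan(use_probes=False)` the queue-utilization group is ignored, mirroring the option of disabling the probes portion of the eVC.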

8.5 Results and Conclusions

The developed eVC has been used to test several router configurations; enough test cases have been implemented to achieve 100% functional and code coverage for all points defined in the coverage plan. Table 8.1 shows some details about the coverage points in the plan with their corresponding range (i.e. the values that need to be induced in the DUT) and the result achieved. Table 8.1 also highlights the coverage points for which probes have been used. After the time spent developing the eVC software, these results were achieved in a relatively short time thanks to the constrained-random traffic generation.

After functional and code verification by simulation, the effectiveness of the verification flow was also demonstrated by several STNoC platforms implemented on FPGA, which were successfully emulated in real-life scenarios where multiple ARM11 processors and memory modules were connected through the Spidergon STNoC.

The Router eVC has also been used as a component for the functional verification of platforms involving several NIs and Routers; different eVCs, all developed according to the same paradigm, were put together to obtain a complex platform verification environment.

Finally, it is worth noting that the Spidergon STNoC architecture has been chosen as the inter-tile interconnect of the SHAPES MPSoC European project [17, 19, 20] involving ATMEL, University of Pisa and STMicroelectronics. In SHAPES, 8 identical IP tiles building a complex scalable multiprocessor are interconnected by means of a packet-switched Spidergon STNoC network. A typical SHAPES tile contains a VLIW floating-point DSP, a RISC processor based on the ARM926 core, a Distributed Network Processor (DNP) for extra-tile communication, and includes the interface to the NoC (NI), on-chip memories and a set of peripherals for off-chip communication. The back-end of the 8-tile SHAPES architecture in 45 nm CMOS technology has been successfully realized.

This work has extended the WISES 2009 conference paper [21].

Table 8.1 Implemented coverage groups and test results

| Category | Coverage description | Range | Results |
|---|---|---|---|
| Traffic flow | Routing paths | Routing paths between each pair of ports | 100% (8 groups, 1 per port, per VN) |
| Traffic flow | Packet length (2 levels cross, combination of routing paths, number of flits in a packet) | 1–10 | 100% (8 cross groups, 1 per port, per VN) |
| Traffic flow | Routing fields in packet header (4 levels cross, combination of DS ports, directions, destination ID^a) | All routing decision scenarios on all DS ports | 100% (2 cross groups, 1 per VN) |
| Queue utilization | Input Buffer utilization (probes) | Empty–Full | 100% (8 groups, 1 per port, per VN) |
| Queue utilization | Output Queue utilization (probes) | Empty–Full | 100% (8 groups, 1 per port, per VN) |
| Queue utilization | Bypass operation (probes) | Queue bypassed/not bypassed | 100% (8 groups, 1 per port, per VN) |
| Queue utilization | Credit availability (probes) | 0…5 | 100% (4 groups, 1 per US port) |
| Arbitration mechanisms | Fair Bandwidth Allocation (FBA) (3 levels cross cover: combination of US port, FBA status of incoming packet, FBA status of outgoing packet) | FBA transitions (0→0, 0→1, 1→0, 1→1) | 100% (8 cross groups, 1 per port, per VN) |
| Arbitration mechanisms | Concurrent requests scenarios (probes) | All possible combinations of concurrent packet requests to the arbiters | 100% (8 groups, 1 per VN) |

^a The directions and the destination ID are network layer header fields which are encoded in the head of each packet and are used by the Router to take routing decisions
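As an illustration of how constrained-random generation closes a cross coverage group such as the "Packet length" entry of Table 8.1, the following Python sketch draws random packets until every combination of routing path and packet length has been seen. It is a simplified model, not the eVC itself: the four US and four DS ports, the legality of every US-to-DS pair, the uniform distribution and the function name `run_until_covered` are all assumptions, and coverage is sampled on the generated items rather than on a monitored DUT.

```python
import itertools
import random

PORTS = range(4)          # assumed: 4 US ports and 4 DS ports
LENGTHS = range(1, 11)    # packet length in flits, range 1-10 as in Table 8.1


def run_until_covered(seed: int = 0, max_packets: int = 100_000):
    """Generate random packets until the cross coverage goal is met.

    Returns the number of packets generated and the coverage ratio reached.
    """
    rng = random.Random(seed)
    # Cross coverage goal: every (US port, DS port, length) combination.
    goal = set(itertools.product(PORTS, PORTS, LENGTHS))
    hit = set()
    sent = 0
    while hit != goal and sent < max_packets:
        # Constrained-random stimulus: each field is drawn only from its
        # legal range, so every generated item is a valid coverage bin.
        item = (rng.choice(PORTS), rng.choice(PORTS), rng.choice(LENGTHS))
        hit.add(item)
        sent += 1
    return sent, len(hit) / len(goal)


packets_sent, coverage = run_until_covered()
print(f"{packets_sent} packets, coverage {coverage:.0%}")
```

The loop stops as soon as the coverage goal is reached, which is the basic idea behind coverage-driven closure: the stimulus is random, while the coverage model decides when enough traffic has been generated.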

References

1. Bhadra J, Abadir MS, Ray S, Wang L-C (2007) A survey of hybrid techniques for functional verification. IEEE Des Test Comput 24(2):112–122
2. Bartley MG, Galpin D, Blackmore T (2002) A comparison of three verification techniques: directed testing, pseudo-random testing and property checking. In: Proc. 39th Design Automation Conf. (DAC02), ACM Press, New York, pp 819–823
3. OSCI (2003) SystemC Verification standard specification version 1.0e. http://www.systemc.org, May 2003
4. Yuan J, Shen J, Abraham J, Aziz A (1997) On combining formal and informal verification. In: Proc. Int'l Conf. Computer-Aided Verific., LNCS 1254, Springer, Heidelberg, pp 376–387
5. Sumners R, Bhadra J, Abraham J (2000) Automatic validation test generation using extracted control models. In: Proc. IEEE Int'l Conf. VLSI Design, pp 312–320
6. Eghbal A et al (2009) Fault injection-based evaluation of a synchronous NoC router. In: Proc. IEEE IOLTS'09, pp 212–214
7. Mariani R, Boschi G (2007) A systematic approach for Failure Modes and Effects Analysis of System-On-Chips. In: Proc. IEEE IOLTS'07, pp 187–188
8. Berman V (2005) An update on IEEE P1647: the e system verification language. IEEE Des Test Comput 22(5):484–486
9. Murphy G, Schwanninger C (2006) Guest editors' introduction: aspect-oriented programming. IEEE Softw 23(1):20–23
10. Palnitkar S (2003) Design verification with e. Prentice Hall, Upper Saddle River
11. Al-Badi R et al (2009) A parameterized NoC simulator using OMNet++. In: Proc. IEEE ICUMT'09, pp 1–7
12. Wen H-H et al (2009) Design of an on-line configurable traffic generator for NoC. In: Proc. IEEE ASID, pp 556–559
13. Benini L, De Micheli G (2002) Networks on chip: a new SoC paradigm. IEEE Comput 35(1):70–78
14. Muttersbach J, Villiger T, Fichtner W (2000) Practical design of globally-asynchronous locally-synchronous systems. In: IEEE ASYNC, pp 52–59
15. Gyu Lee H et al (2007) On-chip communication architecture exploration: a quantitative evaluation of point-to-point, bus, and network-on-chip approaches. ACM Transactions on Design Automation of Electronic Systems 12(3)
16. Grammatikakis MD, Coppola M, Maruccia G, Locatelli R, Pieralisi L (2008) Design of cost-efficient interconnect processing units: Spidergon STNoC. CRC Press, Boca Raton
17. Vitullo FM, L'Insalata NE, Petri E, Saponara S, Fanucci L, Casula M, Locatelli R, Coppola M (2008) Low-complexity link microarchitecture for mesochronous communication in networks on chip. IEEE Trans Comput 57:1196–1201
18. Rahman M et al (2009) Efficient 2DMesh Network on Chip (NoC) considering GALS approach. In: Proc. IEEE ICCIT, pp 841–846
19. Paolucci PS, Lo Cicero F, Lonardo A, Perra M, Rossetti D, Sidore C, Vicini P, Coppola M, Raffo L, Mereu G, Palumbo F, Fanucci L, Saponara S, Vitullo F (2007) Introduction to the tiled HW architecture of SHAPES. Proc Int Conf Des Autom Test Eur 1:77–82




20. Paolucci PS, Jerraya A, Leupers R, Thiele L, Vicini P (2006) SHAPES: a tiled scalable software hardware architecture platform for embedded systems. In: Proc. Fourth Int'l Conf. Hardware/Software Codesign and System Synthesis, pp 167–172
21. Saponara S et al (2009) A reusable coverage-driven verification environment for Network-on-Chip communication in embedded system platforms. In: IEEE WISES 2009, pp 71–77


