Metadata of the chapter that will be visualized in OnlineFirst
Book Title: Solutions on Embedded Systems
Series Title: 7818
Chapter Title: Coverage-Driven Verification of HDL IP Cores
Copyright Year: 2011
Copyright Holder Name: Springer Science+Business Media B.V.
Corresponding Author: S. Saponara, Department of Information Engineering, Universita di Pisa, Pisa, Italy. Email: [email protected]
Author: F. Vitullo, Department of Information Engineering, Universita di Pisa, Pisa, Italy
Author: E. Petri, Consorzio Pisa Ricerche - Electronic Systems and Microelectronics Division, Pisa, Italy
Author: L. Fanucci, Department of Information Engineering, Universita di Pisa, Pisa, Italy
Author: M. Coppola, AST Grenoble Lab, STMicroelectronics, Grenoble, France
Author: R. Locatelli, AST Grenoble Lab, STMicroelectronics, Grenoble, France
Abstract This chapter addresses the problem of functional verification of IP cores to be integrated in complex embedded systems. After analyzing the limits of methods based on HDL testbenches or formal verification, a pseudo-random coverage-driven approach is presented (verification environment design guidelines together with a final coverage report summary) and applied to a novel Router IP core design, a key component of Network-on-Chip communication infrastructure in embedded systems.
Keywords (separated by '-') Network-on-Chip - Coverage-driven functional verification - Intellectual Property cores - Multi-Processor System-on-Chip
Chapter 8
Coverage-Driven Verification of HDL IP Cores
Case Study of a Router for Network-on-Chip Communication in Embedded Systems
S. Saponara, F. Vitullo, E. Petri, L. Fanucci, M. Coppola and R. Locatelli
8.1 Introduction
The progress of nanometric CMOS technologies and design methodologies has fostered the development of complex digital designs, thanks to the capability of integrating an increasing number of IP cores within a single chip. This trend has made one typical issue of the embedded and digital systems design flow more and more critical: the exhaustive functional verification of complex systems. As system complexity grows, so does the verification effort, which has two crucial points: (i) generating proper testing scenarios that stress key features of the DUT (Design Under Test) and (ii) determining the number of different tests needed to reach enough coverage (code and functional) to assert that the DUT is bug free w.r.t. the foreseen utilization scenarios. Code coverage is about ensuring that all parts of the RTL netlist (statements, expressions, branches, finite state machine states, block instantiations) have been stimulated by the test vectors; automatic measurement of the achieved code coverage is already supported by recent EDA tools for HDL IP simulation. Functional verification, instead, requires a major verification engineering effort [1–5]. Indeed, functional verification is about (i) capturing the functional behaviour of the DUT from the specification document and (ii) validating that the DUT behaviour is consistent with its specification for all possible working scenarios (input traffic and internal IP state). Note that code and
S. Saponara (corresponding author) · F. Vitullo · L. Fanucci
Department of Information Engineering, Universita di Pisa, Pisa, Italy
e-mail: [email protected]
E. Petri
Consorzio Pisa Ricerche - Electronic Systems and Microelectronics Division, Pisa, Italy
M. Coppola · R. Locatelli
AST Grenoble Lab, STMicroelectronics, Grenoble, France
Layout: T1 Standard SC Book ID: 194217_1_En Book ISBN: 978-94-007-0637-8Chapter No.: 8 Date: 18-1-2011 Page: 1/15
M. Conti et al. (eds.), Solutions on Embedded Systems, Lecture Notes in Electrical Engineering 81, DOI: 10.1007/978-94-007-0638-5_8, © Springer Science+Business Media B.V. 2011
functional verification, on which this work focuses, is the first step of the whole testing flow, which entails further steps: IP core emulation on prototyping platforms (typically based on FPGA), back-end gate-level functional/timing verification and, finally, validation and characterization of the implemented integrated circuit. Failure mode analysis and fault-robustness verification [6, 7] (particularly important for designs conceived for harsh environment applications) are orthogonal methods not covered in this work. Today the major part of the development time and cost for a new IP core is spent on verification rather than on HDL design. To reduce development time and design cost, reusability, configurability and scalability of the functional verification environment play a crucial role, also for subsequent verification steps.

Traditional verification techniques based on directed testbenches (where the test traffic is typically hand-written in HDL as a sequence of input vectors; output vectors are calculated a priori and then matched with the ones monitored from the DUT) or on formal proofs are inefficient when dealing with complex designs made up of multiple heterogeneous IP cores [1–5]. Directed testbenches are applied to the DUT by means of simulation; they have a poor level of automation since most testing traffic scenarios are hand-written; even when higher-level programming languages (such as C++ or SystemC) are used to abstract the set of significant stimuli, the problem of checking (i.e. capturing the DUT outputs and establishing whether they are correct or not) remains to be solved. When directed testbenches are used, output checking tends to be simpler since the user knows what to expect from the DUT; still, this approach is time consuming and does not scale to complex designs.
Formal verification techniques, on the other hand, are not based on simulation; instead, the verification engineer tries to extract deterministic laws and relationships internal to the DUT by exploiting its HDL description. Then, under some hypotheses (typically represented by some set of stimuli) and with the help of an analysis tool, the formal verification approach tries to prove one or more theorems (for those stimuli, the DUT always behaves in the desired manner). This approach does not need simulation (though symbolic simulation, which combines formal techniques with standard simulation, has recently been introduced [1]) and is general enough to also treat corner cases. However, formal verification has proved too complex for medium or large sized designs because the set of properties that the verification engineer needs to formally demonstrate is huge; the well known state explosion problem limits model checking, and the cost of theorem proving is prohibitive because of the amount of skilled manual guidance it requires. The limitations of these verification techniques are the lack of reusability and the excessive amount of time, compared to Time-To-Market needs, required to provide a good confidence level that the design is bug free.

To solve the above issues, a coverage-driven methodology for functional verification based on pseudo-random simulations is discussed in Sect. 8.2 and applied to the case study of a Router IP core for Spidergon NoC communication. Section 8.3 briefly describes the NoC approach and presents our novel Spidergon STNoC. Section 8.4 first describes the functionalities of the new Spidergon
STNoC Router IP and its implementation results in 65 nm CMOS technology and then discusses how the verification methodology was customized and applied to it. Section 8.5 comments on the achieved results and draws some conclusions.
8.2 Reusable Methodology for the Functional Verification of Platforms
Though many hybrid verification techniques have been explored, trying to combine the strengths of different approaches, an emerging approach for functional verification is constrained-random (or pseudo-random) simulation. It is already partially supported by tools and languages such as SystemVerilog or the aspect-oriented [8–10] programming language e (recently formalized in the IEEE 1647 standard) with Specman by Cadence. The Specman tool is the one used in the case study of this work. The basic idea we exploited for functional verification is to build an eVC (Verification Component in the e language) starting from the DUT specifications and ending up with software able to perform the following tasks: generating user-defined traffic patterns to be driven into the DUT; monitoring the DUT outputs and checking them according to the rules programmed in the eVC; mapping the collected outputs onto a functional coverage scheme to let the user understand whether all possible cases have been exercised. The latter is very important and enables coverage-driven verification: the user continues developing tests and running simulations until there are no holes left in the defined functional coverage plan. Therefore, the design of pattern generators, addressed in the literature also for the target NoC case study [11, 12], is just one of the steps of a complete verification methodology and on its own is not enough to achieve full functional coverage. Figure 8.1 shows the conceptual organization we followed to design an eVC around the DUT. Note that,
Fig. 8.1 Activities in a functional verification environment
starting from the DUT specifications, not only the eVC architecture but also a test plan and a coverage plan have to be defined.

The philosophy underlying eVCs differs significantly from traditional verification methodologies. Rather than using thousands of directed tests, eVCs employ automatic generation and a coverage-driven methodology. Using automated scenario generation, eVCs can typically achieve higher coverage of the protocol than hand-written tests. For instance, in our work 100% code coverage and functional coverage were achieved by enhancing the test base with a few directed tests, derived after coverage analysis, which exercised the remaining corner cases. To this aim, besides eVC development, some HDL code and scripts were also developed, and HDL probes were defined to increase the observability and controllability of the netlist. An eVC for a given IP core or a given protocol can be thought of as a final product: it does not need to be rewritten from project to project, allowing for significant reuse. By following proper coding guidelines and design best practices, the eVC of an IP core can be easily extended, reused and integrated as part of bigger eVCs for more complex designs when moving from module-level to system-level verification. This kind of approach to functional verification overcomes the limitations of traditional verification techniques and improves time-to-market. Figure 8.2 shows the main eVC blocks that have to be designed to properly stimulate and check the DUT. The env block represents an instance of the entire verification environment. The Config unit inside it is the user front-end for the configuration of the environment's attributes and behavior. For each port of the interface, the eVC typically implements an agent, instantiating it in the environment. Agents can emulate the behavior of a legal
[Figure 8.2 annotations]
• Agents are the main modules of an eVC.
• The Signal map allows the agent to bind to the DUT's RTL (to drive and read signals).
• Adapt the agent to different verification and/or DUT needs.
• Monitor capabilities are common to Passive and Active agents.
• Coverage definitions.
• Checker module: what the DUT must/must not do.
• A Sequence is a parameterized set of stimuli to be sent to the DUT: a bus transaction or burst, an instruction to a CPU, an error injection, […]
• The Bus Functional Model actually drives the stimuli to the DUT's RTL.
• Sequences may be composed to easily build complex scenarios.
Fig. 8.2 Building blocks of an eVC architecture
device, and they have a standard construction and functionality; they are usually bound to a sub-set of the DUT port map and are built on top of building blocks that implement agent-specific functionality:
• Config: a group of fields that allow configuration of the agent's attributes and behavior.
• Signal Map: a unit that contains external ports for each of the HW signals that the agent must access as it interacts with the DUT.
• Sequence Driver: a unit instance that serves as a coordinator for running user-defined tests; traffic patterns are implemented as sequences (seq).
• BFM (Bus Functional Model): a unit instance that interacts with the DUT and both drives and samples the DUT signals.
• Monitor: a unit instance that passively monitors (samples) the DUT signals and supplies interpretation of the monitored activity to the other components of the agent. Monitors can emit events when they notice interesting things happening in the DUT or on the DUT interface. They can also check for correct behavior or collect coverage data.
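The division of labour among the agent building blocks listed above can be illustrated with a minimal Python sketch. It is not the e/Specman code used in this work: all class and function names (Packet, SequenceDriver, BFM, Monitor, Agent, toy_dut) are hypothetical, the DUT is replaced by a plain Python function, and the BFM simply hands the observed output to the monitor instead of sampling signals.

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical stimulus item: destination port plus payload flits."""
    dest: int
    flits: list


class SequenceDriver:
    """Coordinates traffic generation; a sequence is a parameterized set of stimuli."""

    def __init__(self, seed, num_ports):
        self.rng = random.Random(seed)  # seeded, so a test run is reproducible
        self.num_ports = num_ports

    def sequence(self, count, max_flits=4):
        for _ in range(count):
            length = self.rng.randint(1, max_flits)
            yield Packet(
                dest=self.rng.randrange(self.num_ports),
                flits=[self.rng.getrandbits(8) for _ in range(length)],
            )


class Monitor:
    """Passively records what leaves the DUT and collects simple coverage data."""

    def __init__(self):
        self.observed = []
        self.ports_seen = set()

    def sample(self, port, packet):
        self.observed.append((port, packet))
        self.ports_seen.add(port)


class BFM:
    """Drives stimuli into the DUT (here a Python function standing in for RTL)."""

    def __init__(self, dut, monitor):
        self.dut = dut
        self.monitor = monitor

    def drive(self, packet):
        out_port, out_packet = self.dut(packet)
        self.monitor.sample(out_port, out_packet)


def toy_dut(packet):
    """Stand-in DUT (assumption): forwards each packet unchanged to its destination."""
    return packet.dest, packet


class Agent:
    """Binds sequence driver, BFM and monitor together for one interface."""

    def __init__(self, dut, seed=0, num_ports=4):
        self.monitor = Monitor()
        self.bfm = BFM(dut, self.monitor)
        self.driver = SequenceDriver(seed, num_ports)

    def run(self, count):
        for packet in self.driver.sequence(count):
            self.bfm.drive(packet)
        return self.monitor


if __name__ == "__main__":
    monitor = Agent(toy_dut, seed=1).run(100)
    print(len(monitor.observed), sorted(monitor.ports_seen))
```

Keeping generation, driving and monitoring in separate units is what lets the same monitor be reused in a passive agent, where no stimuli are driven.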
8.3 Spidergon NoC Communication Design
NoC is an emerging design paradigm for building scalable packet-switched communication infrastructures connecting hundreds of IP cores. Its use in MPSoCs, which strongly depend on the on-chip communication architecture, will overcome the scalability limitations of traditional solutions like point-to-point communication and centralized on-chip buses [13–16]. In fact, as the number of components grows, traditional interconnect systems can degrade the global system performance in terms of area, power or throughput [15]. Beyond that, the floorplanning of long communication wires in the presence of so many IP cores is also very problematic because bad wire routing lowers circuit timing performance. NoCs adopt some of the networking ISO-OSI abstraction layers (physical, data-link, network, transport) to decouple the design of the IP cores from the physical implementation of the interconnect infrastructure, thus speeding up the design flow and increasing scalability and reusability, see Fig. 8.3. A NoC is a distributed network consisting of some main building blocks, the most important being the Network Interface (NI) and the Router (R). The way these blocks are interconnected determines the NoC topology (e.g. 2D mesh and Ring have been proposed in the past), which is responsible for packet switching efficiency between the start and the end point of communication and must be designed to avoid traffic congestion. The Quality of Service (QoS) applied to the routing policy of packets is also a key element.

The NoC infrastructure we designed, called Spidergon STNoC, is based on the vertex-symmetric topology shown in Fig. 8.3b (note that the Spidergon topology includes the Ring one). In our design, Routers are in charge of delivering packets towards the destination IP while providing buffering and QoS services by means of
traffic arbitration policies and link scheduling on packets crossing their network. All issues related to security, error management, and clock frequency, bus size and protocol conversions are managed at the network boundaries by the NIs. Indeed, the NIs are the peripheral building blocks of the NoC, decoupling computation from communication; they provide protocol abstraction by encoding in the packet header all the data needed to guarantee successful end-to-end data delivery between cores (transport layer) and all the QoS information needed by the router at network layer. Figure 8.4 shows the format of the STNoC packet, carrying header and payload data which are physically split into header and payload flits; they are all routed through the same path across the network.

Such an architecture avoids many of the problems affecting older communication systems. Since it is a distributed network, data packets move from one building block to another without ever being carried by long physical wires, which are instead split and distributed along the network. Each NI collects traffic from the core it is connected to, independently from the others, and then converts such traffic into packets, sending them to the network of routers, where they move along
Fig. 8.4 Spidergon STNoC packet format
[Figure 8.3b labels: Spidergon Platform with eight Routers (R), each connected through a Network Interface (NI) to an IP core]
Fig. 8.3 a Internet ISO-OSI layers and mapping onto NoC, b Spidergon STNoC architecture
different paths. This allows parallel communication flows, which was not possible with shared on-chip bus lines. The Spidergon STNoC is also suitable for implementing a globally-asynchronous-locally-synchronous (GALS) communication paradigm [17, 18]. First of all, this means that long global clock wiring is not needed; furthermore, each part of the SoC can be fed by a different clock source. It is the task of the distributed network to synchronize one clock domain with another by means of special modules in its building blocks. In this scenario the distributed network decouples the working frequencies of the plugged IP cores/sub-systems. Different kinds of links (synchronous, mesochronous and asynchronous) have been designed for the Spidergon STNoC [17]. These benefits do not come for free: the price to pay is the complexity of the network. Indeed, it is important that in the whole computing system the NoC contributes only a small percentage of the overall area and power consumption. Thus the Spidergon STNoC has been conceived to be scalable, depending on the surrounding computing infrastructure, in terms of topology, number of building blocks and implemented services. The Spidergon STNoC has also been designed to provide compatibility with established bus standards largely adopted in IP cores such as RISC processors, microcontrollers and DSPs (e.g. STBus and AXI NIs have been designed). This feature ensures a smooth transition from old bus-centric systems to MPSoCs with the novel NoC interconnect.
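To make the Spidergon topology of Fig. 8.3b concrete, the following Python sketch computes the neighbours of a node, assuming the usual Spidergon construction: an even number of nodes on a bidirectional ring, plus one link across the ring to the diametrically opposite node. The function name, the node numbering and the choice of which ring direction is called Right are assumptions made for illustration only.

```python
def spidergon_neighbors(node, num_nodes):
    """Return the Right, Left and Across neighbours of `node`.

    Assumes nodes are numbered 0..num_nodes-1 along the ring, that Right
    means increasing index, and that num_nodes is even so that every node
    has exactly one opposite node.
    """
    if num_nodes < 4 or num_nodes % 2 != 0:
        raise ValueError("expected an even number of nodes, at least 4")
    if not 0 <= node < num_nodes:
        raise ValueError("node index out of range")
    return {
        "R": (node + 1) % num_nodes,
        "L": (node - 1) % num_nodes,
        "A": (node + num_nodes // 2) % num_nodes,
    }


if __name__ == "__main__":
    for n in range(8):
        print(n, spidergon_neighbors(n, 8))
```

Dropping the "A" entry leaves a plain ring, which is the sense in which the Spidergon topology includes the Ring one.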
8.4 Verification Environment for the Spidergon STNoC Router IP
The functional verification of the Spidergon STNoC building blocks has been carried out by building a coverage-driven simulation environment. In this case study, the DUT is a Spidergon STNoC Router IP core, but the same approach has been followed for NIs and links. In this Section, first the functional features of the designed Router and its implementation results in submicron CMOS technology are described, and then the design of the verification environment is presented.

The Router architecture has been defined according to a parametric and modular approach using the VHDL language. The Spidergon router, see Fig. 8.5, can be connected through two unidirectional links with three other routers in the directions Right (R), Left (L) and Across (A), plus a fourth connection to the local Network Interface, used as the network entry/exit point. The physical link consists of two unidirectional data channels, Downstream (DS) and Upstream (US), with the relevant handshake signals to realize a credit-based hop-by-hop flow control (val and credit in Fig. 8.5). The router adopts wormhole packet switching, where a packet is subdivided into flits and all of them follow the same path reserved by the header. The routing algorithm is deterministic, so that the same path is always chosen between a source and a destination node, even if multiple paths exist. This choice avoids costly flit reordering at packet reception. The idea is to move along the ring, in the proper direction, to reach nodes which are near the source node,
using the Across link as first or last hop to jump to a part of the network that is too far away. The router uses a simple source-based routing: the entire path is encoded in the packet header, so each router just has to extract the forwarding information, without any need for computation or look-up tables. The routing scheme, along with a proper QoS scheduling policy, is free of starvation issues. The router also avoids deadlock by deploying Virtual Networks (VNs in Figs. 8.5 and 8.6, also
Fig. 8.5 S-STNoC router ports breakdown and Downstream (DS) and Upstream (US) channels with the relevant handshake signals
Fig. 8.6 Environment adopted for the Router functional verification
called Virtual Channels, VCs). VNs provide logical links over the same shared physical channels by establishing a number of independently allocated flit buffers in the corresponding transmitter/receiver nodes. Currently the request and response logical paths are implemented on top of two disjoint VNs, sharing the physical link bandwidth and maximizing wire efficiency. The parametric number of VNs supported by the router can enable advanced routing schemes or independent QoS traffic classes for real-time and low-latency flows. The credit-based flow control works on a per-flit basis. Flits can be sent in the US direction only if there are enough credits, i.e. the DS interface of the receiving component has enough free locations in its input buffer to store the incoming flits. Output Queues on US ports can be instantiated for enhanced performance, avoiding head-of-line blocking. Queues are shared among input flows to limit costly time/space speed-up factors, and they have a bypass feature to reduce the minimum router crossing latency under low traffic conditions. The architecture also supports not instantiating the Output Queue for low-cost implementations, when performance or traffic types do not require output buffering. Optionally, a separate Output Queue can be instantiated for each input port directed to that output. This configuration increases global network performance when a lot of traffic is concentrated towards the considered output. The applied QoS mechanism is Fair Bandwidth Allocation (FBA). It allows for a flexible, scalable and low-cost management of the allocation of the available bandwidth. The requested bandwidth value is programmed at the injection point (Network Interface) and is not explicitly linked to the path of a data flow through the router as in other NoC architectures.
It avoids complexity inside the router by providing all the necessary information in the network header and limiting the router behavior to a simple two-step arbitration. When all data flows have the same bandwidth reservation, the arbitration algorithm becomes one of the following: Round Robin (RR), Least Recently Used (LRU) or fixed priority, configurable by the user.

The router has been implemented for different configurations in different (90 nm, 65 nm and 45 nm) STMicroelectronics CMOS standard-cell technologies, always achieving an optimal trade-off between performance and complexity. As an example, in 65 nm 1.1 V standard-cell CMOS technology, a full Router configuration with all 4 ports enabled (Spidergon topology), each with 2 VNs, a flit size of 72 bits on VN1 (request path) and 64 bits on VN2 (response path), input buffers (IB) and output queues (OQ) able to store 4 and 5 flits respectively, LRU arbitration and FBA management, has a circuit complexity of roughly 70 Kgates (including the flip-flop implementation of the IB and OQ memory resources). It achieves a clock frequency of 500 MHz, i.e. at least 32 Gbps data transfer for US and DS channels, with a low-leakage library version ensuring a static power consumption of less than 100 µW. By using a standard-cell library version optimized for high speed, with the same IP configuration and CMOS technology node, clock frequencies up to 1 GHz are met, i.e. up to 64 Gbps data transfer per channel, but with an increased static power of 1 mW. Obviously, different results are achieved by changing the Router configuration: as an example, a basic Router with 3 ports (Ring topology without the Across link), 36-bit flits, 1 VN,
no OQs instantiated, has a circuit complexity lower than 9 Kgates and achieves a clock frequency up to 1 GHz, i.e. at least 36 Gbps data rate, with a static power consumption of roughly 200 µW (the static power is 10 µW if targeting a 500 MHz frequency).

Proper component operation should be assessed for every Router configuration. However, only a subset of all possible configurations has actually been used for assembling platforms to synthesize. For such Router configurations a full regression set of test simulations has been carried out to check correct component operation when stressed with several different traffic scenarios. The DUT of the simulation was the single Router block. Figure 8.6 shows a sample Router DUT surrounded by the corresponding verification environment; in this example, a 4-port Router with 2 VNs and both US/DS directions on each port is considered. Each Router port has its own BFM units which drive and monitor traffic (master agents are connected to DS ports and slave agents are connected to US ports). All BFM units are connected to both the monitor unit and the checker unit. The former is in charge of protocol checking and data coverage; the latter implements a scoreboard for checking correct routing and other traffic properties. At the interface level, the following categories of checks were implemented:
• Routing: (i) each transmitted packet exits once and only once; (ii) each transmitted packet is output from the correct port (according to the header information); (iii) flits within a packet are kept in the correct order and are not interleaved with flits from other packets.
• Credit-based protocol: (i) when a flit is read from the Input Buffer, a credit is sent back by the router; (ii) the valid signal is high on an output port when a significant flit is transmitted.
• Network Layer Header: FBA bit management (the FBA bits of a packet are correctly updated when it exits the router).
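The three routing checks listed above can be sketched as a small scoreboard. This is a hedged Python illustration, not the checker unit of the Router eVC: the Scoreboard class, its method names and the use of a packet identifier are hypothetical, and the sketch ignores Virtual Networks, so it treats any two packets alternating on one output port as interleaved.

```python
class Scoreboard:
    """Tracks expected packets and flags routing violations as flits are observed."""

    def __init__(self):
        self.expected = {}      # packet id -> (expected output port, list of flits)
        self.progress = {}      # packet id -> number of flits observed so far
        self.open_on_port = {}  # output port -> id of the packet currently in flight
        self.errors = []

    def add_expected(self, pkt_id, dest_port, flits):
        """Called when a packet is injected into the DUT."""
        self.expected[pkt_id] = (dest_port, list(flits))

    def observe_flit(self, port, pkt_id, flit):
        """Called for every flit seen on an output port."""
        if pkt_id not in self.expected:
            # Check (i): a packet must not exit more than once.
            self.errors.append(f"unexpected or duplicated packet {pkt_id}")
            return
        dest_port, flits = self.expected[pkt_id]
        if port != dest_port:
            # Check (ii): the packet must leave from the port named in its header.
            self.errors.append(f"packet {pkt_id} on port {port}, expected {dest_port}")
        in_flight = self.open_on_port.get(port)
        if in_flight is not None and in_flight != pkt_id:
            # Check (iii): no interleaving with flits of another packet.
            self.errors.append(f"packet {pkt_id} interleaved with {in_flight} on port {port}")
        self.open_on_port[port] = pkt_id
        index = self.progress.get(pkt_id, 0)
        if flits[index] != flit:
            # Check (iii): flits must keep their original order.
            self.errors.append(f"packet {pkt_id}: flit {index} out of order")
        self.progress[pkt_id] = index + 1
        if self.progress[pkt_id] == len(flits):
            # Packet complete: retire it so a second copy is reported as a duplicate.
            del self.expected[pkt_id]
            del self.progress[pkt_id]
            self.open_on_port[port] = None

    def final_check(self):
        """Check (i): every injected packet must have exited completely."""
        missing = [f"packet {pkt_id} never exited completely" for pkt_id in self.expected]
        return self.errors + missing
```

Calling final_check() at the end of a test returns an empty list only when all three routing properties held for every injected packet.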
The different BFMs may be configured to generate different kinds of traffic scenarios so as to reach the desired functional coverage. For example, by properly configuring a test file, it is possible to generate packets that are sourced from 3 ports and all have the same destination port; this is useful for stressing arbiters and output queues as well as for hitting some corner-case coverage points. The developed environment discovered a number of bugs that could not be found with hand-written HDL testbenches.

To achieve full code and functional coverage by exercising some corner cases, for some Router configurations a deeper level of checking has been implemented by means of internal probes (e.g. monitoring internal DUT signals). Indeed, while some DUT functionalities, such as data integrity from one port to another, may be easily checked/covered without knowledge of timing, check rules for other DUT functionalities depend on the timing of what is happening on the various ports; as an example, the buffer status (empty/full) depends on the rate at which packets are injected into the DUT, besides the destination of those packets; the correct behavior of an arbitration algorithm also depends on the timing with which the different packets accessing the same resource are served by the Router.
Therefore, to achieve 100% functional verification, the basic coverage-driven approach should be enhanced either through the design of a software golden model able to predict timing-dependent properties (a very time-consuming approach) or alternatively, as we have done in this work, by using internal probes to rapidly obtain detailed information about the operation of internal router blocks and state machines. An internal probe monitors a hardware signal within the router: for example, monitoring the inputs of an arbiter block makes it possible to gather extensive coverage information about arbitration scenarios and the successful application of a specific arbitration algorithm. The drawback of this approach is that using probes requires a deep knowledge of the Router VHDL implementation and is a hard-to-reuse solution. Figure 8.7 shows the UML diagram of the probes portion of the Router eVC. Thanks to these additional eVC units, some internal arbitration and QoS mechanisms have been verified, such as the LRU arbitration of the
[Figure 8.7: UML class diagram. router_internals_u holds a list of vn_layer_u (1..2) and a list of link_scheduler_u (1..4); each vn_layer_u holds output_queue_u, arbiter_type_u and input_buffer_u units (1..4 each).]
Fig. 8.7 UML diagram of the probes portion of the Router eVC
8 Coverage-Driven Verification of HDL IP Cores 11
Layout: T1 Standard SC Book ID: 194217_1_En Book ISBN: 978-94-007-0637-8Chapter No.: 8 Date: 18-1-2011 Page: 11/15
Au
tho
r P
roo
f
UN
CO
RR
ECTE
DPR
OO
F
327 link scheduler; furthermore, this allowed for the implementation of additional328 coverage points related to arbiters and internal buffers.329 To be noted that many checks must be performed independently for each port,330 so the eVC architecture needs to be highly modular and configurable. It is worth331 noting that all the checks and coverage points may be selectively enabled to meet332 various user requirements. For example, if there is no interest in testing queue333 utilization, simulations can be speeded up by disabling the internal probes portion334 of the eVC (which works with cycle level accuracy), leaving only the transaction335 level architecture. It is also possible to disable a single check; for example, by336 disabling the check about routing the Router eVC can be used as traffic monitor on337 a single US or DS bus. To conclude the verification flow, several coverage points338 have been defined to describe all possible traffic scenarios that can stress the339 Router; they may be conceptually grouped into the three main categories reported340 in Table 8.1: Traffic flow, queue utilization, arbitration mechanisms. Some341 coverage points (such as the ones in queue utilization) are available only as part of342 the internal probes eVC extension, because it is possible to easily retrieve some343 information only accessing the micro-architecture of some blocks.
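The idea of selectively enabled checks and probe-based coverage can be illustrated with a small Python model. This is only a sketch and not the e-language eVC described in this chapter: the names `CoverGroup` and `RouterMonitor`, the callback names, and the choice of three request lines per arbiter are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class CoverGroup:
    """A named set of bins; a bin counts as hit once sample() sees its value."""
    name: str
    bins: frozenset
    enabled: bool = True
    hits: set = field(default_factory=set)

    def sample(self, value):
        # Disabled groups ignore every sample, which models switching a group off.
        if self.enabled and value in self.bins:
            self.hits.add(value)

    def percent(self):
        return 100.0 * len(self.hits) / len(self.bins)


class RouterMonitor:
    """Hypothetical monitor with a transaction-level part and an optional probe part."""

    def __init__(self, n_inputs=3, use_probes=True, check_routing=True):
        self.check_routing = check_routing
        # Transaction-level coverage: packet length in flits (1 to 10).
        self.packet_len = CoverGroup("packet_length", frozenset(range(1, 11)))
        # Probe-level coverage: every non-empty combination of concurrent
        # requests seen on the arbiter inputs (n_inputs is an assumption).
        combos = frozenset(c for c in product((0, 1), repeat=n_inputs) if any(c))
        self.requests = CoverGroup("concurrent_requests", combos, enabled=use_probes)
        self.errors = []

    def on_packet(self, n_flits, expected_port, actual_port):
        """Called once per observed packet (transaction level)."""
        self.packet_len.sample(n_flits)
        if self.check_routing and expected_port != actual_port:
            self.errors.append((expected_port, actual_port))

    def on_arbiter_cycle(self, request_lines):
        """Called once per clock cycle with the probed arbiter request lines."""
        self.requests.sample(tuple(request_lines))
```

Creating the monitor with `use_probes=False` leaves only the transaction-level coverage active, and `check_routing=False` turns it into a plain traffic monitor, mirroring the two configuration examples given in the text.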
8.5 Results and Conclusions
The developed eVC has been used to test several router configurations; enough test cases have been implemented to achieve 100% functional and code coverage for all points defined in the coverage plan. Table 8.1 shows some details about the coverage points in the plan, with their corresponding range (i.e. the values that need to be induced in the DUT) and the result achieved. In Table 8.1 the coverage points for which probes have been used are marked "(probes)". Once the eVC software had been developed, these results were achieved in a relatively short time thanks to the constrained-random traffic generation.

After functional and code verification by simulation, the effectiveness of the verification flow was also demonstrated by several STNoC platforms implemented on FPGA, which were successfully emulated with real-life scenarios where multiple ARM11 processors and memory modules were connected through the Spidergon STNoC.

The Router eVC has also been used as a component for the functional verification of platforms involving several NIs and Routers; different eVCs, all developed according to the same paradigm, were put together to obtain a complex platform verification environment.

Finally, it is worth noting that the Spidergon STNoC architecture has been chosen as the inter-tile interconnect of the SHAPES MPSoC European project [17, 19, 20] involving ATMEL, University of Pisa and STMicroelectronics. In SHAPES, 8 identical IP tiles building a complex scalable multiprocessor are interconnected by means of a packet-switched Spidergon STNoC network. A typical SHAPES tile contains a VLIW floating-point DSP, a RISC processor based on the ARM926 core, a Distributed Network Processor (DNP) for extra-tile communication, the interface to the NoC (NI), on-chip memories and a set of peripherals for off-chip communication. The back-end of the 8-tile SHAPES architecture in 45 nm CMOS technology has been successfully realized.

This work extends the WISES 2009 conference paper [21].

Table 8.1 Implemented coverage groups and test results

| Category | Coverage description | Range | Results |
|---|---|---|---|
| Traffic flow | Routing paths | Routing paths between each pair of ports | 100% (8 groups, 1 per port, per VN) |
| Traffic flow | Packet length (2 levels cross, combination of routing paths, number of flits in a packet) | 1–10 | 100% (8 cross groups, 1 per port, per VN) |
| Traffic flow | Routing fields in packet header (4 levels cross, combination of DS ports, directions, destination ID^a) | All routing decision scenarios on all DS ports | 100% (2 cross groups, 1 per VN) |
| Queue utilization | Input Buffer utilization (probes) | Empty–Full | 100% (8 groups, 1 per port, per VN) |
| Queue utilization | Output Queue utilization (probes) | Empty–Full | 100% (8 groups, 1 per port, per VN) |
| Queue utilization | Bypass operation (probes) | Queue bypassed/not bypassed | 100% (8 groups, 1 per port, per VN) |
| Queue utilization | Credit availability (probes) | 0…5 | 100% (4 groups, 1 per US port) |
| Arbitration mechanisms | Fair Bandwidth Allocation (FBA) (3 levels cross cover: combination of US port, FBA status of incoming packet, FBA status of outgoing packet) | FBA transitions (0→0, 0→1, 1→0, 1→1) | 100% (8 cross groups, 1 per port, per VN) |
| Arbitration mechanisms | Concurrent requests scenarios (probes) | All possible combinations of concurrent packet requests to the arbiters | 100% (8 groups, 1 per VN) |

^a The directions and the destination ID are network layer header fields which are encoded in the head of each packet and are used by the Router to take routing decisions
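As an illustration of how constrained-random traffic generation can close a cross-coverage item such as the packet length entry of Table 8.1, the following Python sketch keeps generating random packets until every combination of routing path and packet length has been observed. It is a toy model under stated assumptions, not the authors' test suite: the port names, the rule that a packet never leaves from the port it entered, and the function names are all hypothetical.

```python
import random
from itertools import product

PORTS = ("P0", "P1", "P2", "P3")   # hypothetical names for a 4-port router
LENGTHS = range(1, 11)             # flits per packet, range 1-10 as in Table 8.1


def random_packet(rng):
    """Draw one packet under the constraints: legal length, source != destination."""
    src = rng.choice(PORTS)
    dst = rng.choice([p for p in PORTS if p != src])  # assumed constraint
    return src, dst, rng.choice(LENGTHS)


def run_until_covered(seed=1, max_packets=100_000):
    """Generate packets until the (path, length) cross is fully covered."""
    rng = random.Random(seed)      # fixed seed keeps the run reproducible
    goal = {(s, d, n)
            for s, d in product(PORTS, PORTS) if s != d
            for n in LENGTHS}
    hit = set()
    sent = 0
    while hit != goal and sent < max_packets:
        hit.add(random_packet(rng))
        sent += 1
    return sent, 100.0 * len(hit) / len(goal)


if __name__ == "__main__":
    sent, pct = run_until_covered()
    print(f"{pct:.0f}% of the cross bins covered after {sent} packets")
```

In a real environment each generated packet would be driven into the DUT and the coverage would be sampled from what the monitors observe; here the generator output is sampled directly to keep the sketch short.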
References
1. Bhadra J, Abadir MS, Ray S, Wang L-C (2007) A survey of hybrid techniques for functional verification. IEEE Des Test Comput 24(2):112–122
2. Bartley MG, Galpin D, Blackmore T (2002) A comparison of three verification techniques: directed testing, pseudo-random testing and property checking. In: Proc. 39th Design Automation Conf. (DAC02), ACM Press, New York, 2002, pp 819–823
3. OSCI (2003) SystemC Verification standard specification version 1.0e. http://www.systemc.org, May, 2003
4. Yuan J, Shen J, Abraham J, Aziz A (1997) On combining formal and informal verification. In: Proc. Int’l Conf. Computer-Aided Verific., LNCS 1254, Springer, Heidelberg, 1997, pp 376–387
5. Sumners R, Bhadra J, Abraham J (2000) Automatic validation test generation using extracted control models. In: Proc. IEEE Int’l Conf. VLSI Design, pp 312–320
6. Eghbal A et al (2009) Fault injection-based evaluation of a synchronous NoC router. In: Proc. IEEE IOLTS’09, pp 212–214
7. Mariani R, Boschi G (2007) A systematic approach for Failure Modes and Effects Analysis of System-On-Chips. In: Proc. IEEE IOLTS’07, pp 187–188
8. Berman V (2005) An update on IEEE P1647: the e system verification language. IEEE Des Test Comput 22(5):484–486
9. Murphy G, Schwanninger C (2006) Guest editors’ introduction: aspect-oriented programming. IEEE Softw 23(1):20–23
10. Palnitkar S (2003) Design verification with e. Prentice Hall, Upper Saddle River
11. Al-Badi R et al (2009) A parameterized NoC simulator using OMNet++. In: Proc. IEEE ICUMT’09, pp 1–7
12. Wen H-H et al (2009) Design of an on-line configurable traffic generator for NoC. In: Proc. IEEE ASID, pp 556–559
13. Benini L, De Micheli G (2002) Networks on chip: A new SoC paradigm. IEEE Comput 35(1):70–78
14. Muttersbach J, Villiger T, Fichtner W (2000) Practical design of globally-asynchronous locally-synchronous systems. In: IEEE ASYNC pp 52–59
15. Gyu Lee H et al (2007) On-chip communication architecture exploration: a quantitative evaluation of point-to-point, bus, and network-on-chip approaches. ACM Transactions on Design Automation of Electronic Systems 12(3)
16. Grammatikakis MD, Coppola M, Maruccia G, Locatelli R, Pieralisi L (2008) Design of cost-efficient interconnect processing units: Spidergon STNoC. CRC Press, Boca Raton
17. Vitullo FM, L’insalata NE, Petri E, Saponara S, Fanucci L, Casula M, Locatelli R, Coppola M (2008) Low-complexity link microarchitecture for mesochronous communication in networks on chip. IEEE Trans Comput 57:1196–2203
18. Rahman M et al (2009) Efficient 2DMesh Network on Chip (NoC) considering GALS approach. In: Proc. IEEE ICCIT, pp 841–846
19. Paolucci PS, Lo Cicero F, Lonardo A, Perra M, Rossetti D, Sidore C, Vicini P, Coppola M, Raffo L, Mereu G, Palumbo F, Fanucci L, Saponara S, Vitullo F (2007) Introduction to the tiled HW architecture of SHAPES. Proc Int Conf Des Autom Test Eur 1:77–82
20. Paolucci PS, Jerraya A, Leupers R, Thiele L, Vicini P (2006) SHAPES: a tiled scalable software hardware architecture platform for embedded systems. In: Proc. Fourth Int’l Conf. Hardware/Software Codesign and System Synthesis, pp 167–172
21. Saponara S et al (2009) A reusable coverage-driven verification environment for Network-on-Chip communication in embedded system platforms. In: IEEE WISES 2009, pp 71–77