

SHAPES

scalable Software Hardware Architecture Platform for Embedded Systems

Hardware Architecture

Atmel Roma, INFN Roma, ST Microelectronics Grenoble, Università di Cagliari, Università di Pisa

Spidergon-STNoC

Spidergon-STNoC (S-STNoC) is the Network on Chip (NoC) technology currently being developed at STMicroelectronics. It consists of a network of micro-routers interconnected in a Spidergon topology.

The main task of S-STNoC in SHAPES is to provide the inter-tile on-chip communication services. Different tiles can be connected on the same silicon chip through the DNP to S-STNoC interconnection. The DNP has a port connected to the S-STNoC Network Interface (S-STNoC NI), a block responsible for converting packets travelling from the DNP to the S-STNoC into an S-STNoC-compatible packet format, and packets travelling from the S-STNoC to the DNP into the DNP packet format.


References

www.shapes-p.org

Paolucci, P. S., Jerraya, A. A., Leupers, R., Thiele, L., and Vicini, P. 2006. "SHAPES: a tiled scalable software hardware architecture platform for embedded systems." In Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis (Seoul, Korea, 2006). CODES+ISSS '06. ACM Press, 167-172.

The DNP receives commands issued by the masters (RISC or DSP processors) on its AMBA AHB slave port and uses up to two AHB master ports to sustain the data flow between the communication source and destination.

For intra-tile communications the set of master and slave interface ports would suffice. For inter-tile communication, which requires the cooperation of at least two DNPs hosted on different tiles, the DNP is equipped with a set of inter-tile interfaces.

A first set of those interfaces is used for off-chip communications on the 3DT (3D toroidal next-neighbour topology) and on a collective communication tree (CTV). A second set of interfaces will be used for on-chip communication through a dedicated NoC architecture.

The DNP is packet based: it sends, receives and routes packets with a fixed-size header and a variable-size payload.
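To make the fixed-header, variable-payload idea concrete, here is a minimal Python sketch. The header layout (three destination coordinates plus a payload length) and the names `HEADER`, `pack_packet` and `unpack_packet` are assumptions made for illustration only; the poster does not specify the actual DNP packet format.

```python
import struct

# Hypothetical header layout (NOT the real DNP format):
# destination tile coordinates x, y, z (one byte each)
# followed by the payload length in bytes (16-bit, little-endian).
HEADER = struct.Struct("<BBBH")


def pack_packet(dst, payload: bytes) -> bytes:
    """Prepend the fixed-size header to a variable-size payload."""
    x, y, z = dst
    return HEADER.pack(x, y, z, len(payload)) + payload


def unpack_packet(buf: bytes):
    """Split a received buffer into destination and payload."""
    x, y, z, length = HEADER.unpack_from(buf)
    payload = buf[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return (x, y, z), payload
```

Because the header size is constant, a receiver can always read the header first and learn from it how many payload bytes follow.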

SHAPES Tiled Architecture

The key objectives for deep sub-micron technologies are minimizing wire-delay problems and managing design complexity in projects comprising several hundred million gates.

The challenge is to find a scalable HW/SW design style for future CMOS technologies.

Tiled architectures suggest a possible path: “small” processing tiles connected by “short (next neighbour) wires”.

We propose a tiled architectural strategy that employs building blocks that are scalable on future silicon technologies.

The RDT Tile

The basic SHAPES tile, the RDT (RISC DSP Tile), will be equipped with:

• one RISC microcontroller;

• one VLIW FP DSP (mAgicV);

• one DNP (Distributed Network Processor);

• a DPM (Distributed Program Memory);

• a DDM (Distributed Data Memory);

• the POT (a set of Peripherals On Tile);

• one interface for the DXM (Distributed External Memory) owned by each tile.

Distributed Network Processor

The INFN Distributed Network Processor (DNP) provides data transport functionalities to the tile, performing inter-tile communications both on-chip and off-chip.

The DNP also acts as a DMA controller for intra-tile communications (e.g. between the DSP internal data memory DDM and the tile memory DXM).

mAgicV VLIW DSP

The ATMEL mAgicV VLIW DSP is a low-power, high-performance numerical processor operating on IEEE 754 40-bit extended-precision floating-point data (10 operations per cycle) and 32-bit integer data (16 operations per cycle).

Power consumption is expected to be 200 mW/GFlops at 65 nm.
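As a rough worked example, the figures above can be combined to estimate peak throughput and power. The poster does not state a clock frequency, so `clock_ghz` below is a purely hypothetical input, and the sketch assumes that the 200 mW/GFlops figure applies at the peak floating-point rate.

```python
MW_PER_GFLOPS = 200    # from the text: expected at 65 nm
FP_OPS_PER_CYCLE = 10  # from the text: floating-point operations per cycle


def peak_gflops(clock_ghz: float) -> float:
    """Peak floating-point rate for a given (assumed) clock frequency."""
    return FP_OPS_PER_CYCLE * clock_ghz


def power_mw(clock_ghz: float) -> float:
    """Estimated power at peak rate, assuming power scales with GFlops."""
    return MW_PER_GFLOPS * peak_gflops(clock_ghz)
```

For instance, under these assumptions a hypothetical 0.5 GHz clock would give a peak of 5 GFlops and roughly 1 W.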

mAgicV is equipped with one AHB master port and one AHB slave port for system-on-chip integration.

It has 256 data registers, 64 address registers, 10 independent arithmetic operating units, 2 independent address generation units and a DMA engine driving the AHB Master Port.

[Figure: DNP interface ports. Six 3DT ports (X+, X-, Y+, Y-, Z+, Z-) to forward/receive inter-tile off-chip packets; one NoC port to forward/receive inter-tile on-chip packets; one collective communication port; one BUS Slave port to receive commands from the RISC and DSP; one BUS Master port to read from intra-tile memories; one BUS Master port for simultaneous intra-tile memory writes.]

The DNP uses a deterministic routing policy to implement communications on the 3D torus network: a fixed rule determines how a packet traverses the network, hopping from one DNP to the next until it reaches its final destination.
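One common deterministic rule on a torus is dimension-order routing: correct the X coordinate first, then Y, then Z, always going the shorter way around each ring. The poster does not say which rule the DNP actually uses, so the following Python sketch is an illustration under that assumption; the function names and the tie-breaking choice are hypothetical.

```python
def torus_step(cur, dst, size):
    """Return the output port for the next hop, or None on arrival.

    Dimension-order rule (an assumption, not the documented DNP rule):
    resolve X first, then Y, then Z.
    """
    for axis, name in enumerate("XYZ"):
        n = size[axis]
        delta = (dst[axis] - cur[axis]) % n  # hops needed in the + direction
        if delta == 0:
            continue
        # Take the shorter way around the ring; ties go to the + direction.
        return name + ("+" if delta <= n - delta else "-")
    return None


def route(src, dst, size):
    """List the ports traversed from src to dst on a torus of given size."""
    cur, path = list(src), []
    while True:
        port = torus_step(cur, dst, size)
        if port is None:
            return path
        axis = "XYZ".index(port[0])
        step = 1 if port[1] == "+" else -1
        cur[axis] = (cur[axis] + step) % size[axis]
        path.append(port)
```

Because the next hop depends only on the current and destination coordinates, every packet between the same pair of tiles follows the same path, which is what makes the policy deterministic.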

DNP routing is deadlock-free, thanks to the use of virtual channels.

The DNP may host a small processor to implement some advanced features in a simple way, avoiding complex VHDL coding while allowing for easy upgrades and bug fixing.

As for programming models, the DNP will support three kinds of network API: Systolic Communications, Remote Direct Memory Access (RDMA) and Message Passing Interface (MPI).

Networks-on-Chip are mainly based on three kinds of component:

1) Routers

2) Network Interfaces

3) Physical Links

Each Router is point-to-point connected to three routers and to one Network Interface.
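In a Spidergon topology the three neighbouring routers are usually the clockwise neighbour, the counter-clockwise neighbour and the diametrically opposite ("across") router on a ring with an even number of nodes. The Python sketch below computes them; the node numbering and the function name are assumptions made for illustration.

```python
def spidergon_neighbours(node: int, n: int):
    """Return (clockwise, counter-clockwise, across) neighbours of a router.

    Assumes routers are numbered 0..n-1 around the ring and n is even.
    """
    if n % 2 != 0 or n < 4:
        raise ValueError("Spidergon needs an even number of nodes (>= 4)")
    if not 0 <= node < n:
        raise ValueError("node index out of range")
    clockwise = (node + 1) % n
    counter_clockwise = (node - 1) % n
    across = (node + n // 2) % n
    return clockwise, counter_clockwise, across
```

With eight routers, for example, router 0 is linked to routers 1, 7 and 4, which matches the three router-to-router connections described above.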

NoCs hide the interconnect-specific implementation details from the IP resources they interface. All that an external IP needs in order to transmit data through the network is a specifically designed Network Interface.

Figure captions: SHAPES tiled hardware architecture; The SHAPES RDT Tile; The Interfaces of the Distributed Network Processor; mAgicV Floating-Point VLIW DSP; Spidergon NoC Topology.