Tightly-Coupled Multi-Layer Topologies for 3D NoCs
Hiroki Matsutani (Keio Univ, JAPAN), Michihiro Koibuchi (NII, JAPAN), Hideharu Amano (Keio Univ, JAPAN)


Outline
• Network-on-Chip (NoC)
  – Typical 2D topologies
  – 2D vs. 3D
• XNoTs
  – New class of 3D topologies
  – Definition, Examples
  – Deadlock-free routing
• Evaluations
  – Throughput
  – Area, Energy consumption

Network-on-Chip (NoC)
• Tile architectures
  – MIT RAW [Taylor, Micro’02]
  – Texas U. TRIPS [Burger, Computer’04]
  – Intel 80-tile NoC [Vangal, ISSCC’07]
• Various topologies
  – Mesh, Torus, Tree
  – Large impact on energy, cost, and performance

[Figure: An example of tile architecture (ASPLA 90nm CMOS process)]
Tile = Processing core + On-chip router
Packet switched network

2D Topologies: Mesh & Torus
• 2-D Mesh
  – e.g., RAW [Taylor, IEEE Micro’02]
• 2-D Torus
  – 2x bandwidth of mesh
[Figure: 2-D Mesh and 2-D Torus (legend: Router, Core)]

2D Topologies: Fat Tree
• Fat Tree (p, q, c)
  – p: # of upward links
  – q: # of downward links
  – c: # of core ports
[Figure: Fat Tree (2,4,2) and Fat Tree (2,4,1) (legend: Router, Core)]

Network topology should be carefully selected so as to meet the requirements of the application

2D NoC vs. 3D NoC
• 2D NoCs
  – Long wires, distance
  – Wire delay
  – Packets consume power at links according to their wire length
• 3D NoCs
  – Several small wafers or dice are stacked
• Vertical link
  – Micro bump [Ezaki, ISSCC’04]
  – Through-wafer via [Burns, ISSCC’01]
  – Very short (10-50um)

Long horizontal wires in 2D NoCs can be replaced by very short vertical links in 3D NoCs

3D NoCs that have heterogeneous tiers
• Different circuits on each tier
• Different topologies on each tier
[Figure: three stacked tiers (Tier-1, Tier-2, Tier-3); circuit labels: Processor array, Cache memory, Custom logic; topology labels: Fat Tree(2,4,1), Ring, 2D-Mesh]
(*) A tier refers to a wafer or a die in 3D ICs

How to connect different planar topologies?
How to route packets in heterogeneous 3D NoCs?

We propose a class of topologies for heterogeneous 3D NoCs

Outline
• Network-on-Chip (NoC)
  – Typical 2D topologies
  – 2D vs. 3D
• XNoTs
  – New class of 3D topologies
  – Definition, Examples
  – Deadlock-free routing
• Evaluations
  – Throughput
  – Area, Energy consumption

Multiple network layers are tightly connected by vertical crossbar switches

Existing vertical link designs
• Vertical bus [Li, ISCA’06]
  – Merit: Small # of vertical links
  – Demerit: Low peak performance
  – Single bus (only a single transfer at a time)
• Vertical crossbar [Kim, ISCA’07]
  – Merit: Similar performance to true crossbar
  – Merit: Reasonable # of vertical links
  – Segmented buses (multiple transfers at the same time)

We assume crossbar-based vertical links for 3D NoCs

XNoTs: Xbar-connected Network-on-Tiers
• XNoTs
  – Multiple planar topologies
  – Connected by crossbars
• Network-on-Tier (NoT)
  – A planar topology
  – Implemented on a tier
  – Bottom NoT provides connectivity to all cores
[Figure: XNoTs as a stack of three Network-on-Tiers; a mesh-based NoT (legend: Router, Core)]
Each core and router has a port for a vertical connection

XNoTs: Xbar-connected Network-on-Tiers (cont.)
[Figure: a mesh-based NoT and mesh-based XNoTs, with a vertical crossbar at each pillar (legend: Router, Core)]
All routers and cores in the same pillar are connected by a crossbar

Examples: all tiers have the same topology
[Figure: Mesh-based XNoTs, Ring-based XNoTs, Tree-based XNoTs (perspective and side views)]
All routers and cores in the same pillar are connected by a crossbar

Examples: Heterogeneous XNoTs (1)
• Different topologies are used in each tier
[Figure: Fat Tree(2,4,1), Ring, and 2D-Mesh tiers (perspective and side view)]

Examples: Heterogeneous XNoTs (2)
• Not all tiers provide connectivity to all cores
  – Except for the bottom tier (i.e., “escape” tier)
[Figure: Top tier (some links are disconnected, so some cores have no connectivity on that tier); Bottom tier (full connectivity to all cores)]
Where a tier has no connectivity, packets are transferred via the bottom tier (tier-1)
(*) Only the bottom tier must provide full connectivity to all cores

XNoTs: Deadlock-free routing
• Intra-tier comm. (X and Y directions)
  – Existing deadlock-free routing is used within a tier, e.g., dimension-order routing (DOR)
  – Only the bottom tier (tier-0) must guarantee connectivity to all cores
• Inter-tier comm. (Z direction)
  – Turns from a lower tier to a higher tier are prohibited
  – Unless the next hop is the final destination
[Figure: Mesh-based XNoTs, top view and side view, showing an allowed path (OK) and a prohibited path (NG)]
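The inter-tier rule above can be sketched as a single predicate. This is only an illustrative reading of the slide, not the authors' router logic: the function name and arguments are hypothetical, tiers are assumed to be numbered from the bottom (tier 0 is the bottom tier), and injection from the source core into a tier is assumed to be handled elsewhere.

```python
def vertical_move_allowed(cur_tier: int, next_tier: int,
                          next_hop_is_destination: bool) -> bool:
    """Check one Z-direction move against the inter-tier rule on the slide.

    Assumption: a larger tier index means a higher tier, and tier 0 is the
    bottom tier that connects all cores.
    """
    if next_tier <= cur_tier:
        # Moving down (or staying on the same tier) is not restricted.
        return True
    # Moving from a lower tier to a higher tier is prohibited,
    # unless that hop reaches the final destination.
    return next_hop_is_destination


if __name__ == "__main__":
    print(vertical_move_allowed(2, 0, False))  # downward move
    print(vertical_move_allowed(0, 2, False))  # upward move, not the last hop
    print(vertical_move_allowed(0, 2, True))   # upward move as the last hop
```

Because every upward move ends the route, a packet can never go down and then continue routing on a higher tier, which is the intuition behind the deadlock-freedom claim; the paper should be consulted for the actual proof.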

XNoTs: Path selection (random)
• XNoTs routing
  – Multiple tiers are available → Alternative paths are available
• Path selection policy
  – How to select a single path?
  – Random selection → Good load balancing
[Figure: Mesh-based XNoTs, top view and side view, with three alternative 5-hop paths]

We also proposed some policy-based path selection policies. For more detail, please refer to the paper.
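Random path selection can be sketched as picking one tier uniformly among those that offer a path to the destination. The helper below is a hypothetical illustration: how the candidate tiers are computed is not described on the slide and is left to the caller.

```python
import random
from typing import Optional, Sequence


def select_tier_random(candidate_tiers: Sequence[int],
                       rng: Optional[random.Random] = None) -> int:
    """Pick one tier uniformly at random among the usable tiers.

    candidate_tiers is assumed to contain only tiers on which the
    destination is reachable (hypothetical input, computed elsewhere).
    """
    if not candidate_tiers:
        raise ValueError("at least one candidate tier is required")
    if rng is None:
        rng = random.Random()
    return rng.choice(list(candidate_tiers))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    print([select_tier_random([0, 1, 2, 3], rng) for _ in range(8)])
```

Choosing per packet spreads traffic over the tiers, which matches the load-balancing motivation stated above.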

Outline
• Network-on-Chip (NoC)
  – Typical 2D topologies
  – 2D vs. 3D
• XNoTs
  – New class of 3D topologies
  – Definition, Examples
  – Deadlock-free routing
• Evaluations
  – Throughput
  – Area, Energy consumption

Evaluation: Target topologies (64-core)
• X-Mesh
  – (4x4 Mesh) x 4 layers
• X-Torus
  – (4x4 Torus) x 4 layers
• X-FT141
  – Fat Tree(1,4,1) x 4 layers
• X-FT241
  – Fat Tree(2,4,1) x 4 layers
• X-FT441
  – Fat Tree(4,4,1) x 4 layers
Fat Tree (p, q, c): p = # of upward links, q = # of downward links, c = # of core ports
[Figure: X-Mesh]

These five topologies are compared with 3D Mesh/Torus

Throughput: Simulation environment
• Grid-based topologies
  – 3D-Mesh, X-Mesh
  – 3D-Torus, X-Torus
  – Dimension-order routing
• Tree-based topologies
  – X-FT141, X-FT241
  – X-FT441
  – Up*/down* routing
• Path selection policy
  – Random

Parameter     Value
Packet size   16-flit (1-flit header)
Buffer size   1-flit per channel
Switching     Wormhole switching
Latency       3-cycle per 1-hop
Traffic       Uniform random
(Two virtual channels for tori)

[Figure: X-Mesh (4x4x4)]

Throughput: Simulation results
[Figure: Grid-based XNoTs; curves for X-Torus, X-Mesh, 3D-Torus, 3D-Mesh]
[Figure: Tree-based XNoTs; curves for X-FT441, X-FT241, X-FT141, 3D-Torus, 3D-Mesh]
No degradation (X-Mesh = 3D-Mesh, X-Torus = 3D-Torus)

Network logic area
• Network area
  – Routers & NIs
  – Inter-tier vias
• Synthesis of NoC
  – 64-core (16-core x 4)
  – 0.18um CMOS
• Router architecture
  – 1-flit = 32-bit
  – Wormhole switching
  – 4-stage pipeline
• Inter-tier vias [Li, ISCA’06][Burns, ISSCC’01]
  – 1-10um square
  – 25um^2 per layer per 1-bit signal

Inter-tier via area is calculated according to # of vertical links
[Figure: Typical wormhole router (input ports with buffers, arbiter, crossbar) [Matsutani, IPDPS’07]]
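As a rough illustration of how via area could be derived from the number of vertical links, the sketch below reads the slide's figure as 25 um^2 per layer per 1-bit signal. The function and its parameters are hypothetical; how links, signal widths, and crossed layers are counted in the actual evaluation is not stated on the slide.

```python
VIA_AREA_UM2_PER_BIT_PER_LAYER = 25.0  # read from the slide (assumed um^2)


def inter_tier_via_area_mm2(num_vertical_links: int,
                            bits_per_link: int,
                            layers_crossed: int) -> float:
    """Estimate the total inter-tier via area in mm^2.

    All three arguments are illustrative inputs, not values from the talk.
    """
    area_um2 = (num_vertical_links * bits_per_link * layers_crossed
                * VIA_AREA_UM2_PER_BIT_PER_LAYER)
    return area_um2 / 1e6  # 1 mm^2 = 1e6 um^2


if __name__ == "__main__":
    # One 32-bit link crossing one layer: 32 * 25 um^2 = 800 um^2.
    print(inter_tier_via_area_mm2(1, 32, 1))
```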

Network logic area: Results
[Figure: Network logic area [mm^2]]
3D Mesh/Torus require 2 ports for vertical links (i.e., up & down)
XNoTs require only 1 port for vertical links (but the # of crossbars increases)

Energy: NoC’s energy model
• Ave. flit energy $E_{flit}$
  – Send 1-flit to dest.
  – How much energy [J]?
  – $E_{flit} = w \, H_{ave} \, (E_{sw} + E_{link})$
• Parameters
  – 6mm square chip
  – 64-core (16-core x 4)
  – 0.18um CMOS
• Switching energy $E_{sw}$
  – 1-bit switching @ Router
  – Gate-level sim
  – 1.13 [pJ / hop]
• Link energy $E_{link}$
  – 1-bit transfer @ Link
  – 0.67 [pJ / mm]
• Via energy [Davis, DToC’05]
  – 4.34 [fF / via]
[Figure: 6mm square chip]
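A minimal numeric sketch of the model, reading the slide's equation as E_flit = w × H_ave × (E_sw + E_link). The per-hop link length is not given on the slide, so the 1.5 mm default (6 mm divided by 4 tiles per side) is a hypothetical value, every hop is treated as a horizontal link, and via energy is ignored.

```python
def average_flit_energy_pj(flit_width_bits: int,
                           avg_hop_count: float,
                           e_sw_pj_per_hop: float = 1.13,
                           e_link_pj_per_mm: float = 0.67,
                           link_length_mm: float = 1.5) -> float:
    """Average energy to deliver one flit, in pJ.

    e_sw_pj_per_hop and e_link_pj_per_mm are the per-bit values from the
    slide; link_length_mm is an assumed per-hop wire length.
    """
    e_link_pj_per_hop = e_link_pj_per_mm * link_length_mm
    return flit_width_bits * avg_hop_count * (e_sw_pj_per_hop
                                              + e_link_pj_per_hop)


if __name__ == "__main__":
    # 32-bit flit, 4.0 average hops: 32 * 4.0 * (1.13 + 1.005) pJ
    print(average_flit_energy_pj(32, 4.0))
```

Since the energy scales linearly with the average hop count, a topology with fewer hops per packet consumes proportionally less energy per flit under this model.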

Energy: Simulation results
[Figure: Ave. flit energy $E_{flit}$ [pJ]]
Hop count is short in XNoTs → low power

Summary: 3D topologies - XNoTs
• Requirements
  – Different circuits on each layer
  – Different topologies on each layer
  – How to connect/route them?
• XNoTs
  – Tiers are connected by crossbars
  – Arbitrary tiers can be stacked
• Current problem / future work
  – We assumed full crossbar as a baseline
  – More efficient implementation has been proposed by [Kim, ISCA’07]
  – We must revise router architecture
[Figure: Fat Tree, Ring, and 2D-Mesh tiers]


Thank you for your attention

XNoTs: Path selection (QoS)
• Control packets
  – In-order delivery is required
  – Deterministic routing
  – Control packets use tier-1
• Data packets
  – In-order delivery is not required
  – Large data streams
  – Adaptive routing
  – Data packets use tier-2 or tier-3
[Figure: XNoTs (side view); one tier uses dimension-order (deterministic) routing, two tiers use Duato’s Protocol (adaptive)]

Various QoS controls are possible by path selection algorithm
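The QoS policy above amounts to mapping a packet class to a set of tiers. The sketch below follows the tier numbers on the slide (control on tier-1, data on tier-2 or tier-3); the class names, the constants, and the random choice between the two data tiers are assumptions made for illustration.

```python
import random
from enum import Enum
from typing import Optional


class PacketClass(Enum):
    CONTROL = "control"  # in-order delivery required
    DATA = "data"        # in-order delivery not required


CONTROL_TIER = 1      # deterministic (dimension-order) routing tier
DATA_TIERS = (2, 3)   # adaptive routing tiers


def select_tier_qos(packet_class: PacketClass,
                    rng: Optional[random.Random] = None) -> int:
    """Return the tier a packet should use under the QoS policy."""
    if packet_class is PacketClass.CONTROL:
        # A single deterministic tier keeps control packets in order.
        return CONTROL_TIER
    if rng is None:
        rng = random.Random()
    # Assumption: data packets are spread over the adaptive tiers at random.
    return rng.choice(DATA_TIERS)


if __name__ == "__main__":
    print(select_tier_qos(PacketClass.CONTROL))
    print(select_tier_qos(PacketClass.DATA, random.Random(0)))
```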

XNoTs: Path selection (bottom first)
• Heat dissipation is crucial in 3D ICs
• Bottom tier
  – Close to the board (good heat dissipation property)
• Bottom tier first
  – Tier-0 is used first if there are alternative paths
[Figure: XNoTs (side view); 3D IC mounted on a board that acts as a heat sink, with the bottom tier closest to the board]

Ideal throughput: Channel bisection
• Number of unidirectional links that cross the bisection
• N-core × n-tier, where $N = 2^i \times 2^i$ (values below are for $N = 16$)

Topology   Channel bisection        1-tier  2-tier  4-tier
X-Mesh     $\min(2^{i+1} n, nN)$    8       16      32
X-Torus    $\min(2^{i+2} n, nN)$    16      32      64
X-FT141    $\min(4n, nN)$           4       8       16
X-FT241    $\min(2^{i+1} n, nN)$    8       16      32
X-FT441    $\min(4^i n, nN)$        16      32      64
3D-Mesh    $\min(2^{i+1} n, nN)$    8       16      32
3D-Torus   $\min(2^{i+2} n, nN)$    16      32      64

No degradation (X-Mesh = 3D-Mesh, X-Torus = 3D-Torus)
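The tabulated values for N = 16 (i = 2) can be reproduced with simple closed forms. The sketch below is a reconstruction: the per-tier terms are chosen to match the table, and the cap of n × N is one reading of the garbled formulas on the slide, so both should be treated as assumptions rather than the authors' exact expressions.

```python
def channel_bisection(topology: str, i: int, n: int) -> int:
    """Channel bisection (unidirectional links) for N = 2^i x 2^i cores
    per tier and n tiers, using closed forms fitted to the slide's table."""
    cores_per_tier = (2 ** i) * (2 ** i)
    links_per_tier = {
        "X-Mesh": 2 ** (i + 1),
        "X-Torus": 2 ** (i + 2),
        "X-FT141": 4,
        "X-FT241": 2 ** (i + 1),
        "X-FT441": 4 ** i,
        "3D-Mesh": 2 ** (i + 1),
        "3D-Torus": 2 ** (i + 2),
    }
    # Assumed cap: the bisection never exceeds n * N.
    return min(links_per_tier[topology] * n, n * cores_per_tier)


# Values copied from the slide for N = 16 with 1, 2, and 4 tiers.
TABLE_N16 = {
    "X-Mesh": (8, 16, 32),
    "X-Torus": (16, 32, 64),
    "X-FT141": (4, 8, 16),
    "X-FT241": (8, 16, 32),
    "X-FT441": (16, 32, 64),
    "3D-Mesh": (8, 16, 32),
    "3D-Torus": (16, 32, 64),
}

if __name__ == "__main__":
    for name, expected in TABLE_N16.items():
        computed = tuple(channel_bisection(name, 2, n) for n in (1, 2, 4))
        print(name, computed, computed == expected)
```

The identical rows for X-Mesh and 3D-Mesh, and for X-Torus and 3D-Torus, are what the slide summarizes as "no degradation".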

3D Topologies: 3D-Mesh

                    2D-Mesh (8x8=64)   3D-Mesh (4x4x4=64)
Average hop count   5.33               4.00
Channel bisection   16                 32
Number of routers   64                 64
Node degree         5                  7

[Figure: 3D-Mesh with four tiers (Tier-0, Tier-1, Tier-2, Tier-3)]