27
Pablo Abad, Pablo Prieto, Lucia Menezo, Adrian Colaso, Valentin Puente, Jose-Angel Gregorio University of Cantabria 1 TOPAZ@NOCS12 TOPAZ: An Open-Source Interconnection Network Simulator for Chip Multiprocessors and Supercomputers

TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

Embed Size (px)

Citation preview

Page 1: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

Pablo Abad, Pablo Prieto, Lucia Menezo, Adrian Colaso, Valentin Puente, Jose-Angel Gregorio

University of Cantabria

1 TOPAZ@NOCS12

TOPAZ: An Open-Source Interconnection Network Simulator

for Chip Multiprocessors and Supercomputers

Page 2: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

2

Interconnects Research: Simulation Tool

What makes a Simulation Tool better than others?

-Heterogeneous field, from supercomputers to CMP. - Highly Configurable

Flexibility

- Avoid slow simulations for first stages of research process. - But provide accurate enough results at last stages.

Accuracy Vs. Comp. Effort

-Fast learning is essential. - MAX: 1-day delay for user-mode

Ease-of-Use

TOPAZ • Interfaz to Full-System Simulation.

• Multithreaded simulation for

massive number of routers.

•Simple & Complex models.

• Dynamic accuracy simulation.

•Many out-of-the-box components.

• Very modular code, easy to

understand.

TOPAZ@NOCS12

Page 3: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

• Simulator Description

• Out-of-the-Box

• Utilization Examples

• Support & Collaboration

3

Outline

TOPAZ@NOCS12

Page 4: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

• Evolution of SICOSYS

• Object-oriented Design

• Different levels of detail

• Support for parallel execution

4

Main Features

- Implemented in C++ - 100 classes / 50,000 lines of code - High portability (C++ standard compiler)

[REF] V.Puente, J.A. Gregorio, R. Beivide, SICOSYS: an integrated framework for studying interconnection

network performance in multiprocessor systems. IEEE Comput. Soc, 2000.

TOPAZ@NOCS12

Page 5: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

• Evolution of SICOSYS

• Object-oriented Design

• Different levels of detail

• Support for parallel execution

5

Main Features

Injector

Consumer

Buffer Crossbar

Rtg. &

Arb.

N

S

W

SIMPLE ROUTER -1-C++ class description - (+) Fast Simulation - (--) Accuracy

DETAILED ROUTER -C++ class per component - (--) Slower Simulation - (++) Higher Accuracy

T1 T2 T3

TOPAZ@NOCS12

Page 6: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

Network.sgm

Simula.sgm

Router.sgm

6

Using TOPAZ (Building)

>./TPZSimul –s SIMUL_DETAILED

TPZSimul.ini

<RouterFile id="../sgm/Router.sgm" > <NetworkFile id="../sgm/Network.sgm" > <SimulationFile id="../sgm/Simula.sgm" >

<Simulation id="SIMUL_DETAILED"> <Network id="TORUS"> <SimulationCycles id=1000000> <DiscardTraffic id=10000> <TrafficPattern id="MODAL" type=”RANDOM”> <Load id=0.5> <PacketLength id=2> </Simulation>

<TorusNetwork id="TORUS" sizeX=8 sizeY=8 router="DETAILED" delay=1> <MeshNetwork id="MESH" sizeX=8 sizeY=8 router="DETAILED" delay=1>

<Router id="DETAILED" inputs=5 outputs=5 bufferSize=64 bufferControl=CT routingControl="ROUTING_ALG"> <Injector id="INJ"> <Consumerid="CONS"> <Buffer id="BUF1" type="X+" headerDelay=2> <Buffer id="BUF2" type="X-" headerDelay=2> . . . . . . . . . . . <Buffer id="BUF5" type="Node" headerDelay=2> <Routing id="RTG1" type="X+" headerDelay=1> . . . . . . . . . . . <Routing id="RTG5" type="Node" headerDelay=1> <Crossbar id="XBAR" inputs="5" outputs="5" type="CT"> <Input id=1 type="X+"> . . . . . . . . . . . <Output id=5 type="Node"> </Crossbar> <Connection id="C01" source="INJ" destination="BUF5"> . . . . . . . . . . . <Connection id="C20" source="RTG.1" destination="XBAR.1"> . . . . . . . . . . . </Router>

TOPAZ@NOCS12

Page 7: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

7

Using TOPAZ (Printing) Standalone

Throughput/Latency curves

+ Orion + Gems (or Gem5)

0

0,2

0,4

0,6

0,8

1

0 0,2 0,4 0,6 0,8 1

Acce

pte

d L

oa

d

(fli

t/cyc/rt

er)

Applied Load (flits/cycle/router)

RR ABR VCR

0

100

200

300

400

500

0 0,2 0,4 0,6 0,8 1

To

tal

La

ten

cy

(cycle

s)

Applied Load (flits/cycle/router)

0

0,01

0,02

0,03

0,04

0,05

0,06

0,07

0,08

0,09

0,1

0,11

0 50 100 150 200 250 300 350 400 450 500

Tra

ffic

fra

cti

on

(%

)

Network Latency (cycles)

6 Turns

1 Turn

0 1 2 3 4 5 6 7

0

0,1

0,2

0,3

0,4

0,5

0,6

0,7

0 1 2 3 4 5 6 7 Y Position

Lin

k U

tili

za

tio

n

X Position

0,6-0,7 0,5-0,6 0,4-0,5 0,3-0,4

0

0,2

0,4

0,6

0,8

1

0 100 200 300 400 500

Th

rou

gh

pu

t

Cycles simulated

Integer Sort

Latency Histogram Injection/Consumption/Link map

Throughput/Latency evolution

Link

Crossbar

Buffer

Arbiter

Power Breakdown

TOPAZ@NOCS12

Page 8: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

• Simulator Description

• Out-of-the-Box

• Utilization Examples

• Support & Collaboration

8

Outline

TOPAZ@NOCS12

Page 9: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

9

Out of the Box 1. Configuration Parameters

Router

Buffer Size

Buffer Delay

Packet Size

# Virtual Channels

#Physical networks

Message Types

Router Pipeline

Link Delay

Flow Control

Virtual Cut Through

Bubble Flow Control

Wormhole

Virtual Channel flow Control

Traffic

Random

Bit-Reversal

Perfect-Shuffle

Transpose Matrix

Tornado

Hot-Spot

Local

Trace-Based

Topology

Ring

Mesh (2D & 3D)

Torus (2D & 3D)

Midimew (2D)

Square Midimew (2D)

TOPAZ@NOCS12

Page 10: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

10

Out of the Box 2. Available Routers

Router REF Year Level of Detail

Adaptive Bubble Router

Deterministic Bubble Router

[14]

[15]

2001

1998

Complex & simple

Complex & simple

Rotary Router

Bufferless Router

Bidirectional Router

[19]

[21]

[22]

2007

2010

2009

Complex

Simple

Simple

Buffered Crossbar [23] 1987 Complex

Deterministic with VC (Dally)

VCTM (Dally + MC Support)

[16][17]

[18]

2001

2008

Complex & simple

Complex & simple

Pipeline Optimized [24] 2008 Complex & simple

TOPAZ@NOCS12

Page 11: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

11

Out of the Box 3. Integration with Full-System Simulation Tools

Simics

Opal (processor)

M5 (processor)

Ruby (Memory) Topaz( Network)

Wisconsin Multifacet Gems: http://research.cs.wisc.edu/gems/

Gem5 simulator system: http://gem5.org/Main_Page

TOPAZ@NOCS12

Page 12: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

• Simulator Description

• Out-of-the-Box

• Utilization Examples

• Support & Collaboration

12

Outline

TOPAZ@NOCS12

Page 13: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

13

Increasing Full-System simulation accuracy

Main System Parameters System Network

Cores 16 Cores, @4GHz, OOO, 4-wide issue, 64-entry IW, 16 outstanding Mem. Req

L2 16 MB, SNUCA, Token(B) coherence protocol, 6 msg. dependence chain

Topology 4x4 Mesh

L1 Independent I/D caches, 32KB, 4-way, 1 cycles

L2 Bank 1MB, 16-way, 5 cycles, pseudo LRU Links 1 cycle, 128bits wide

Memory 4GB, 320GB/s, 260 cycles OS Solaris 10

Broadcast Coherence Protocol (Execution Time)

TOPAZ@NOCS12

0

0,5

1

1,5

2

2,5

RU

BY

No

rma

lize

d

Ex

ecu

tio

n T

ime

RUBY TOPAZ_SIMPLE TOPAZ_COMPLEX

Page 14: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

14

Increasing Full-System simulation accuracy

Execution Time

TOPAZ@NOCS12

0

0,5

1

1,5

2

2,5

RU

BY

No

rma

lize

d

Ex

ecu

tio

n T

ime

RUBY TOPAZ_SIMPLE TOPAZ_COMPLEX

0

0,2

0,4

0,6

0,8

1

1,2

No

rma

lize

d C

ycle

s

Sim

ua

lte

d/se

cco

nd

RUBY TOPAZ_SIMPLE TOPAZ_COMPLEX

Simulation speed (cycles/second)

More Accuracy => Slower simulations On average, Ruby is ≈ 2X faster

Page 15: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

15

Improving Simulation Speed (I)

TOPAZ@NOCS12

0

0,5

1

1,5

2

2,5

RU

BY

No

rma

lize

d

Ex

ecu

tio

n T

ime

RUBY

TOPAZ_SIMPLE

TOPAZ_COMPLEX

(AI)TOPAZ_SIMPLE

0

0,2

0,4

0,6

0,8

1

1,2

No

rma

lize

d C

ycle

s

Sim

ua

lte

d/se

cco

nd

RUBY TOPAZ_SIMPLE TOPAZ_COMPLEX (AI)TOPAZ_SIMPLE (AI)TOPAZ_COMPLEX

Execution Time

Simulation speed (cycles/second)

Adaptive Interface R

UB

Y

TO

PA

Z

RU

BY

0

0,2

0,4

0,6

0,8

1

0 100 200 300 400 500

Th

rou

gh

pu

t

M Cycles simulated

Integer Sort

Page 16: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

0

0,2

0,4

0,6

0,8

1

1,2

1,4

No

rma

lize

d C

ycle

s

Sim

ua

lte

d/se

cco

nd

RUBY TOPAZ_SIMPLE TOPAZ_COMPLEX (AI)TOPAZ_SIMPLE (AI)TOPAZ_COMPLEX (P)TOPAZ_COMPLEX (P)TOPAZ_SIMPLE

16

Improving Simulation Speed (II)

TOPAZ@NOCS12

0

0,5

1

1,5

2

2,5

RU

BY

No

rma

lize

d

Ex

ecu

tio

n T

ime

RUBY

TOPAZ_SIMPLE

TOPAZ_COMPLEX

(AI)TOPAZ_SIMPLE

Execution Time

Simulation speed (cycles/second)

2-Thread Simulation

T1 T2

Page 17: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

17

Simulating thousand-node Networks

0E+00

1E+05

2E+05

3E+05

4E+05

5E+05

6E+05

1 3 5 7 9 11

Sim

ula

tio

n T

ime

(S

eco

nd

s)

Number of Cores

32K Rotuers

128K Routers

256K Routers

512K Routers

1M Routers

12-Core ( Xeon E5645) server with 54GBytes of main memory.

1.5GB

5.5GB 12GB 24GB

49GB

• 3D Torus, Bubble Router (simple), similar to IBM Blue Gene.

• Multithreaded implementation takes advantage of multicore server • Good speedup for 1 Million routers

TOPAZ@NOCS12

Page 18: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

• Simulator Description

• Out-of-the-Box

• Utilization Examples

• Support & Collaboration

18

Outline

TOPAZ@NOCS12

Page 19: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

19

Support & Collaboration

TOPAZ@NOCS12

code.google.com/p/tpzsimul

Page 20: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

20

Support & Collaboration

TOPAZ@NOCS12

Page 21: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

Thanks for your attention

Questions?

21

http://www.atc.unican.es/galerna/index.html

TOPAZ@NOCS12

Page 22: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

GARNET

22

0

0,5

1

1,5

2

2,5

3

RU

BY

No

rma

lize

d E

xe

cu

tio

n T

ime

RUBY GARNET TOPAZ_SIMPLE TOPAZ_COMPLEX

Page 23: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

MRR@HPCA09 23

T1 T2 T3 T4

Page 24: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

24

Using TOPAZ

BUILDING RUNNING PRINTING

TOPAZ@NOCS12

Page 25: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

25

Using TOPAZ (Building)

Router.sgm

Crossbar

-Router & Crossbar Ports - Buffer Size - Routing & Flow Control Policies

Network.sgm

TOPAZ@NOCS12

Page 26: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

26

Using TOPAZ (Building)

Router.sgm -Router & Crossbar Ports - Buffer Size - Routing & Flow Control Policies

Network.sgm -Network Size - Network Topology - Link Delay

Simula.sgm -Traffic Pattern - Message Size - Simulation Cycles

TOPAZ@NOCS12

Page 27: TOPAZ: An Open-Source Interconnection Network Simulator ... · Interconnection Network Simulator for Chip Multiprocessors and Supercomputers . 2 ... Throughput/Latency curves + Gems

27

Using TOPAZ (Running)

No need to re-compile - Only need to add new configutations at sgml files. - Each configuration identified by a tag at Simula.sgm. Option –s at command line to choose a specific configuration.

Different Execution Modes: - Run your simulation for XXX Cycles. - Run your simulation until YYY Messages reach their destination.

Command Line Options: - Many sgml parameters can be overwritten through command line options. - Example: - Useful for scripting.

TOPAZ@NOCS12