Upload
hoanghanh
View
233
Download
0
Embed Size (px)
Citation preview
Pablo Abad, Pablo Prieto, Lucia Menezo, Adrian Colaso, Valentin Puente, Jose-Angel Gregorio
University of Cantabria
1 TOPAZ@NOCS12
TOPAZ: An Open-Source Interconnection Network Simulator
for Chip Multiprocessors and Supercomputers
2
Interconnects Research: Simulation Tool
What makes a Simulation Tool better than others?
-Heterogeneous field, from supercomputers to CMP. - Highly Configurable
Flexibility
- Avoid slow simulations for first stages of research process. - But provide accurate enough results at last stages.
Accuracy Vs. Comp. Effort
-Fast learning is essential. - MAX: 1-day delay for user-mode
Ease-of-Use
TOPAZ • Interfaz to Full-System Simulation.
• Multithreaded simulation for
massive number of routers.
•Simple & Complex models.
• Dynamic accuracy simulation.
•Many out-of-the-box components.
• Very modular code, easy to
understand.
TOPAZ@NOCS12
• Simulator Description
• Out-of-the-Box
• Utilization Examples
• Support & Collaboration
3
Outline
TOPAZ@NOCS12
• Evolution of SICOSYS
• Object-oriented Design
• Different levels of detail
• Support for parallel execution
4
Main Features
- Implemented in C++ - 100 classes / 50,000 lines of code - High portability (C++ standard compiler)
[REF] V.Puente, J.A. Gregorio, R. Beivide, SICOSYS: an integrated framework for studying interconnection
network performance in multiprocessor systems. IEEE Comput. Soc, 2000.
TOPAZ@NOCS12
• Evolution of SICOSYS
• Object-oriented Design
• Different levels of detail
• Support for parallel execution
5
Main Features
Injector
Consumer
Buffer Crossbar
Rtg. &
Arb.
N
S
W
SIMPLE ROUTER -1-C++ class description - (+) Fast Simulation - (--) Accuracy
DETAILED ROUTER -C++ class per component - (--) Slower Simulation - (++) Higher Accuracy
T1 T2 T3
TOPAZ@NOCS12
Network.sgm
Simula.sgm
Router.sgm
6
Using TOPAZ (Building)
>./TPZSimul –s SIMUL_DETAILED
TPZSimul.ini
<RouterFile id="../sgm/Router.sgm" > <NetworkFile id="../sgm/Network.sgm" > <SimulationFile id="../sgm/Simula.sgm" >
<Simulation id="SIMUL_DETAILED"> <Network id="TORUS"> <SimulationCycles id=1000000> <DiscardTraffic id=10000> <TrafficPattern id="MODAL" type=”RANDOM”> <Load id=0.5> <PacketLength id=2> </Simulation>
<TorusNetwork id="TORUS" sizeX=8 sizeY=8 router="DETAILED" delay=1> <MeshNetwork id="MESH" sizeX=8 sizeY=8 router="DETAILED" delay=1>
<Router id="DETAILED" inputs=5 outputs=5 bufferSize=64 bufferControl=CT routingControl="ROUTING_ALG"> <Injector id="INJ"> <Consumerid="CONS"> <Buffer id="BUF1" type="X+" headerDelay=2> <Buffer id="BUF2" type="X-" headerDelay=2> . . . . . . . . . . . <Buffer id="BUF5" type="Node" headerDelay=2> <Routing id="RTG1" type="X+" headerDelay=1> . . . . . . . . . . . <Routing id="RTG5" type="Node" headerDelay=1> <Crossbar id="XBAR" inputs="5" outputs="5" type="CT"> <Input id=1 type="X+"> . . . . . . . . . . . <Output id=5 type="Node"> </Crossbar> <Connection id="C01" source="INJ" destination="BUF5"> . . . . . . . . . . . <Connection id="C20" source="RTG.1" destination="XBAR.1"> . . . . . . . . . . . </Router>
TOPAZ@NOCS12
7
Using TOPAZ (Printing) Standalone
Throughput/Latency curves
+ Orion + Gems (or Gem5)
0
0,2
0,4
0,6
0,8
1
0 0,2 0,4 0,6 0,8 1
Acce
pte
d L
oa
d
(fli
t/cyc/rt
er)
Applied Load (flits/cycle/router)
RR ABR VCR
0
100
200
300
400
500
0 0,2 0,4 0,6 0,8 1
To
tal
La
ten
cy
(cycle
s)
Applied Load (flits/cycle/router)
0
0,01
0,02
0,03
0,04
0,05
0,06
0,07
0,08
0,09
0,1
0,11
0 50 100 150 200 250 300 350 400 450 500
Tra
ffic
fra
cti
on
(%
)
Network Latency (cycles)
6 Turns
1 Turn
0 1 2 3 4 5 6 7
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0 1 2 3 4 5 6 7 Y Position
Lin
k U
tili
za
tio
n
X Position
0,6-0,7 0,5-0,6 0,4-0,5 0,3-0,4
0
0,2
0,4
0,6
0,8
1
0 100 200 300 400 500
Th
rou
gh
pu
t
Cycles simulated
Integer Sort
Latency Histogram Injection/Consumption/Link map
Throughput/Latency evolution
Link
Crossbar
Buffer
Arbiter
Power Breakdown
TOPAZ@NOCS12
• Simulator Description
• Out-of-the-Box
• Utilization Examples
• Support & Collaboration
8
Outline
TOPAZ@NOCS12
9
Out of the Box 1. Configuration Parameters
Router
Buffer Size
Buffer Delay
Packet Size
# Virtual Channels
#Physical networks
Message Types
Router Pipeline
Link Delay
Flow Control
Virtual Cut Through
Bubble Flow Control
Wormhole
Virtual Channel flow Control
Traffic
Random
Bit-Reversal
Perfect-Shuffle
Transpose Matrix
Tornado
Hot-Spot
Local
Trace-Based
Topology
Ring
Mesh (2D & 3D)
Torus (2D & 3D)
Midimew (2D)
Square Midimew (2D)
TOPAZ@NOCS12
10
Out of the Box 2. Available Routers
Router REF Year Level of Detail
Adaptive Bubble Router
Deterministic Bubble Router
[14]
[15]
2001
1998
Complex & simple
Complex & simple
Rotary Router
Bufferless Router
Bidirectional Router
[19]
[21]
[22]
2007
2010
2009
Complex
Simple
Simple
Buffered Crossbar [23] 1987 Complex
Deterministic with VC (Dally)
VCTM (Dally + MC Support)
[16][17]
[18]
2001
2008
Complex & simple
Complex & simple
Pipeline Optimized [24] 2008 Complex & simple
TOPAZ@NOCS12
11
Out of the Box 3. Integration with Full-System Simulation Tools
Simics
Opal (processor)
M5 (processor)
Ruby (Memory) Topaz( Network)
Wisconsin Multifacet Gems: http://research.cs.wisc.edu/gems/
Gem5 simulator system: http://gem5.org/Main_Page
TOPAZ@NOCS12
• Simulator Description
• Out-of-the-Box
• Utilization Examples
• Support & Collaboration
12
Outline
TOPAZ@NOCS12
13
Increasing Full-System simulation accuracy
Main System Parameters System Network
Cores 16 Cores, @4GHz, OOO, 4-wide issue, 64-entry IW, 16 outstanding Mem. Req
L2 16 MB, SNUCA, Token(B) coherence protocol, 6 msg. dependence chain
Topology 4x4 Mesh
L1 Independent I/D caches, 32KB, 4-way, 1 cycles
L2 Bank 1MB, 16-way, 5 cycles, pseudo LRU Links 1 cycle, 128bits wide
Memory 4GB, 320GB/s, 260 cycles OS Solaris 10
Broadcast Coherence Protocol (Execution Time)
TOPAZ@NOCS12
0
0,5
1
1,5
2
2,5
RU
BY
No
rma
lize
d
Ex
ecu
tio
n T
ime
RUBY TOPAZ_SIMPLE TOPAZ_COMPLEX
14
Increasing Full-System simulation accuracy
Execution Time
TOPAZ@NOCS12
0
0,5
1
1,5
2
2,5
RU
BY
No
rma
lize
d
Ex
ecu
tio
n T
ime
RUBY TOPAZ_SIMPLE TOPAZ_COMPLEX
0
0,2
0,4
0,6
0,8
1
1,2
No
rma
lize
d C
ycle
s
Sim
ua
lte
d/se
cco
nd
RUBY TOPAZ_SIMPLE TOPAZ_COMPLEX
Simulation speed (cycles/second)
More Accuracy => Slower simulations On average, Ruby is ≈ 2X faster
15
Improving Simulation Speed (I)
TOPAZ@NOCS12
0
0,5
1
1,5
2
2,5
RU
BY
No
rma
lize
d
Ex
ecu
tio
n T
ime
RUBY
TOPAZ_SIMPLE
TOPAZ_COMPLEX
(AI)TOPAZ_SIMPLE
0
0,2
0,4
0,6
0,8
1
1,2
No
rma
lize
d C
ycle
s
Sim
ua
lte
d/se
cco
nd
RUBY TOPAZ_SIMPLE TOPAZ_COMPLEX (AI)TOPAZ_SIMPLE (AI)TOPAZ_COMPLEX
Execution Time
Simulation speed (cycles/second)
Adaptive Interface R
UB
Y
TO
PA
Z
RU
BY
0
0,2
0,4
0,6
0,8
1
0 100 200 300 400 500
Th
rou
gh
pu
t
M Cycles simulated
Integer Sort
0
0,2
0,4
0,6
0,8
1
1,2
1,4
No
rma
lize
d C
ycle
s
Sim
ua
lte
d/se
cco
nd
RUBY TOPAZ_SIMPLE TOPAZ_COMPLEX (AI)TOPAZ_SIMPLE (AI)TOPAZ_COMPLEX (P)TOPAZ_COMPLEX (P)TOPAZ_SIMPLE
16
Improving Simulation Speed (II)
TOPAZ@NOCS12
0
0,5
1
1,5
2
2,5
RU
BY
No
rma
lize
d
Ex
ecu
tio
n T
ime
RUBY
TOPAZ_SIMPLE
TOPAZ_COMPLEX
(AI)TOPAZ_SIMPLE
Execution Time
Simulation speed (cycles/second)
2-Thread Simulation
T1 T2
17
Simulating thousand-node Networks
0E+00
1E+05
2E+05
3E+05
4E+05
5E+05
6E+05
1 3 5 7 9 11
Sim
ula
tio
n T
ime
(S
eco
nd
s)
Number of Cores
32K Rotuers
128K Routers
256K Routers
512K Routers
1M Routers
12-Core ( Xeon E5645) server with 54GBytes of main memory.
1.5GB
5.5GB 12GB 24GB
49GB
• 3D Torus, Bubble Router (simple), similar to IBM Blue Gene.
• Multithreaded implementation takes advantage of multicore server • Good speedup for 1 Million routers
TOPAZ@NOCS12
• Simulator Description
• Out-of-the-Box
• Utilization Examples
• Support & Collaboration
18
Outline
TOPAZ@NOCS12
19
Support & Collaboration
TOPAZ@NOCS12
code.google.com/p/tpzsimul
20
Support & Collaboration
TOPAZ@NOCS12
Thanks for your attention
Questions?
21
http://www.atc.unican.es/galerna/index.html
TOPAZ@NOCS12
GARNET
22
0
0,5
1
1,5
2
2,5
3
RU
BY
No
rma
lize
d E
xe
cu
tio
n T
ime
RUBY GARNET TOPAZ_SIMPLE TOPAZ_COMPLEX
MRR@HPCA09 23
T1 T2 T3 T4
24
Using TOPAZ
BUILDING RUNNING PRINTING
TOPAZ@NOCS12
25
Using TOPAZ (Building)
Router.sgm
Crossbar
-Router & Crossbar Ports - Buffer Size - Routing & Flow Control Policies
Network.sgm
TOPAZ@NOCS12
26
Using TOPAZ (Building)
Router.sgm -Router & Crossbar Ports - Buffer Size - Routing & Flow Control Policies
Network.sgm -Network Size - Network Topology - Link Delay
Simula.sgm -Traffic Pattern - Message Size - Simulation Cycles
TOPAZ@NOCS12
27
Using TOPAZ (Running)
No need to re-compile - Only need to add new configutations at sgml files. - Each configuration identified by a tag at Simula.sgm. Option –s at command line to choose a specific configuration.
Different Execution Modes: - Run your simulation for XXX Cycles. - Run your simulation until YYY Messages reach their destination.
Command Line Options: - Many sgml parameters can be overwritten through command line options. - Example: - Useful for scripting.
TOPAZ@NOCS12