Upload
nyssa-garrison
View
48
Download
0
Embed Size (px)
DESCRIPTION
ネットワークコンピューティング論 Ⅱ. 平成 21 年度 後期 火曜 第2時限(10:40-12:10) 吉永 努(UEC ) [email protected]. 内 容. 分散・並列処理計算機における相互結合ネットワークとその上でのメッセージ・ルーティング技法などについて学ぶ 資料 http://comp.is.uec.ac.jp/yoshinagalab/yoshinaga/dp2.html http://www.gap.upv.es/slides/AppendixE.html TA: Axida君 [email protected]. - PowerPoint PPT Presentation
Citation preview
NC論2 2
内 容• 分散・並列処理計算機における相互結合ネットワー
クとその上でのメッセージ・ルーティング技法などについて学ぶ
• 資料 http://comp.is.uec.ac.jp/yoshinagalab/yoshinaga/dp2.html• http://www.gap.upv.es/slides/AppendixE.html
• TA: Axida君 [email protected]
3
References• T. M. Pinkston and J. Duato: Interconnection Networks,
Appendix E in Computer Architecture: A Quantitative Approach, 4th Edition, Morgan Kaufmann publishers (2006).
• J. Duato, S. Yalamanchili, L. Ni: Interconnection Networks- an Engineering Approach-,
IEEE CS press (1997)• 同第 2 版 , Morgan Kaufmann publishers (2003)• 富田眞治: 並列コンピュータ、昭晃堂( 1996 )• W.D. Dally, B. Towles: Principles and Practices of
Interconnection Networks, Morgan Kaufmann publishers (2003)
NC論2
NC論2 4
What is an interconnection Network?
• It is a programmable system that transports data between terminals, such as processors and memory.
• It is programmable in the sense that it makes different connections at different points.
• It is a system because it is composed of many components: buffers, channels, switches, and controls that works together to deliver data.
NC論2 5
Interconnection Network (1/2)
P
M
Interconnection Network
Multicomputer
P
M
P
M
NC論2 6
Interconnection Network (2/2)
P
M
Interconnection Network
UMA type shared memory multiprocessor
It is also called dance-hall architecture.
P
M
P
M
NC論2 7
Trend
• Its performance is increasing with processor performance at a rate of 50% per year.
• Communication is a limiting factor in the performance of many modern systems.
• Buses have been unable to keep up with the bandwidth demand, and point-to-point interconnection networks are rapidly taking over.
NC論2 8
Computer Classifications (%) 2008/06 2007/06 2006/06
MPP 19.6 21.4 19.6
Cluster 80.0 74.6 72.8
Others 0.4 4.0 7.6
http://www.top500.org/
share of the TOP500 June, 2008 – June, 2006
NC論2 9
Examples of MPPsprocessor Topology #proc.
NECEarth-
Simulator
Original-
vector
1GHz
(8GFlops)
640 x 640 crossbar switch
640-node x 8 -way
= 5,120
IBM
BlueGene/L
PPC440 Dual core
0.7GHz
(2.8GFlops)
3D torus
+ Tree
32 x 32 x 32 x 2-way = 65,536
NC論2 10
Examples of clusters
processor Interconnect #cores
T2K
Hitachi Cluster
AMD Opteron QC 2.3 GHz(9.2GFlops) Myrinet
10G
12,288
IBM
Roadrunner
PowerXCell 8i
3.2 GHz
(12.8GFlops) Infiniband 122,400
NC論2 11
Networks for Supercomputers
• Earth simulator: 640 x 640 crossbar switch ( 640 x 8 5K processors )≒
• ASCI Q: Quadrics network (fat-tree)( 1024 SMPs x 4 x 3 12K processors )≒
• Cray X1: modified 2D torus (~ 4K processors )
• Cray Red Storm: 3D mesh (27 x 16 x 24 10K processors)≒
• IBM Blue Gene/L: 3D torus (64 x 32 x 32 = 64K nodes)
NC論2 12
Architecture vs. software
memory programming
UMA
(SMP) shared OpenMP
NUMA
(MPP)
distributed(not shared)
MPI
(message Passing Interface)
NC論2 13
Network Design (1/3)
• Performance: latency and throughput (bandwidth)
• Scalability: #processors vs. network, memory, I/O bandwidth
• Incremental expandability: small to maximum size
• Partitionability: netwrok may be partitioned for several users
NC論2 14
Network Design (2/3)
• Simplicity: simple design, higher clock frequency, easy to
use• Distance span: smaller system is preferred
for noise and cable delay, etc.• Physical constraints: packaging (pin count),
wiring(wire length), and maintenance (power consumption) should meet physical limitation.
NC論2 15
Network Design (3/3)
• Reliability: fault tolerant, reliable communication, hot swap
• Expected workload: robust performance over a wade range of traffic
conditions.
• Cost: trade-offs between cost and performance.
NC論2 16
Classifiction of Interconnection Networks
• Shared-Medium Networks– Local area networks (ethernet, token ring)– Backplane bus (e.g. SUN Gigaplane)
• Direct Networks (router-based)– mesh, torus, hypercube, tree, … etc.
• Indirect Networks (switch-based)
• Hybrid Networks
NC論2 17
Shared-Medium Networks (LAN)
• Arbitration that determines the mastership of the shared-medium network to resolve network access is needed.
• The most well-known protocol is carrier-sense multiple access with collision detection (CSMA/CD).
• Token bus and token ring pass a token from the owner which has the right to access the bus/ring and resolve nondeterministic waiting time.
NC論2 18
Shared-Medium Networks (Backplane bus)
• It is commonly used to interconnect processor(s) and memory modules to provide SMP (Symmetrical Memory Processor) architecture.
• It is realized by printed lines on a circuit board by discrete wiring.
• Gigaplane in SUN Enterprise x000 server(1996): 2.6GB/s, 256 bits data, 42 bits address, 83.8MHz clock.
NC論2 19
Direct (static) Networks
• Consists of a set of nodes.
• Each node is directly connected to a subset of other nodes in the network.
• Examples:– 2D mesh (intel Paragon), 3D mesh (MIT J-Mahine)
– 2D torus (Fujitsu AP3000), 3D torus (Cray T3D, T3E)
– Hypercube (CM1, CM2, nCUBE)
NC論2 20
Mesh topology
2D 3D
node
NC論2 21
Torus topology
2D
(4-ary 2-cube)3D
(3-ary 3-cube)
NC論2 22
Hypercube (binary n-cube)
4D
(2-ary 4-cube)
NC論2 23
tree
Binary tree fat tree x tree
NC論2 24
Hierarchical topology (1/2)
Pyramid
(Hierarchical 2D mesh)Hierarchical ring
NC論2 25
Hierarchical topology (2/2)
Cube-connected cycles RDT
(Recursive Diagonal Torus)
NC論2 26
Hypermesh (spaninng-bus hypercube)
Single or multiple buses
NC論2 27
Base-m n-cube (hyper-crossbar)
Base-8 3-cube (Toshiba Prodigy)
000 007
070 077
707
777770
8x8 crossbar
NC論2 28
Diameter and degrees (1/2)2D mesh
2D torus
3D torus
binary n-cube
#node N N N N = 2n
Diameter 2√ N √N √N log N
degree 4 4 6 log N
3
NC論2 29
Diameter and degrees (2/2)Base-m n-cube
CCC Binary tree
ring
#node N = mn
N = n2n
N N
Diameter logm N 3n/2 2log N N/2
degree logm N 3 3 2
3