Nilesh Choudhury Parallel Programming Lab Department of Computer Science


InterConnection Network Topologies to Minimize Graph Diameter: Low Diameter Regular Graphs and Physical Wire Length Constrained Networks

Nilesh Choudhury
Parallel Programming Lab
Department of Computer Science
University of Illinois, Urbana Champaign

Motivation

● Running a program on a large number of processors:
  – Large number of partitions
  – Large amount of communication
● Communication is the most common bottleneck for scaling a problem to a large number of machines
  – Point-to-point communication times increase
    ● Average hop count has increased
  – Collective communication times increase
    ● Need to send a larger number of messages
    ● Average hop count has increased
    ● Diameter has increased, so the max. communication time is larger
  – The application might be limited by this maximum time

Solutions

● Software communication optimizations
● Mapping your partitions (sequential entities) to minimize communication
● The interconnection network itself could be optimized
  – Minimize the diameter of the network to speed up global collective operations
    ● Broadcast
    ● Reduction
  – Amount of bandwidth used
    ● Minimize the total number of hops for all messages
    ● Minimize the average number of hops

Scope of this talk

● Interconnection network design optimizations
● Routing algorithms for these networks
● Tradeoff between
  – Average hop distance
  – Maximum hop distance (diameter)
  – Simplicity of routing algorithm
    ● Must be implemented in hardware (in as few clock cycles as possible)

Networks

● Direct Network
  – Each node is connected to a corresponding router
  – # routers = # nodes
  – Also called router-based networks
● Indirect Network
  – A number of nodes are connected to one switch
  – Fat-trees are an example

Network Parameters

● Degree of a node
  – Connectivity of the node (number of links attached to it)
● Bisection bandwidth
  – The minimum bidirectional capacity of a network between two equally sized partitions of the network
● Diameter
  – Length of the longest shortest path between any two nodes in the network
● Average internode distance
  – Average length of the shortest path between all pairs of nodes

Common Networks

● HyperCube (N nodes):
  – Degree = logN
  – Diameter = logN
  – Bisection BW = N/2
  – Avg. Internode Distance = (logN)/2
● Fat Tree (k-ary n-tree) (N = k^n nodes):
  – Degree = k
  – Diameter = 2n
  – Bisection BW = N
  – Avg. Internode Distance = n
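The hypercube figures above can be checked from a closed form rather than by building the graph. The sketch below assumes log means log base 2 and that the average is taken over all N destinations, including the node itself (that convention is what yields exactly (logN)/2). The function name `hypercube_parameters` is hypothetical.

```python
from math import comb

def hypercube_parameters(n):
    """Closed-form parameters of an n-dimensional hypercube (N = 2**n nodes)."""
    N = 2 ** n
    # From any fixed node, exactly C(n, i) nodes lie at Hamming distance i.
    total_distance = sum(i * comb(n, i) for i in range(n + 1))
    return {
        "nodes": N,
        "degree": n,                 # one link per address bit
        "diameter": n,               # all n bits differ in the worst case
        "bisection_links": N // 2,   # cutting one dimension severs N/2 links
        # Assumption: average over all N destinations, self included.
        "avg_distance": total_distance / N,
    }

if __name__ == "__main__":
    for dims in (3, 6, 11):
        print(hypercube_parameters(dims))
```

Averaging over the N-1 other nodes instead would give a slightly larger value, n*N/(2*(N-1)).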

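Moore Graphs

● Moore bound on the number of nodes N(d,k) in a graph of degree d and diameter k:
  – N(d,k) <= (d(d-1)^k - 2) / (d-2)
● A Moore graph meets this bound with equality
● Very few graphs have been found that satisfy the Moore bound, listed here as N(nodes, degree, diameter)
● Petersen graph N(10, 3, 2)
● Hoffman-Singleton graph N(50, 7, 2)
● N(3250, 57, 2): possible but as yet undiscovered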
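As a quick arithmetic check, the bound can be evaluated for the three (degree, diameter) pairs listed. This is a minimal sketch; the function names are hypothetical, and the layer-by-layer version is included only to show where the closed form comes from (one root, d neighbours, then d(d-1)^i nodes at each further distance).

```python
def moore_bound(d, k):
    """Upper bound on the node count of a graph with degree d and diameter k."""
    if d == 2:
        return 2 * k + 1  # the closed form divides by d - 2, so handle d = 2 apart
    return (d * (d - 1) ** k - 2) // (d - 2)

def moore_bound_by_layers(d, k):
    """Same bound, counted as 1 root plus d*(d-1)**i nodes at distance i + 1."""
    return 1 + sum(d * (d - 1) ** i for i in range(k))

if __name__ == "__main__":
    for d, k in [(3, 2), (7, 2), (57, 2)]:
        print(d, k, moore_bound(d, k), moore_bound_by_layers(d, k))
```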

Low Diameter Regular Graph

● Each node has the same degree, logN, as in a hypercube
● For N = 8 (degree 3), the diameter of the LDR graph is 2; that of the hypercube is 3
● The average internode distance for the LDR graph is 1.375, while that for the hypercube is 1.5
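These numbers can be reproduced on 8 nodes with degree 3. The sketch below compares the 3-dimensional hypercube with the Wagner graph (an 8-cycle plus antipodal chords), which is used here only as a stand-in degree-3, diameter-2 graph; it is an assumption, not necessarily the LDR graph shown in the talk. Averages are taken over all N*N ordered pairs, self-pairs included, which is the convention that matches 1.375 and 1.5.

```python
from collections import deque

def bfs_distances(adj, src):
    """Hop distance from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter_and_average(adj):
    """Return (diameter, average distance over all N*N ordered pairs)."""
    n = len(adj)
    longest, total = 0, 0
    for src in adj:
        dist = bfs_distances(adj, src)
        longest = max(longest, max(dist.values()))
        total += sum(dist.values())
    return longest, total / (n * n)

# 3-dimensional hypercube: neighbours differ in exactly one address bit.
cube = {u: [u ^ (1 << b) for b in range(3)] for u in range(8)}
# Wagner graph: ring neighbours plus the node directly opposite (stand-in graph).
wagner = {u: [(u + 1) % 8, (u - 1) % 8, (u + 4) % 8] for u in range(8)}

if __name__ == "__main__":
    print("hypercube:", diameter_and_average(cube))
    print("wagner:   ", diameter_and_average(wagner))
```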

Diameter: LDR graph Vs Hypercube

How to generate an LDR graph?

● An LDR graph is built starting from a spanning tree
● No. of nodes = N
● Degree = k
● Start with the root and connect it to k children
● Connect each child to k-1 children of its own (each already has a parent)
● Continue until all N nodes have been used
● Leaves still have unconnected edges, which can be used to decrease the diameter, etc.

How to generate an LDR graph? contd...

● To choose the incomplete connections for leaves:
  – Pick a vertex 'A' with the maximum number of incomplete connections
  – Pick another vertex 'B' randomly from the remaining vertices
  – If A is already connected to B (A->B), pick another vertex
  – Continue the previous step until we find a legitimate vertex or no remaining vertex satisfies the condition
  – If we find a legitimate vertex, add an edge A->B
  – Else, disconnect some edge X-Y and connect A-X and B-Y
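The construction steps can be sketched as follows. This is one possible reading, not the talk's actual implementation: the name `build_ldr`, the `seed` parameter, and the way the edge X-Y is chosen for the rewiring step are assumptions. When A has at least two free ports and no legitimate partner, the sketch treats A as its own B.

```python
import random

def build_ldr(n, k, seed=0):
    """Return adjacency sets of an n-node graph with degree at most k (k >= 2)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}

    def free(v):
        return k - len(adj[v])

    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)

    # Phase 1: breadth-first spanning tree. Attaching each new node to the
    # earliest node with a free port gives the root k children and every
    # other internal node k - 1 children.
    parent = 0
    for child in range(1, n):
        while free(parent) == 0:
            parent += 1
        link(parent, child)

    # Phase 2: use up the ports that are still free (mostly on the leaves).
    while True:
        open_nodes = [v for v in adj if free(v) > 0]
        if not open_nodes:
            break
        a = max(open_nodes, key=free)  # vertex with most incomplete connections
        partners = [b for b in open_nodes if b != a and b not in adj[a]]
        if partners:
            link(a, rng.choice(partners))  # legitimate vertex found: add A-B
            continue
        # No legitimate partner: rewire an existing edge X-Y into A-X and B-Y.
        if free(a) >= 2:
            b = a  # assumption: A may supply both free ports
        else:
            others = [v for v in open_nodes if v != a]
            if not others:
                break  # a single odd port is left over
            b = others[0]  # already adjacent to A, otherwise it were a partner
        edges = [(x, y) for x in adj for y in adj[x]
                 if x != a and x not in adj[a] and y != b and y not in adj[b]]
        if not edges:
            break
        x, y = rng.choice(edges)
        adj[x].discard(y)
        adj[y].discard(x)
        link(a, x)
        link(b, y)
    return adj
```

Each pass through the second loop lowers the number of free ports by two, so the loop terminates. The rewiring keeps the graph connected because X and Y remain joined through A (and B). Nothing in this sketch searches for the lowest diameter; different seeds simply give different candidate graphs to compare.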

Building an LDR graph

Routing on an LDR graph

● Hamming-distance routing for the hypercube is simple (XOR op)
● Deterministic for LDR
  – Shortest path routing
    ● Table driven
● Need adaptivity in the presence of network contention!
● 64 / 2048 node LDR and hypercube using deterministic and adaptive routing
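The contrast between the two deterministic schemes can be illustrated with a small sketch. The hypercube router needs only an XOR of the current and destination addresses, while an irregular graph needs a precomputed next-hop table per destination. The function names and the 8-node example graph (a ring with antipodal chords) are hypothetical, and the table is built in software here by a breadth-first search from the destination.

```python
from collections import deque

def hypercube_next_hop(current, dest):
    """Flip the lowest address bit in which current and dest differ."""
    diff = current ^ dest
    if diff == 0:
        return current
    return current ^ (diff & -diff)  # diff & -diff isolates the lowest set bit

def build_routing_table(adj, dest):
    """next_hop[u] is a neighbour of u on a shortest path to dest."""
    next_hop = {dest: dest}
    queue = deque([dest])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in next_hop:
                next_hop[v] = u  # v first reached from u, so u is one hop closer
                queue.append(v)
    return next_hop

def route(start, dest, step):
    """Follow step(node) until dest is reached and return the path."""
    path = [start]
    while path[-1] != dest:
        path.append(step(path[-1]))
    return path

# Hypothetical irregular example: 8-node ring with antipodal chords.
example = {u: [(u + 1) % 8, (u - 1) % 8, (u + 4) % 8] for u in range(8)}

if __name__ == "__main__":
    print(route(0, 5, lambda u: hypercube_next_hop(u, 5)))
    table = build_routing_table(example, 5)
    print(route(0, 5, lambda u: table[u]))
```

Both routers are deterministic: each (node, destination) pair always yields the same next hop, which is why contention calls for an adaptive variant.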

How difficult is it to place a hypercube / LDR in physical space?

● A hypercube with N nodes
  – logN dimensions
● Real world is 3D
● Difficult to place nodes in 'n' dimensions
● Really big machines (Bluegene, RedStorm)
  – Use 3D torus or similar
  – Easier to place them in physical space
  – Large wire lengths mean large delays, not to mention cost
● We believe that for large machines, the topology should consider physical placement

Framing the problem

● The problem
  – Given a network with connectivity 'k'
  – The maximum allowable wire length is 'd' (hops)
  – Design a network topology within these constraints with the lowest diameter and average all-pair internode distance
  – It should also have a simple routing algorithm

Connected Graph

● In 2D:
  – X+, X-, Y+, Y-
  – These 4 connections provide a connected graph
● In 3D:
  – X+, X-, Y+, Y-, Z+, Z-
  – These 6 connections provide a connected graph

The proposed topology

● The remaining connections are used to decrease the diameter and the average all-pair internode distance
● i = 1;
● Add diagonal connections of length 'd'/i along all four directions
● i *= 2;
● Repeat the above two steps while the total # of connections is less than 'k'
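The loop above can be sketched for the 2D case as a function that lists the link offsets of a single node. Several points are assumptions of this sketch: 'd' is read as a wire length in units of the grid spacing, so a diagonal of length d/i spans d/(i*sqrt(2)) grid steps per axis; that span is rounded down to a whole number of steps; and the grid is a mesh without wraparound. The names `plcn_offsets_2d` and `neighbours` are hypothetical.

```python
import math

def plcn_offsets_2d(k, d):
    """Link offsets (dx, dy) of one node for connectivity k and wire length d."""
    # Axis links X+, X-, Y+, Y- keep the grid connected.
    offsets = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    i = 1
    # Add four diagonals per round while they still fit within k links.
    while len(offsets) + 4 <= k:
        # Assumption: round the per-axis span down to whole grid steps.
        s = int(d / (i * math.sqrt(2)))
        if s < 1:
            break  # the wire is too short to reach any diagonal neighbour
        offsets += [(s, s), (s, -s), (-s, s), (-s, -s)]
        i *= 2
    return offsets

def neighbours(x, y, side, offsets):
    """Neighbours of (x, y) on a side*side mesh without wraparound (assumption)."""
    return [(x + dx, y + dy) for dx, dy in offsets
            if 0 <= x + dx < side and 0 <= y + dy < side]

if __name__ == "__main__":
    print(plcn_offsets_2d(8, 6))
    print(plcn_offsets_2d(12, 6))
    print(neighbours(0, 0, 10, plcn_offsets_2d(8, 6)))
```

Under this reading, d=1 leaves only the four axis links, because no diagonal neighbour lies within a wire length of 1.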

Higher dimensions connectivity of a single node

● The picture shows 2D connectivity of a single node

● d/(sqrt(2)); d/(2sqrt(2)); d/(4sqrt(2)); ....

● Similarly, for 3D, connectivity of a single node

● d/(sqrt(3)); d/(2sqrt(3)); d/(4sqrt(3)); ....

● We call these networks “PLCN” (Physical wire Length Constrained Networks)

Intuitive Proof

● Intuitively, anything along the diameter would be easy to reach
● sqrt(2)*max(x,y) hops is the max number of hops
● Once we reach a region by the longest hops, we explore the smaller region recursively

Optimal value of 'd'

● For different values of i, and hence k, we choose the value of 'd' that minimizes 'D'
● 1D
  – For i=1; D = L/d + d/2; d = sqrt(L)
● 2D
  – For i=1; D = sqrt(2)*L/d + d/sqrt(2); d = sqrt(2L); L*L grid
● 3D
  – For i=1; D = sqrt(3)*L/d + d/sqrt(3); d = sqrt(3L); L*L*L grid
● For i=2, the equations become considerably more complicated
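The optima quoted for 2D and 3D follow from setting the derivative of D with respect to d to zero. The short derivation below writes both expressions in one general form, with n as the number of dimensions; the symbols n and D_min are introduced here for illustration only.

```latex
\begin{align*}
D(d) &= \frac{\sqrt{n}\,L}{d} + \frac{d}{\sqrt{n}} \\
\frac{\mathrm{d}D}{\mathrm{d}d} &= -\frac{\sqrt{n}\,L}{d^{2}} + \frac{1}{\sqrt{n}} = 0
  \quad\Longrightarrow\quad d^{2} = nL
  \quad\Longrightarrow\quad d = \sqrt{nL} \\
D_{\min} &= D\!\left(\sqrt{nL}\right) = \sqrt{L} + \sqrt{L} = 2\sqrt{L}
\end{align*}
```

With n = 2 and n = 3 this reproduces d = sqrt(2L) and d = sqrt(3L). The same form with n = 1 gives D = L/d + d and d = sqrt(L). If the 1D expression is instead taken exactly as written, with d/2 as the second term, the same calculation gives d = sqrt(2L).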

Average shortest all-pair hop distance for k=8; d=1 and d=4

Average shortest all-pair hop distance for k=8; d=6 and d=9

Routing

● Simple routing algorithm for different dimensions
  – Use the longest available hop towards the destination
  – Within this smaller region, use the above step recursively
  – The final step is to reach the destination using simple hops along the lowest-level connections along the axis
● With some care, this could easily be converted to adaptive minimal routing (if more than one shortest path is available)
● Non-minimal routing
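The "longest available hop first" rule can be sketched in one dimension. This is a simplification: the talk's networks use diagonal links in 2D and 3D, while the sketch assumes nodes on a line, a hypothetical set of long-hop lengths, and unit hops always being available. The function name `greedy_route_1d` is also hypothetical.

```python
def greedy_route_1d(src, dst, hop_lengths):
    """Positions visited from src to dst, always taking the longest hop
    that does not overshoot the destination."""
    # Unit hops along the axis are always available (assumption).
    hops = sorted(set(hop_lengths) | {1}, reverse=True)
    path = [src]
    pos = src
    while pos != dst:
        remaining = abs(dst - pos)
        step = next(h for h in hops if h <= remaining)
        pos += step if dst > pos else -step
        path.append(pos)
    return path

if __name__ == "__main__":
    # Long hops of length 8, 4 and 2, i.e. d, d/2, d/4 with d = 8.
    print(greedy_route_1d(0, 23, [8, 4, 2]))
```

The decision at each node needs only a comparison of the remaining distance against a few fixed hop lengths, which fits the goal of a router that works in few clock cycles. The rule as sketched never overshoots, so it is not guaranteed to find the shortest route when overshooting and stepping back would be quicker.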

Conclusions

● Communication topology is important
● Message latency should be minimized for an application to scale
● Non-trivial networks like LDR and PLCN are not so difficult to implement
● They can drastically reduce the average all-pair shortest distance and the diameter