A New Scalable and Cost-Effective Congestion Management Strategy for Lossless Multistage
Interconnection Networks
J. Duato (1), I. Johnson (2), J. Flich (1), F. Naven (2), P.J. García (3), T. Nachiondo (1)
(1) Technical University of Valencia, Valencia, Spain
(2) Xyratex, Havant, UK
(3) University of Castilla-La Mancha, Albacete, Spain
The Eleventh International Symposium on High-Performance Computer Architecture, San Francisco, 2005
Title: A New Scalable and Cost-Effective Congestion Management Strategy for Lossless Multistage Int. Networks
Conference: The 11th International Symposium on High-Performance Computer Architecture
Outline
• Introduction
• Congestion and HOL blocking
• Why now?
• Why previous proposals are inadequate
• Proposal: RECN
• Performance evaluation
• Conclusions
Interconnection Networks
MPPs
• Earth Simulator (640 vector CPUs)
• ASCI Q (12,288 EV68 CPUs, Quadrics network)
• BlueGene/L (65,536 nodes with 2 processors each, 360 TFlops)
PC Clusters
• Storage Area Networks (SANs)
– Google (6,000 CPUs and 12,000 disks)
• Thunder (1,024 nodes with 4 Itaniums and 8 GB each)
• Many data centers all around the world
[Images: ASCI Q, Earth Simulator, Thunder]
Network Throughput beyond Saturation
Congestion and HOL Blocking
Network contention
Congestion and HOL Blocking
Persistent network contention
Congestion and HOL Blocking
Persistent network contention
Flow control
Congestion and HOL Blocking
Persistent network contention
Congestion propagates
Congestion and HOL Blocking
• Congestion introduces HOL blocking, and this may degrade network performance dramatically
[Figure: example of HOL blocking, with link utilization labels of 33% and 100%]
Traditional Solution
Overdimensioning the network
[Figure: latency vs. injected traffic, showing the working zone and the congestion zone]
Network bandwidth is much higher than the bandwidth requested by end nodes
Low link utilization
Why Congestion Management Now?
New problems arising:
• System cost: Recent interconnects (Myrinet, InfiniBand, ASI) are expensive compared to processors
• Power consumption: As network size increases, so do power consumption and heat dissipation
Possible solutions:
• Frequency/voltage scaling techniques: Not very efficient, and do not solve the system cost problem
• Reducing the number of network components: Possible by using a suitable topology, but link utilization increases
Systems will work closer to the network saturation zone; thus, a congestion management technique will be mandatory
Why Are Current Techniques Not Suitable?
Proactive Congestion Management (congestion prevention)
• Path setup before data transmission
• Used in ATM, computer networks (QoS)
• High overhead, high latencies (not suitable for HPC)
The real problem is not the congestion, but its negative effects (HOL blocking)
Reactive Congestion Management (congestion recovery)
• Injection limitation techniques using closed-loop feedback
• Do not scale with network size and link bandwidth
– Notification delay (proportional to distance)
– Link capacity (proportional to clock frequency)
– May produce network instabilities
Why Are Current Techniques Not Suitable?
HOL blocking elimination/reduction
• DAMQs and Virtual Channels
– Not efficient for multihop networks
• VOQ (Virtual Output Queueing)
– VOQ at switch level scales but does not eliminate HOL blocking
– VOQ at network level: A separate queue at every input port for every destination
– Number of required resources scales at least quadratically with network size!
• Credit Flow Controlled ATM
– References congestion to network output only
– Consumes a large number of buffers: A separate queue at every output port for every destination
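The quadratic growth of network-level VOQ can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, and it assumes (the slide does not say so) that the network has at least one input port per destination, which gives a lower bound of N × N queues for N destinations.

```python
def voqnet_queues_per_port(num_destinations: int) -> int:
    """Network-level VOQ: one queue per destination at every input port."""
    return num_destinations


def voqnet_total_queues_lower_bound(num_destinations: int) -> int:
    """Lower bound on the total queue count.

    Assumption of this sketch: the network has at least as many
    input ports as destinations, so the total is at least N * N.
    """
    return num_destinations * voqnet_queues_per_port(num_destinations)


if __name__ == "__main__":
    # Network sizes taken from the simulation model slide.
    for n in (64, 256, 512):
        print(n, voqnet_queues_per_port(n), voqnet_total_queues_lower_bound(n))
```

Going from 64 to 512 destinations multiplies the per-port queue count by 8 and the lower bound on the total by 64.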
Proposal
Initial idea:
• Exploit spatial and temporal locality in packet destinations
• Manage the set of queues as a cache
– No equivalent to main memory!!! (where to replace?)
– Not enough locality!!! (reduction in queue silicon area by a factor of 4)
Observation:
• Non-congested flows do not introduce significant HOL blocking
RECN: Regional Explicit Congestion Notification
• Non-congested flows are mapped to the same queue
• Effective reduction in number of queues and no replacement needed
• Congested flows are detected and mapped to set-aside queues (SAQs)
RECN is a scalable congestion management technique because:
• It reacts locally (and thus, it is not affected by propagation delays)
• A very small number of queues (SAQs) for a wide range of network sizes
RECN enables:
• Effective reduction of network cost by working closer to the saturation point
• More efficient use of voltage/frequency scaling techniques
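The queue-mapping idea can be sketched as follows. This is a simplified illustration, not the switch design from the talk: the class and method names are hypothetical, and congested flows are identified here by a plain destination key, whereas the slides indicate that RECN identifies congestion points through turnpools. The default of 8 SAQs is the maximum used in the simulation model.

```python
from collections import deque


class PortQueues:
    """One port's queues: a shared cold queue plus a few set-aside queues (SAQs)."""

    def __init__(self, max_saqs: int = 8):
        self.max_saqs = max_saqs
        self.cold = deque()   # all non-congested flows share this queue
        self.saqs = {}        # congested destination -> its own deque

    def notify_congestion(self, destination) -> bool:
        """Allocate a SAQ for a congested destination; False if none is free."""
        if destination in self.saqs:
            return True
        if len(self.saqs) >= self.max_saqs:
            return False
        self.saqs[destination] = deque()
        return True

    def enqueue(self, packet, destination) -> str:
        """Store the packet and report which kind of queue received it."""
        if destination in self.saqs:
            self.saqs[destination].append(packet)
            return "saq"
        self.cold.append(packet)
        return "cold"
```

Because congested packets are moved out of the cold queue, packets of non-congested flows no longer wait behind them, which is the HOL-blocking effect the proposal targets.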
RECN
Based on the PCI Express Advanced Switching Interconnect (ASI) specification
• Routing (turnpools)
Relevant switch architectural features
• Congestion detection
• Congestion notification and queue allocation
• Queue deallocation
• Packet processing
• Flow control
Turnpools
[Figure: AS packet header with turn pool, turn pointer, and direction bit (D) fields, plus a turn example on switches with ports numbered 0 to 7]
31 bits = 2^31 destinations
The turnpool makes it possible to know whether a packet will pass through a given port in the network
Mask bits required
Switch Model
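[Figure: switch model. Each input link controller (LC) feeds a RAM in with dynamic queue management (VCs); a crossbar (XBAR, S = 1.5) under an arbiter connects the RAM in memories to the RAM out memories, also with dynamic queue management (VCs), which feed the output link controllers (LC)]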
RAM in and RAM out
[Figure: organization of RAM in and RAM out]
• RAM: one cold queue plus SAQ 0 to SAQ 3
• CAM: one line per SAQ (SAQ 0 to SAQ 3), each holding:
– Valid bit (v)
– Turn pool
– Mask bits
– Congested point blocked bit (b)
– nextSAQ
– Xoff (Xon/Xoff flow control)
– Leave bit (lv), only at ingress
• Tokens (one per input port)
• Root
• Only at egress: avoids successive internal notifications
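A CAM lookup of this kind can be sketched as a masked comparison between the packet's turnpool and each valid CAM line. The code below is a minimal illustration under stated assumptions: the names `CamLine` and `find_saq` are hypothetical, only three of the fields in the figure are modeled, turnpools are treated as plain integers, and preferring the line with the most mask bits when several match is a choice made for this sketch rather than something stated on the slide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CamLine:
    valid: bool      # "v" in the figure
    turnpool: int    # path bits that lead to the congested point
    mask: int        # 1-bits mark which turnpool bits are significant


def find_saq(cam: List[CamLine], packet_turnpool: int) -> Optional[int]:
    """Return the index of the matching SAQ, or None for the cold queue.

    If several lines match, the one with the most mask bits wins,
    i.e. the most specific path (an assumption of this sketch).
    """
    best, best_bits = None, -1
    for index, line in enumerate(cam):
        if not line.valid:
            continue
        if (packet_turnpool & line.mask) == (line.turnpool & line.mask):
            bits = bin(line.mask).count("1")
            if bits > best_bits:
                best, best_bits = index, bits
    return best
```

A packet whose turnpool matches no valid line stays in the cold queue, so only flows crossing a known congested point are isolated.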
How it Works
A congestion point forms
How it Works
Cold queue fills over a threshold
How it Works
Internal notification to each input port sending packets to the output port
How it Works
Input ports allocate a new SAQ for packets addressed to the congested output port
How it Works
Notification sent when the SAQ fills over a threshold
How it Works
A new SAQ is allocated for the congested port at each output port
How it Works
Internal notification when the SAQ fills over a threshold
How it Works
The input port allocates a new SAQ
How it Works
In the end, the congestion tree is built and mapped entirely onto SAQs
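The hop-by-hop sequence on these slides can be summarized in a small static sketch. It is a strong simplification and not the actual protocol: the function name is hypothetical, queue occupancies are given as a fixed snapshot instead of evolving over time, and a single threshold is assumed for the cold queue and for every SAQ.

```python
from typing import List


def propagate_notifications(occupancy: List[int], threshold: int) -> List[int]:
    """Return the upstream hops that allocate a SAQ.

    occupancy[0] is the cold queue at the congested (root) output port;
    occupancy[i] for i >= 1 is the queue holding the congested flow at
    the i-th port upstream of the root.
    """
    allocated = []
    for hop in range(len(occupancy) - 1):
        if occupancy[hop] <= threshold:
            break  # this queue never crosses the threshold: stop notifying
        allocated.append(hop + 1)  # the next port upstream allocates a SAQ
    return allocated
```

With `occupancy = [10, 9, 3, 8]` and `threshold = 5`, the root notifies hop 1, hop 1 notifies hop 2, and the notifications stop there because the queue at hop 2 stays below the threshold. The congestion tree then spans hops 1 and 2 only.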
Performance Evaluation
• Evaluation based on simulation results
• Two evaluation studies:
  • Network performance when using:
    – RECN
    – VOQ at network level (VOQnet)
    – VOQ at switch level (VOQsw)
    – 4 queues at ingress and egress ports (4Q)
    – 1 queue at ingress and egress ports (1Q)
  • RECN scalability
Simulation Model
• Network configurations evaluated:
  • 64 hosts connected by a 64x64 BMIN
  • 256 hosts connected by a 256x256 BMIN
  • 512 hosts connected by a 512x512 BMIN
• Simulation assumptions:
  • BMINs based on shuffle-exchange connection scheme
  • Deterministic routing
  • 128 KB memories at ingress/egress ports
  • Multiplexed crossbar (BW = 12 Gbps)
  • Serial full-duplex pipelined links (BW = 8 Gbps)
  • 64-byte and 512-byte packets
  • Credit-based and Xon-Xoff (for SAQs) flow control
  • Maximum of 8 SAQs at ingress/egress ports (RECN)
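The parameters above can be collected in a small configuration object, which also makes two derived numbers explicit. This is an illustrative sketch with hypothetical names; the serialization time ignores any link encoding overhead, which the slide does not specify. The crossbar-to-link bandwidth ratio of 12 / 8 = 1.5 presumably corresponds to the "S = 1.5" label in the switch model figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationConfig:
    hosts: int
    port_memory_kb: int = 128     # memory at each ingress/egress port
    crossbar_gbps: float = 12.0   # multiplexed crossbar bandwidth
    link_gbps: float = 8.0        # serial full-duplex link bandwidth
    max_saqs: int = 8             # RECN only

    @property
    def crossbar_speedup(self) -> float:
        """Ratio of crossbar bandwidth to link bandwidth."""
        return self.crossbar_gbps / self.link_gbps

    def packet_time_ns(self, packet_bytes: int) -> float:
        """Serialization time of one packet on a link, in nanoseconds.

        1 Gbps is one bit per nanosecond; encoding overhead is ignored.
        """
        return packet_bytes * 8 / self.link_gbps


# The three network sizes evaluated in the talk.
CONFIGS = [SimulationConfig(hosts=n) for n in (64, 256, 512)]
```

Under these assumptions a 64-byte packet occupies a link for 64 ns and a 512-byte packet for 512 ns.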
Traffic Load
• Synthetic traffic:

Case            # Srcs   Dst.       Injection rate (%)   Traffic start time   Traffic end time
Corner Case 1   75%      Random     50%                  0                    Sim. end
                25%      Hot-spot   100%                 800 μs               970 μs
Corner Case 2   75%      Random     100%                 0                    Sim. end
                25%      Hot-spot   100%                 800 μs               970 μs

• Traces:
  • From I/O activity at cello system disk interface
  • Different compression factors applied
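The two synthetic corner cases can be written down as data, which makes the offered load at any instant easy to compute. The sketch below uses hypothetical names and makes two assumptions not stated on the slide: "# Srcs" is read as the fraction of sources generating each flow, and the hot-spot interval is treated as half-open (active from 800 μs up to, but not including, 970 μs).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrafficFlow:
    source_fraction: float    # share of sources generating this flow
    destination: str          # "random" or "hot-spot"
    injection_rate: float     # fraction of link bandwidth
    start_us: float
    end_us: Optional[float]   # None means "until the simulation ends"

    def active_at(self, time_us: float) -> bool:
        return time_us >= self.start_us and (
            self.end_us is None or time_us < self.end_us
        )


CORNER_CASES = {
    1: [TrafficFlow(0.75, "random", 0.50, 0.0, None),
        TrafficFlow(0.25, "hot-spot", 1.00, 800.0, 970.0)],
    2: [TrafficFlow(0.75, "random", 1.00, 0.0, None),
        TrafficFlow(0.25, "hot-spot", 1.00, 800.0, 970.0)],
}


def offered_load(case: int, time_us: float) -> float:
    """Average injection rate over all sources at a given time."""
    return sum(f.source_fraction * f.injection_rate
               for f in CORNER_CASES[case] if f.active_at(time_us))
```

Under this reading, Corner Case 1 offers an average load of 0.375 outside the hot-spot window and 0.625 inside it, while Corner Case 2 reaches 1.0 inside the window.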
Performance Comparison
• Network throughput – Corner Case 2, 64x64 BMIN
Performance Comparison
• Network throughput – Traces, 64x64 BMIN
[Plots: compression factor set to 20; compression factor set to 40]
Scalability Analysis
• SAQ utilization – Corner Case 1, 64x64 BMIN
[Plots: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]
Scalability Analysis
• SAQ utilization – Corner Case 2, 64x64 BMIN
[Plots: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]
Scalability Analysis
• SAQ utilization – Traces, Comp. Factor 20, 64x64 BMIN
[Plots: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]
Scalability Analysis
• SAQ utilization – Traces, Comp. Factor 40, 64x64 BMIN
[Plots: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]
Scalability Analysis
• Network throughput – Corner Case 2, 256x256 BMIN
[Plots: maximum # SAQs used (egress); maximum # SAQs used (ingress)]
Scalability Analysis
• Network throughput – Corner Case 2, 512x512 BMIN
[Plots: maximum # SAQs used (ingress); maximum # SAQs used (egress)]
Final Remarks
• We also designed a protocol to deallocate SAQs when they are no longer needed
• Many optimizations
– CAM IDs to reduce control message size
– CAM search done in parallel with packet reception
– Merging of congestion trees
• Silicon area reduced with respect to switch-level VOQs
Conclusions
• We have proposed a scalable congestion management strategy for lossless networks
• We have shown that it only requires a small number of buffers for a wide range of network sizes
• We have modeled an existing ASI switch design, verifying that:
– It maintains network performance close to that of the ideal (but non-scalable) solution
– Silicon area requirements are now smaller than for the original design