D2.2 Report on Movement in Switch State of the Art and Commercial Motivators (RECN) Annex A (June 7th 2006) Ian Johnson Xyratex


D2.2 Report on Movement in Switch State of the Art and Commercial Motivators (RECN)

Annex A (June 7th 2006)

Ian Johnson

Xyratex


A New Scalable and Cost-Effective Congestion Management Strategy for Lossless Multistage Interconnection Networks

J. Duato¹, I. Johnson², J. Flich¹, F. Naven², P.J. García³, T. Nachiondo¹

¹ Technical University of Valencia, Valencia, Spain

² Xyratex, Havant, UK

³ University of Castilla-La Mancha, Albacete, Spain

The Eleventh International Symposium on High-Performance Computer Architecture, San Francisco, 2005

Title: A New Scalable and Cost-Effective Congestion Management Strategy for Lossless Multistage Int. Networks

Conference: The 11th International Symposium on High-Performance Computer Architecture

Outline

• Introduction

• Congestion and HOL blocking

• Why now?

• Why previous proposals are inadequate

• Proposal: RECN

• Performance evaluation

• Conclusions


Interconnection Networks

MPPs
• Earth Simulator (640 vector CPUs)
• ASCI Q (12,288 EV68 CPUs, Quadrics network)
• BlueGene/L (65,536 nodes, 2 processors each, 360 TFlops)

PC Clusters
• Storage Area Networks (SANs)
  – Google (6,000 CPUs and 12,000 disks)
• Thunder (1,024 nodes, each with 4 Itanium processors and 8 GB)
• Many data centers all around the world

[Images: ASCI Q, Earth Simulator, Thunder]


Network Throughput beyond Saturation


Congestion and HOL Blocking

Network contention


Congestion and HOL Blocking

Persistent network contention


Congestion and HOL Blocking

Persistent network contention

Flow control


Congestion and HOL Blocking

Persistent network contention

Congestion propagates


Congestion and HOL Blocking

• Congestion introduces HOL blocking, and this may degrade network performance dramatically

[Figure: HOL blocking example, with labels of 33% and 100% and one label marked "HOL 33%"]
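The effect can be reproduced with a toy model. The following is a minimal sketch, not taken from the slides: one input port forwards at most one packet per cycle, a congested ("hot") output accepts a packet only every `hot_period` cycles, and an idle ("cold") output accepts a packet every cycle. The function name, the traffic pattern and all parameter values are illustrative assumptions.

```python
from collections import deque


def simulate(packets, hot_period, cycles, separate_queues):
    """Return packets delivered per destination by one input port.

    packets: destinations ("hot" or "cold") in arrival order.
    The hot output accepts a packet only when cycle % hot_period == 0;
    the cold output accepts a packet every cycle.
    The input port forwards at most one packet per cycle.
    """
    fifo = deque(packets)
    queues = {"hot": deque(), "cold": deque()}
    for dst in packets:
        queues[dst].append(dst)

    delivered = {"hot": 0, "cold": 0}
    for cycle in range(cycles):
        hot_ready = cycle % hot_period == 0
        if separate_queues:
            # One queue per destination: a waiting hot packet never
            # prevents a cold packet from leaving.
            if hot_ready and queues["hot"]:
                delivered[queues["hot"].popleft()] += 1
            elif queues["cold"]:
                delivered[queues["cold"].popleft()] += 1
        elif fifo and (fifo[0] == "cold" or hot_ready):
            # Single FIFO: only the head packet may leave, so a hot
            # packet at the head blocks every cold packet behind it.
            delivered[fifo.popleft()] += 1
    return delivered


traffic = ["hot", "cold"] * 6  # hypothetical arrival pattern
print(simulate(traffic, hot_period=3, cycles=12, separate_queues=False))
print(simulate(traffic, hot_period=3, cycles=12, separate_queues=True))
```

In this toy run the single FIFO delivers 4 hot and 4 cold packets in 12 cycles, so the cold flow is throttled to the rate of the hot flow. With separate queues all 6 cold packets are delivered.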


Traditional Solution

Overdimensioning the network

[Figure: latency versus injected traffic, showing the working zone and the congestion zone]

Network bandwidth is much higher than the bandwidth requested by end nodes

Low link utilization


Why Congestion Management Now?

New problems arising:
• System cost: recent interconnects (Myrinet, InfiniBand, ASI) are expensive compared to processors
• Power consumption: as network size increases, power consumption and heat dissipation increase

Possible solutions:
• Frequency/voltage scaling techniques: not very efficient, and do not solve the system cost problem
• Reducing the number of network components: possible by using a suitable topology, but link utilization increases

Systems will work closer to the network saturation zone; thus, a congestion management technique will be mandatory


Why Are Current Techniques Not Suitable?

Proactive congestion management (congestion prevention)
• Path setup before data transmission
• Used in ATM, computer networks (QoS)
• High overhead, high latencies (not suitable for HPC)

The real problem is not the congestion, but its negative effects (HOL blocking)

Reactive congestion management (congestion recovery)
• Injection limitation techniques using closed-loop feedback
• Do not scale with network size and link bandwidth
  – Notification delay (proportional to distance)
  – Link capacity (proportional to clock frequency)
  – May produce network instabilities
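The scaling problem of closed-loop feedback can be illustrated with a bandwidth-delay calculation. This is a sketch under stated assumptions: the source keeps injecting at full link rate until the notification arrives, the 8 Gbps link rate is borrowed from the simulation model later in this deck, and the delay values are hypothetical.

```python
def bytes_injected_before_reaction(link_gbps, notification_delay_ns):
    """Bytes a source injects at full rate while the notification travels."""
    bits = link_gbps * notification_delay_ns  # (Gbit/s) * ns = bits
    return bits / 8


for delay_ns in (250, 500, 1000):  # hypothetical notification delays
    extra = bytes_injected_before_reaction(8, delay_ns)
    print(f"{delay_ns} ns -> {extra:.0f} bytes already injected")
```

In this model, doubling either the notification delay or the link rate doubles the traffic that reaches the congested area before the sources can react.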


Why Are Current Techniques Not Suitable?

HOL blocking elimination/reduction
• DAMQs and Virtual Channels
  – Not efficient for multihop networks
• VOQ (Virtual Output Queueing)
  – VOQ at switch level scales but does not eliminate HOL blocking
  – VOQ at network level: a separate queue at every input port for every destination
  – Number of required resources scales at least quadratically with network size!
• Credit Flow Controlled ATM
  – References congestion to network output only
  – Consumes a large number of buffers: a separate queue at every output port for every destination
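The queue counts behind these scaling claims can be tabulated with a small sketch. It is an illustrative model, not the authors' cost analysis: the function name is made up, `switch_ports=8` is a hypothetical radix, and the RECN figure assumes one cold queue plus the maximum of 8 SAQs per port used in the simulation model later in this deck.

```python
def queues_per_port(scheme, num_hosts, switch_ports=8, max_saqs=8):
    """Queues needed at one port under each scheme (illustrative model)."""
    if scheme == "VOQnet":
        return num_hosts      # one queue per destination
    if scheme == "VOQsw":
        return switch_ports   # one queue per output port of the switch
    if scheme == "RECN":
        return 1 + max_saqs   # one cold queue plus the SAQs
    raise ValueError(f"unknown scheme: {scheme}")


for hosts in (64, 256, 512):
    per_port = {s: queues_per_port(s, hosts) for s in ("VOQnet", "VOQsw", "RECN")}
    # Total over the host-facing input ports only (one per host).
    total_voqnet = hosts * per_port["VOQnet"]
    print(hosts, per_port, "VOQnet total:", total_voqnet)
```

Even when only the host-facing ports are counted, the VOQnet total grows as the square of the number of hosts, while the per-port figure assumed here for RECN stays constant.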


Proposal

Initial idea:
• Exploit spatial and temporal locality in packet destinations
• Manage the set of queues as a cache
  – No equivalent to main memory! (where to replace?)
  – Not enough locality! (reduction in queue silicon area by a factor of 4)

Observation:
• Non-congested flows do not introduce significant HOL blocking

RECN: Regional Explicit Congestion Notification
• Non-congested flows are mapped to the same queue
• Effective reduction in number of queues and no replacement needed
• Congested flows are detected and mapped to set aside queues (SAQs)

RECN is a scalable congestion management technique because:
• It reacts locally (and thus, it is not affected by propagation delays)
• A very small number of queues (SAQs) for a wide range of network sizes

RECN enables:
• Effective reduction of network cost by working closer to the saturation point
• More efficient use of voltage/frequency scaling techniques


RECN

Based on the PCI Express Advanced Switching Interconnect (ASI) specification

• Routing (turnpools)

Relevant switch architectural features
• Congestion detection
• Congestion notification and queue allocation
• Queue deallocation
• Packet processing
• Flow control


Turnpools

[Figure: turn example on switches with ports numbered 0 to 7, and the AS packet header with its turn pool, turn pointer (t. pointer) and direction bit (D) fields]

• 31 bits = 2^31 destinations
• Allows knowing whether a packet will pass through a given port in the network
• Mask bits required
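The role of the mask bits can be sketched as a masked comparison. This is a hedged illustration, not the ASI header layout: it assumes a fixed number of bits per turn (`bits_per_turn=3`, hypothetical) packed from the most significant end of the 31-bit turn pool, and the function names are invented for the example.

```python
TURN_POOL_BITS = 31  # width stated on the slide


def encode_turns(turns, bits_per_turn=3):
    """Pack turns from the most significant end; return (value, mask)."""
    value, mask, shift = 0, 0, TURN_POOL_BITS
    for turn in turns:
        shift -= bits_per_turn
        if shift < 0 or not 0 <= turn < (1 << bits_per_turn):
            raise ValueError("turn does not fit in the turn pool")
        value |= turn << shift
        mask |= ((1 << bits_per_turn) - 1) << shift
    return value, mask


def passes_through(packet_pool, point_value, point_mask):
    """True if the packet's turns agree with the point on every masked bit."""
    return ((packet_pool ^ point_value) & point_mask) == 0


congested_value, congested_mask = encode_turns([2, 5])
packet_a, _ = encode_turns([2, 5, 1])  # same first two turns
packet_b, _ = encode_turns([2, 4, 1])  # diverges at the second turn
print(passes_through(packet_a, congested_value, congested_mask))  # True
print(passes_through(packet_b, congested_value, congested_mask))  # False
```

The mask marks which turns identify the port of interest, so the remaining bits of the turn pool (the rest of the route) are ignored in the comparison.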


Switch Model

[Figure: switch model. Input side: LC blocks and RAMin memories with dynamic queue management (VCs). Center: crossbar (XBAR, S=1.5) with an arbiter. Output side: RAMout memories with dynamic queue management (VCs) and LC blocks.]


RAM in and RAM out

[Figure: organization of each port memory (RAM) and its associated CAM]

• RAM: one cold queue plus SAQ 0 to SAQ 3
• Tokens (one per input port)
• Root
• Only at egress: avoids successive internal notifications
• CAM: one line per SAQ (SAQ 0 to SAQ 3), each with the fields:
  – v: valid bit
  – turn pool and mask bits (congested point)
  – b: blocked
  – nextSAQ
  – Xoff: Xon/Xoff flow control
  – lv: leave bit (only at ingress)
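The CAM fields listed on this slide can be pictured as a small record per SAQ. The sketch below is an assumption-laden illustration: the field names mirror the slide labels, but their exact semantics, the 6-bit example values, and the rule of picking the match with the most mask bits set are assumptions made for the example, not details confirmed by the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CamLine:
    valid: bool = False             # v: line is in use
    turn_pool: int = 0              # turns that identify the congested point
    mask_bits: int = 0              # which turn-pool bits are significant
    blocked: bool = False           # b
    next_saq: Optional[int] = None  # nextSAQ
    xoff: bool = False              # Xon/Xoff flow control state
    leave: bool = False             # lv: leave bit (only at ingress)


def select_saq(cam: List[CamLine], packet_pool: int) -> Optional[int]:
    """Return the index of the matching SAQ, or None for the cold queue.

    Assumption: when several lines match, the one with the most mask
    bits set (the most specific congested point) is chosen.
    """
    best, best_len = None, -1
    for index, line in enumerate(cam):
        if not line.valid:
            continue
        if (packet_pool ^ line.turn_pool) & line.mask_bits:
            continue  # the packet does not cross this congested point
        length = bin(line.mask_bits).count("1")
        if length > best_len:
            best, best_len = index, length
    return best


cam = [
    CamLine(valid=True, turn_pool=0b101000, mask_bits=0b111000),
    CamLine(valid=True, turn_pool=0b101110, mask_bits=0b111111),
    CamLine(),
    CamLine(),
]
print(select_saq(cam, 0b101110))  # matches both lines, picks SAQ 1
print(select_saq(cam, 0b101001))  # matches only SAQ 0
print(select_saq(cam, 0b010000))  # no match: cold queue (None)
```

A packet that matches no valid line stays in the cold queue, which is how non-congested flows end up sharing a single queue.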


How it Works

A congestion point forms


How it Works

Cold queue fills over a threshold


How it Works

Internal notification to each input port sending packets to the output port


How it Works

Input ports allocate a new SAQ for packets addressed to the congested output port


How it Works

Notification sent when the SAQ fills over a threshold


How it Works

A new SAQ is allocated for the congested port at each output port


How it Works

Internal notification when the SAQ fills over a threshold


How it Works

The input port allocates a new SAQ


How it Works

At the end, the congestion tree has been built and is mapped entirely onto SAQs
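The sequence of steps on the preceding "How it Works" slides can be condensed into a toy model. This is a deliberately simplified sketch, not the RECN protocol: it assumes the root drains nothing, that each queue stops receiving packets once its notification has been sent, and that a single path leads upstream. The stage names, threshold and packet count are hypothetical.

```python
def build_congestion_tree(stages, threshold, packets):
    """Track how queues fill and trigger SAQ allocation upstream.

    stages[0] is the cold queue at the congested output port (the root);
    each later stage is the next queue upstream. Returns (events, occupancy).
    """
    occupancy = [0] * len(stages)
    allocated = [True] + [False] * (len(stages) - 1)
    events = []
    for _ in range(packets):
        # A packet is held in the most upstream queue allocated so far.
        level = max(i for i, is_allocated in enumerate(allocated) if is_allocated)
        occupancy[level] += 1
        if occupancy[level] > threshold and level + 1 < len(stages):
            # Threshold crossed: notify upstream, which allocates a SAQ.
            allocated[level + 1] = True
            events.append(
                f"{stages[level]} over threshold -> SAQ allocated at {stages[level + 1]}"
            )
    return events, occupancy


stages = [
    "egress cold queue (root)",
    "ingress port, same switch",
    "egress port, upstream switch",
    "ingress port, upstream switch",
]
events, occupancy = build_congestion_tree(stages, threshold=2, packets=10)
for event in events:
    print(event)
print(occupancy)
```

Each threshold crossing produces one notification and one SAQ allocation a step further upstream, so the congestion tree ends up held in SAQs rather than in the queues shared with other flows.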


Performance Evaluation

• Evaluation based on simulation results
• Two evaluation studies:
  • Network performance when using:
    – RECN
    – VOQ at network level (VOQnet)
    – VOQ at switch level (VOQsw)
    – 4 queues at ingress and egress ports (4Q)
    – 1 queue at ingress and egress ports (1Q)
  • RECN scalability


Simulation Model

• Network configurations evaluated:
  • 64 hosts connected by a 64x64 BMIN
  • 256 hosts connected by a 256x256 BMIN
  • 512 hosts connected by a 512x512 BMIN
• Simulation assumptions:
  • BMINs based on shuffle-exchange connection scheme
  • Deterministic routing
  • 128 KB memories at ingress/egress ports
  • Multiplexed crossbar (BW=12 Gbps)
  • Serial full-duplex pipelined links (BW=8 Gbps)
  • 64 and 512-byte packets
  • Credit-based and Xon-Xoff (for SAQs) flow control
  • Maximum of 8 SAQs at ingress/egress ports (RECN)
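A few derived quantities help to read these parameters. The sketch below only does arithmetic on the values listed above. It assumes 1 KB = 1024 bytes and ignores any header or encoding overhead, and the constant and function names are invented for the example.

```python
LINK_GBPS = 8                   # serial link bandwidth from the slide
CROSSBAR_GBPS = 12              # multiplexed crossbar bandwidth from the slide
PORT_MEMORY_BYTES = 128 * 1024  # 128 KB, assuming 1 KB = 1024 bytes


def serialization_ns(packet_bytes, gbps):
    """Time to push one packet through a link or crossbar port, in ns."""
    return packet_bytes * 8 / gbps


speedup = CROSSBAR_GBPS / LINK_GBPS
print("crossbar speedup:", speedup)
for size in (64, 512):
    print(size, "bytes:",
          serialization_ns(size, LINK_GBPS), "ns on a link,",
          PORT_MEMORY_BYTES // size, "packets per port memory")
```

The crossbar-to-link ratio of 1.5 appears consistent with the S=1.5 label on the switch model slide. A 64-byte packet occupies a link for 64 ns and a 512-byte packet for 512 ns under these assumptions.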


Traffic Load

• Synthetic traffic:

Case            # Srcs   Dst.       Injection Rate (%)   Traffic Start Time   Traffic End Time
Corner Case 1   75%      Random     50%                  0                    Sim. End
                25%      Hot-Spot   100%                 800 μs               970 μs
Corner Case 2   75%      Random     100%                 0                    Sim. End
                25%      Hot-Spot   100%                 800 μs               970 μs

• Traces:
  • From I/O activity at cello system disk interface
  • Different compression factors applied
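The severity of the synthetic corner cases can be estimated with simple arithmetic. This sketch assumes that all hot-spot sources target a single destination, that the 8 Gbps link rate from the simulation model applies, and that the random background traffic is ignored. In a lossless network the excess is held back by flow control rather than dropped, so the figure is offered load, not delivered traffic.

```python
def hot_spot_overload(num_hosts, hot_fraction=0.25, hot_rate=1.0):
    """Offered load at the hot-spot destination, in units of one link."""
    hot_sources = int(num_hosts * hot_fraction)
    return hot_sources * hot_rate


def backlog_bytes(overload, link_gbps=8, burst_us=970 - 800):
    """Bytes offered beyond what one link can drain during the burst."""
    excess_links = overload - 1
    return excess_links * link_gbps * burst_us * 1000 / 8  # Gbit/s * ns / 8


for hosts in (64, 256, 512):
    load = hot_spot_overload(hosts)
    print(hosts, "hosts:", load, "x link capacity,",
          backlog_bytes(load), "bytes in excess")
```

For the 64-host network this gives 16 sources offering 16 times the capacity of the destination link during the 170 μs burst, which is why the hot spot builds a congestion tree quickly.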


Performance Comparison

• Network throughput – Corner Case 2, 64x64 BMIN


Performance Comparison

• Network throughput – Traces, 64x64 BMIN

[Figure panels: compression factor set to 20; compression factor set to 40]


Scalability Analysis

• SAQ utilization – Corner Case 1, 64x64 BMIN

[Figure panels: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]


Scalability Analysis

• SAQ utilization – Corner Case 2, 64x64 BMIN

[Figure panels: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]


Scalability Analysis

• SAQ utilization – Traces, Comp. Factor 20, 64x64 BMIN

[Figure panels: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]


Scalability Analysis

• SAQ utilization – Traces, Comp. Factor 40, 64x64 BMIN

[Figure panels: maximum # SAQs used (ingress); maximum # SAQs used (egress); total # of active SAQs]


Scalability Analysis

• Network throughput – Corner Case 2, 256x256 BMIN

[Figure panels: maximum # SAQs used (egress); maximum # SAQs used (ingress)]


Scalability Analysis

• Network throughput – Corner Case 2, 512x512 BMIN

[Figure panels: maximum # SAQs used (ingress); maximum # SAQs used (egress)]


Final Remarks

• We also designed a protocol to deallocate SAQs when they are no longer needed

• Many optimizations
  – CAM IDs to reduce control message size
  – CAM search done in parallel with packet reception
  – Merging of congestion trees

• Silicon area reduced with respect to switch-level VOQs


Conclusions

• We have proposed a scalable congestion management strategy for lossless networks

• We have shown that it only requires a small number of buffers for a wide range of network sizes

• We have modeled an existing ASI switch design, verifying that:
  – It maintains network performance close to the ideal (but non-scalable) solution
  – Silicon area requirements are now smaller than for the original design