Transcript
Page 1: The Crosspoint  Queued Switch

The Crosspoint Queued Switch

Yossi Kanizo (Technion, Israel)

Joint work with Isaac Keslassy (Technion, Israel) and David Hay (Politecnico di Torino, Italy)

Page 2: The Crosspoint  Queued Switch

Typical Switch Architectures

IQ – Input Queued

Linecards, Switch Fabric

CICQ – Combined Input and Crosspoint Queued

Linecards, Switch Fabric

Assumes Instantaneous Closed Loop

Page 3: The Crosspoint  Queued Switch

Single-Rack Router

Instantaneous closed loop → works in a single rack

Problem: multi-rack routers

Linecards

Switch Fabric

Page 4: The Crosspoint  Queued Switch

Current Router Architectures

Optical links

10s of meters

[Source: N. McKeown]

Is the closed loop still instantaneous?

Page 5: The Crosspoint  Queued Switch

Time Trends

[Figure: time-slot and RTT in ns (log scale, 1 to 1000) versus year, 1998 to 2010]

Page 6: The Crosspoint  Queued Switch

Hiding Propagation Delays

Traditional solutions:

Increase time-slot → poor switch performance

Hide propagation delays using buffers → impractical amount of buffering

Proposed solution: closed loop → open loop

Performance degradation vs. instantaneous closed loop

Page 7: The Crosspoint  Queued Switch

Outline

CQ: Open-loop switch architecture

Performance Evaluation

  Analytical results

  Simulations

CQ performance degradation is not significant

Page 8: The Crosspoint  Queued Switch

Proposed Architecture: The Crosspoint-Queued (CQ) Switch

No queues in the linecards

Buffering only inside the fabric

Independent output schedulers

Drops with full buffers

Switch Core, Linecards

10s of meters
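The architecture above can be sketched as a small slot-by-slot simulation. This is a minimal illustration, not the authors' simulator: the function name `simulate_cq`, the arrivals-before-departures ordering within a slot, and the choice of LQF (Longest Queue First) as the per-output scheduler are assumptions made for the example.

```python
import random


def simulate_cq(n, b, rho, slots, seed=0):
    """Simulate an n x n crosspoint-queued switch under uniform Bernoulli traffic.

    Returns (arrived, dropped, departed, queued) packet counts.
    Assumption: in each time-slot, arrivals are processed before departures.
    """
    rng = random.Random(seed)
    # queue[i][j] = number of cells buffered at crosspoint (input i, output j)
    queue = [[0] * n for _ in range(n)]
    arrived = dropped = departed = 0

    for _ in range(slots):
        # Arrival phase: no linecard queues, so a packet goes straight to its
        # crosspoint buffer and is dropped if that buffer is full.
        for i in range(n):
            if rng.random() < rho:
                j = rng.randrange(n)
                arrived += 1
                if queue[i][j] < b:
                    queue[i][j] += 1
                else:
                    dropped += 1

        # Departure phase: each output scheduler works independently and only
        # looks at its own column of crosspoints (here: longest queue first).
        for j in range(n):
            longest = max(range(n), key=lambda i: queue[i][j])
            if queue[longest][j] > 0:
                queue[longest][j] -= 1
                departed += 1

    queued = sum(sum(row) for row in queue)
    return arrived, dropped, departed, queued
```

A throughput estimate can then be taken as `departed / arrived` for a long enough run, for example `simulate_cq(32, 64, 0.9, 100_000)`.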

Page 9: The Crosspoint  Queued Switch

CQ Properties

Open loop: no communication overhead

No linecard queues: no linecard queue management

“Router on a chip”: buffering and switch fabric on same chip

Page 10: The Crosspoint  Queued Switch

Why not 10 years ago?

No need: single rack

No technology: SRAM density

  Moore’s law: density doubling every 2.5 years

  Aggressive 128x128 CQ switch: 4 cells of 64 bytes per crosspoint → 64 cells today

Conservative buffer requirements

  TCP Stanford model with smaller buffer needs [Appenzeller, Keslassy and McKeown ’04]
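The density estimate on this slide can be checked with a few lines of arithmetic. The sketch below assumes that the "4 cells" figure refers to what was feasible 10 years earlier (per the slide title) and that the whole 128x128 crosspoint memory scales with SRAM density; the function names are hypothetical.

```python
def cells_per_crosspoint(cells_then, years, doubling_period_years=2.5):
    """Scale a per-crosspoint buffer size by Moore's-law density doublings."""
    doublings = years / doubling_period_years
    return cells_then * 2 ** doublings


def total_buffer_bytes(ports, cells, cell_bytes=64):
    """Total crosspoint memory for a ports x ports switch."""
    return ports * ports * cells * cell_bytes


print(cells_per_crosspoint(4, 10))   # 4 * 2**4 = 64.0 cells per crosspoint
print(total_buffer_bytes(128, 4))    # 4194304 bytes (4 MiB)
print(total_buffer_bytes(128, 64))   # 67108864 bytes (64 MiB)
```

Ten years at one doubling per 2.5 years gives four doublings, a factor of 16, which matches the slide's step from 4 to 64 cells.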

Page 11: The Crosspoint  Queued Switch

Outline

CQ: Our open-loop switch architecture

Performance Evaluation

  Analytical results

  Simulations

Page 12: The Crosspoint  Queued Switch

100% Throughput as B→∞

Throughput bounds:

OQ(2B-1) ≤ CQ(B) ≤ OQ(NB)

Buffer size B, LQF scheduling algorithm

[Figure labels: 100% Throughput, shown three times]

Page 13: The Crosspoint  Queued Switch

Uniform Traffic, B=1

Uniform traffic model:

  At each time-slot, at each of the N inputs: Bernoulli IID packet arrivals with probability ρ

  Each packet is destined for one of the N outputs uniformly at random

Theorem: Under uniform traffic and B=1, the performance of the switch is independent of the specific work-conserving scheduling algorithm

Intuition: Symmetry

[Figure labels: ρ / N, repeated]

Page 14: The Crosspoint  Queued Switch

Uniform Traffic, B=1

Theorem: The throughput and waiting time of a CQ switch, B=1 is:

Proof: Based on Z-transform

q = 1 - ρ/N

Goes to 100% as N goes to infinity
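As a small worked step on the quantity q, writing ρ for the per-input arrival probability (the symbol is assumed here): under the uniform traffic model, a given crosspoint sees an arrival only if its input receives a packet and that packet picks its output. The expression below is an interpretation of q, not the theorem's closed-form result.

```latex
\Pr[\text{arrival at crosspoint } (i,j)] = \rho \cdot \frac{1}{N} = \frac{\rho}{N},
\qquad
q = 1 - \frac{\rho}{N},
\qquad
\Pr[\text{no arrival to output } j] = q^{N}
  = \left(1 - \frac{\rho}{N}\right)^{N} \xrightarrow[N \to \infty]{} e^{-\rho}.
```

So q is the probability that a given crosspoint receives no packet in a time-slot, and q^N is the probability that a whole output column receives none, since the N inputs are independent.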

Page 15: The Crosspoint  Queued Switch

Models for larger buffers

Approximate Performance Analysis

Model for exhaustive round-robin scheduling

  Based on modifications to polling system with zero switch-over times

Model for random scheduling algorithm

Show 100% throughput as N→∞

[Figure labels: 1 / N, repeated]
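The scheduling policies named in this talk can be sketched as functions that pick which crosspoint queue an output serves in a slot. These are illustrative definitions, assuming each output sees its column as a list of queue lengths; the function names and the `current` pointer convention are hypothetical, not taken from the slides.

```python
import random


def lqf(queues):
    """Longest Queue First: index of the longest non-empty queue, or None."""
    best = max(range(len(queues)), key=lambda i: queues[i])
    return best if queues[best] > 0 else None


def random_choice(queues, rng):
    """Random scheduling: a uniformly chosen non-empty queue, or None."""
    nonempty = [i for i, q in enumerate(queues) if q > 0]
    return rng.choice(nonempty) if nonempty else None


def exhaustive_round_robin(queues, current):
    """Exhaustive round-robin with zero switch-over time.

    Keep serving `current` while it is non-empty; otherwise move, within the
    same slot, to the next non-empty queue in cyclic order. Returns None if
    every queue is empty.
    """
    n = len(queues)
    for step in range(n):
        i = (current + step) % n
        if queues[i] > 0:
            return i
    return None


print(lqf([0, 3, 1]))                        # 1
print(exhaustive_round_robin([0, 0, 1], 0))  # 2
```

All three are work-conserving: they return a queue whenever at least one queue in the column is non-empty.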

Page 16: The Crosspoint  Queued Switch

Trace-Driven Simulation

Buffers of size 64 suffice to ensure 99% throughput for N=32.

32x32 CQ switch with different buffer sizes (in units of 64-byte packets)

Page 17: The Crosspoint  Queued Switch

Conclusions

CQ is open loop → allows multi-rack configuration

CQ provides easy scheduling

CQ is feasible to implement in a single chip

CQ shows good performance in simulations

Page 18: The Crosspoint  Queued Switch

Thank You

