Nick McKeown
CS244 Lecture 7
Valiant Load Balancing
Simple Model of US Backbone
Designing a Backbone Network
1. Hard to measure the current traffic matrix.
   - Harder still to estimate future traffic matrices.
2. Hard to know which traffic matrices can be supported.
   - Harder still under link and node failures.
The Problem
[Figure: N POPs (1, 2, 3, 4, …, N) with access rates r1, r2, r3, r4]
POPs in big cities
Q: How much capacity should be provisioned between two POPs?
100% Throughput in a Mesh
[Figure: N inputs and N outputs at rate r joined by a full mesh; each mesh link, first marked "?", is provisioned at rate r]
Router capacity = Nr
Switch capacity = N²r
Questions
How would we provision the links if we know the traffic matrix?
What is the cost of not knowing the traffic matrix?
Valiant Load Balancing
[Figure: N nodes (1, 2, 3, 4, …, N), each with access rate r, joined by a full mesh whose links are labeled 2r/N]
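A minimal sketch of why mesh links of rate 2r/N can suffice, assuming two-phase VLB routing: each node first spreads every flow evenly over all N nodes, and each intermediate node then forwards to the final destination. The function name `vlb_link_loads` and the example traffic matrix are illustrative and not taken from the lecture.

```python
def vlb_link_loads(traffic):
    """Return the N x N matrix of link loads under two-phase VLB.

    traffic[s][d] is the rate from node s to node d. Phase 1 sends 1/N of
    every flow from s to each intermediate node m; phase 2 forwards that
    share from m to d.
    """
    n = len(traffic)
    load = [[0.0] * n for _ in range(n)]
    for s in range(n):
        for d in range(n):
            share = traffic[s][d] / n
            for m in range(n):
                load[s][m] += share  # phase 1: source -> intermediate
                load[m][d] += share  # phase 2: intermediate -> destination
    return load


# Hypothetical example: every node sends its full access rate r to the
# next node (a permutation traffic matrix).
N, r = 4, 1.0
traffic = [[r if d == (s + 1) % N else 0.0 for d in range(N)] for s in range(N)]
loads = vlb_link_loads(traffic)
# Ignore the diagonal, which is traffic a node keeps to itself.
max_link = max(loads[i][j] for i in range(N) for j in range(N) if i != j)
print(max_link, 2 * r / N)
```

In this sketch the load on a link depends only on the row and column sums of the traffic matrix, which is why the provisioning does not require knowing the matrix itself.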
Outline for Today
1. Basic idea of load-balancing
2. Packet mis-sequencing
3. An optical switch fabric
If Traffic Is Uniform
[Figure: N inputs and N outputs at rate R joined by a full mesh of links at rate R/N; each input sends R/N to each output]
Real Traffic is Not Uniform
[Figure: the same mesh of rate-R/N links carrying non-uniform traffic; demands of rate R are marked "?"]
Can we make traffic “sufficiently uniform” to make the problem trivial?
VLB Switch
[Figure: two meshes of rate-R/N links in series between N inputs and N outputs at rate R. First mesh: load-balancing stage. Second mesh: forwarding stage.]
100% throughput for weakly mixing traffic (Valiant, C.-S. Chang)
[Figure: VLB switch animation in which packets labeled 1, 2, 3 cross the load-balancing mesh and then the forwarding mesh]
Intuition: 100% Throughput
[Figure: N inputs and N outputs at rate R; the second mesh has links of rate R/N]
Arrivals to second mesh: $b = \frac{1}{N} U a$
Capacity of second mesh: $C = \frac{R}{N} U$
Second mesh: arrival rate < service rate: $C - b = \frac{1}{N}(R U - U a) > 0$
[C.-S. Chang]
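The comparison between arrivals and capacity in the second mesh can be checked numerically. This is a sketch under stated assumptions: `a[i][j]` is taken to be the arrival rate from input i to output j, the first mesh is assumed to spread every flow evenly over the N middle ports, and the traffic matrix below is made up for illustration.

```python
def second_mesh_arrivals(a):
    """b[k][j] = (1/N) * sum_i a[i][j]: rate from middle port k to output j."""
    n = len(a)
    col_sums = [sum(a[i][j] for i in range(n)) for j in range(n)]
    return [[col_sums[j] / n for j in range(n)] for _ in range(n)]


R = 1.0
# Hypothetical admissible traffic: every row and column sums to less than R,
# yet it is far from uniform (input 0 sends 0.9 to output 0).
a = [
    [0.9, 0.0, 0.05],
    [0.0, 0.5, 0.4],
    [0.05, 0.4, 0.5],
]
N = len(a)
b = second_mesh_arrivals(a)
link_capacity = R / N
stable = all(b[k][j] < link_capacity for k in range(N) for j in range(N))
print(stable)
```

Every link of the second mesh sees the same fraction of an output's total load, so the check reduces to each column sum being below R.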
Another way of thinking about it
[Figure: external inputs 1…N, a load-balancing cyclic shift, internal inputs 1…N, a switching cyclic shift, external outputs 1…N]
Interesting properties:
• 100% throughput, no arbiter (but 2x switching capacity)
• No part of the system need operate faster than the line rate
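A cyclic shift can be sketched as a fixed, traffic-independent connection pattern. The indexing below, where port i connects to port (i + t) mod N in time slot t, is an assumption chosen for illustration; the slide does not give the exact indexing.

```python
def cyclic_shift(port, t, n):
    """Port reached from `port` in time slot t under a fixed cyclic shift."""
    return (port + t) % n


N = 4
# Count how often each (input, middle port) pair is connected over N slots.
connections = {}
for t in range(N):
    for i in range(N):
        k = cyclic_shift(i, t, N)
        connections[(i, k)] = connections.get((i, k), 0) + 1

every_pair_once = (len(connections) == N * N
                   and all(c == 1 for c in connections.values()))
print(every_pair_once)
```

Each pair is connected once every N slots, which corresponds to a rate of R/N per pair, and the pattern never consults the queues, so no arbiter is needed.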
Performance
1. What are the performance tradeoffs between a scheduler and a load-balanced design?
2. How can a load-balanced switch have lower loss than an OQ switch?
3. “I’m surprised that no one came up with the idea earlier”
What you said
“My favorite line in the paper is the following: ‘If it is possible to build a packet switch with 100% throughput that has no scheduler, no reconfigurable switch fabric, and buffer memories operating without speedup, where does the packet switching actually take place?’ The answer, of course, is in the VOQs…”
Outline
1. Basic idea of load-balancing
2. Packet mis-sequencing
3. An optical switch fabric
What you said
“[I]f packet mis-sequencing is such a performance hit due to the way TCP is designed and how the Internet has reached a point where a new transport protocol adoption is impractical, I wonder how long we will all have to live with TCP before Internet traffic reaches a point where a whole new layering system must be re-architected (and what David Clark might have to say about that).”
Packet Mis-sequencing
1. Does the Internet allow packets to be mis-sequenced?
2. Why do we (or network operators) care?
3. Will the Internet require packets to stay in sequence in the future?
Packet Reordering
[Figure: VLB switch with packets 1 and 2 taking paths through different middle ports]
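A toy model of how spreading can reorder packets of one flow, assuming each middle port is a FIFO serving one cell per slot and holding a fixed backlog when the packets arrive. The function name and the numbers are illustrative only.

```python
def departure_order(packets, backlog):
    """Return sequence numbers in the order they leave the middle ports.

    packets: list of (seq, arrival_slot, middle_port).
    backlog[m]: cells already queued at middle port m (assumed fixed).
    """
    departures = []
    for seq, arrival, port in packets:
        departures.append((arrival + backlog[port], seq))
    return [seq for _, seq in sorted(departures)]


# Packet 1 is sent to a busy middle port, packet 2 to an idle one.
order = departure_order([(1, 0, 0), (2, 1, 1)], backlog=[3, 0])
print(order)
```

Packet 2 overtakes packet 1 because the two middle-port queues have different occupancies, which is the effect the ordering schemes in this part of the lecture are designed to bound or remove.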
Bounding Delay Difference Between Middle Ports
[Figure: VLB switch showing packets 1 and 2 and the difference, in cells, between middle-port queues]
Uniform Frame Spreading
[Figure: VLB switch with the packets of a frame spread across the middle ports]
FOFF (Full Ordered Frames First)
[Figure: VLB switch with packets 1 and 2 in flight]
Input algorithm:
• N FIFO queues, one for each of the N output flows.
• Spread each flow uniformly: if the last packet was sent to middle port k, send the next to middle port k+1.
• Every N time slots, pick a flow:
  - If a full frame exists, pick it and spread it as in UFS.
  - Else, if all frames are partial, pick one in round-robin order and send it.
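The input algorithm can be sketched as follows. This is a simplified illustration, not the paper's implementation: the class name `FoffInput`, the single shared round-robin pointer, and the packet labels are assumptions, and a "full frame" is taken to mean N queued packets for one output.

```python
from collections import deque


class FoffInput:
    """Sketch of one FOFF input with N per-output FIFO queues."""

    def __init__(self, n):
        self.n = n
        self.queues = [deque() for _ in range(n)]  # one FIFO per output
        self.next_port = [0] * n                   # next middle port per flow
        self.rr = 0                                # round-robin pointer

    def enqueue(self, output, packet):
        self.queues[output].append(packet)

    def _pick_flow(self):
        # Prefer a flow holding a full frame (N packets); otherwise take
        # the next non-empty flow. Both searches go in round-robin order.
        order = [(self.rr + i) % self.n for i in range(self.n)]
        for flow in order:
            if len(self.queues[flow]) >= self.n:
                return flow
        for flow in order:
            if self.queues[flow]:
                return flow
        return None

    def send_frame(self):
        """Called every N time slots. Returns [(middle_port, packet), ...]."""
        flow = self._pick_flow()
        if flow is None:
            return []
        self.rr = (flow + 1) % self.n
        sent = []
        for _ in range(min(self.n, len(self.queues[flow]))):
            port = self.next_port[flow]
            sent.append((port, self.queues[flow].popleft()))
            self.next_port[flow] = (port + 1) % self.n  # continue at k+1
        return sent


inp = FoffInput(3)
for p in ("a1", "a2"):
    inp.enqueue(0, p)          # partial frame for output 0
for p in ("b1", "b2", "b3"):
    inp.enqueue(1, p)          # full frame for output 1
first = inp.send_frame()
second = inp.send_frame()
print(first, second)
```

In the usage above the full frame for output 1 is served before the partial frame for output 0, and the partial frame leaves the flow's pointer at middle port 2, where its next packet would continue.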
Bounding Reordering
[Figure: VLB switch with packets 1, 2, 3 of a flow spread across middle ports 1 to N]
FOFF
Output properties:
• N FIFO queues, one for each of the N middle ports.
• Buffer size is less than N² packets.
• If there are N² packets, one of the head-of-line packets is in order.
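A hedged sketch of resequencing at an output: packets wait in one FIFO per middle port, and the output delivers any head-of-line packet that is the next one expected in its flow. The function name, the `(flow, seq)` packet format, and the sequence-number bookkeeping are assumptions for illustration; the slides do not specify how "in order" is detected.

```python
from collections import deque


def drain_in_order(middle_queues, expected):
    """Repeatedly deliver any head-of-line packet that is next in its flow.

    middle_queues: list of deques of (flow, seq), one per middle port.
    expected: dict mapping flow -> next sequence number to deliver.
    Returns the delivered packets in delivery order.
    """
    delivered = []
    progress = True
    while progress:
        progress = False
        for q in middle_queues:
            if q and expected.get(q[0][0], 0) == q[0][1]:
                flow, seq = q.popleft()
                expected[flow] = seq + 1
                delivered.append((flow, seq))
                progress = True
    return delivered


# Packet 1 of flow "A" arrived via middle port 0, packets 0 and 2 via port 1.
queues = [deque([("A", 1)]), deque([("A", 0), ("A", 2)])]
out = drain_in_order(queues, {})
print(out)
```

Only head-of-line packets are examined, which matches the property that some head-of-line packet is in order once enough packets are buffered.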
[Figure: output with N FIFO queues, one per middle port, holding packets labeled 1, 2, 3]
VLB + FOFF Properties
With quite a lot of work, packet order is maintained
Interestingly, expected packet delay is within a constant of that of an OQ switch (surprising)
Therefore, VLB with FOFF has 100% throughput
Outline
1. Basic idea of load-balancing
2. Packet mis-sequencing
3. An optical switch fabric
What you said
“They state that their theoretical 100 Tb/s switch should be able to be built in about 3 years, so if they were correct it should long since have been built. I was unable to find anything about 100 Tb/s optics switches being used, so I’m not sure if it happened or not. There was another paper on the subject in 2010, suggesting that it took more than 3 years for technology to advance sufficiently. If such a switch has been manufactured, did it perform to expectations? If not, what prevented it?”
From Two Meshes to One Mesh
[Figure: the two-mesh VLB switch; one input (In) and the corresponding output (Out) together form one linecard]
[Figure: each linecard (In, Out) attaches to a first mesh and a second mesh, each at rate R]
[Figure: the two meshes folded into one combined mesh; each linecard attaches at rate 2R]
Many Fabric Options
Options:
• Space: full uniform mesh
• Time: round-robin crossbar
• Wavelength: static WDM
[Figure: linecards (In, Out) connected through any spreading device over N channels C1, C2, …, CN, each at rate 2R/N]
AWGR (Arrayed Waveguide Grating Router): A Passive Optical Component
Wavelength i on input port j goes to output port (i+j-1) mod N
Can shuffle information from different inputs
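The wavelength routing rule can be tabulated with a few lines of code. This sketch uses the rule as stated, (i + j - 1) mod N, with ports and wavelengths numbered 1 to N; treating a result of 0 as port N is an assumption about the numbering, and the function name is illustrative.

```python
def awgr_output_port(wavelength, input_port, n):
    """Output port for `wavelength` entering at `input_port` (both 1..N).

    Applies (i + j - 1) mod N; a result of 0 is taken to mean port N,
    which is an assumption about the 1-based numbering.
    """
    port = (wavelength + input_port - 1) % n
    return port if port != 0 else n


N = 4
table = {}
for j in range(1, N + 1):
    table[j] = [awgr_output_port(i, j, N) for i in range(1, N + 1)]
    print(j, table[j])
```

Under this reading, the N wavelengths from any one input land on N distinct outputs, and each output receives a given wavelength from exactly one input, which is what allows a passive device to act as a uniform mesh.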
[Figure: an N×N AWGR connecting Linecards 1, 2, …, N on the input side, each sending wavelengths 1, 2, …, N, to Linecards 1, 2, …, N on the output side]
Static WDM Switching: Packaging
[Figure: four linecards (In, Out) each send channels A, B, C, D into an AWGR, which is passive and almost zero power; the AWGR outputs carry A, A, A, A; B, B, B, B; C, C, C, C; and D, D, D, D. N WDM channels, each at rate 2R/N.]
Linecard placement and failure
What happens if a linecard is missing or fails?
Does this happen in practice?