FAST Protocols for Ultrascale Networks
netlab.caltech.edu/FAST
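Internet: distributed feedback control system
  TCP: adapts sending rate to congestion
  AQM: feeds back congestion information
[Figure: feedback loop. TCP sources set rates x; forward routing Rf(s) maps x to link rates y; AQM links set prices p; backward routing Rb'(s) maps p to path prices q seen by the sources.]
AQM: $\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$
TCP: stabilized Vegas source law (the sign function of Vegas replaced by $\tan^{-1}$)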
Theory
[Figure: network map. Sites and networks: Calren2/Abilene, Chicago, Amsterdam, CERN, Geneva, SURFNet, StarLight; WAN in Lab (Caltech).]
Research & production networks
Multi-Gbps, 50-200ms delay
Experiment
[Figure: state diagram with states slow start, equilibrium, FAST recovery, FAST retransmit, timeout; rate labels 155Mb/s and 10Gb/s.]
Implementation
Students Choe (Postech/CIT) Hu (Williams) J. Wang (CDS) Z.Wang (UCLA) Wei (CS)
Industry Doraiswami (Cisco) Yip (Cisco)
Faculty Doyle (CDS,EE,BE) Low (CS,EE) Newman (Physics) Paganini (UCLA)
Staff/Postdoc Bunn (CACR) Jin (CS) Ravot (Physics) Singh (CACR)
Partners CERN, Internet2, CENIC, StarLight/UI, SLAC, AMPATH, Cisco
People
netlab.caltech.edu
FAST project
Protocols for ultrascale networks
  >100 Gbps throughput, 50-200ms delay
  Theory, algorithms, design, implementation, demo, deployment
Faculty
  Doyle (CDS, EE, BE): complex systems theory
  Low (CS, EE): PI, networking
  Newman (Physics): application, deployment
  Paganini (EE, UCLA): control theory
Research staff
  3 postdocs, 3 engineers, 8 students
Collaboration
  Cisco, Internet2/Abilene, CERN, DataTAG (EU), …
Funding
  NSF, DoE, Lee Center (AFOSR, ARO, Cisco)

Outline
Motivation
Theory
  Web layout
  Content distribution
  TCP/AQM (Jin, poster)
  TCP/IP (poster)
  Enforcing & inducing fairness (poster)
  Optical switching (future)

High Energy Physics
Large global collaborations
  2000 physicists from 150 institutions in >30 countries
  300-400 physicists in US from >30 universities & labs
SLAC has 500TB data by 4/2002, world’s largest database
Typical file transfer ~1 TB
  At 622Mbps: ~4 hrs
  At 2.5Gbps: ~1 hr
  At 10Gbps: ~15 min
  Gigantic elephants!
LHC (Large Hadron Collider) at CERN, to open 2007
  Generate data at PB ($10^{15}$ B)/sec
  Filtered in real time by a factor of $10^6$ to $10^7$
  Data stored at CERN at 100MB/sec
  Many PB of data per year
  To rise to exabytes ($10^{18}$ B) in a decade
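The transfer times quoted for a ~1 TB file can be checked with a few lines of arithmetic. The sketch below assumes a decimal terabyte ($10^{12}$ bytes), a fully utilized link, and no protocol overhead; the function name is illustrative and not from the slides.

```python
def transfer_time_seconds(size_bytes: float, rate_bps: float) -> float:
    """Ideal transfer time: payload bits divided by link rate (no overhead)."""
    return size_bytes * 8 / rate_bps


ONE_TB = 1e12  # assumption: decimal terabyte

for label, rate in [("622 Mbps", 622e6), ("2.5 Gbps", 2.5e9), ("10 Gbps", 10e9)]:
    t = transfer_time_seconds(ONE_TB, rate)
    print(f"{label}: {t / 3600:.2f} h ({t / 60:.0f} min)")
```

Under these assumptions the results land close to the rounded figures on the slide; real transfers take longer because TCP rarely holds a link at full rate, which is the motivation for the rest of the talk.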
HEP Network (DataTAG)
[Figure: DataTAG network map. Nodes: Geneva, New York, STAR-TAP, STARLIGHT. Networks: SURFnet (NL), SuperJANET4 (UK), GARR-B (It), Renater (Fr), GEANT, ABILENE, ESNET, CALREN.]
Wavelength Triangle: 2.5 Gbps in 2002, 10 Gbps in 2003
Newman (Caltech)

Projected performance
Ns-2: capacity = 155Mbps, 622Mbps, 2.5Gbps, 5Gbps, 10Gbps
100 sources, 100 ms round trip propagation delay
[Figure: projected performance by year and capacity: ’01 155Mbps, ’02 622Mbps, ’03 2.5Gbps, ’04 5Gbps, ’05 10Gbps]
J. Wang (Caltech)

Projected performance
Ns-2: capacity = 10Gbps
100 sources, 100 ms round trip propagation delay
[Figure: FAST compared with TCP/RED]
J. Wang (Caltech)
Protocol Decomposition
Layer: examples; model and objective
Applications: WWW, Email, Napster, FTP, …; HOT (Doyle et al), minimize user response time, heavy-tailed file sizes
TCP/AQM: duality model, maximize aggregate utility
IP: shortest-path routing, minimize path costs
Transmission: Ethernet, ATM, POS, WDM, …; power control, maximize channel capacity

Congestion control
Sources: rates $x_i(t)$
Links: congestion measures $p_l(t)$
Example congestion measure $p_l(t)$
  Loss (Reno)
  Queueing delay (Vegas)

TCP/AQM
Congestion control is a distributed asynchronous algorithm to share bandwidth
It has two components
  TCP: adapts sending rate (window) to congestion
  AQM: adjusts & feeds back congestion information
They form a distributed feedback control system
  Equilibrium & stability depend on both TCP and AQM
  And on delay, capacity, routing, #connections
[Figure: sources send at rates $x_i(t)$; links feed back $p_l(t)$]
TCP: Reno, Vegas
AQM: DropTail, RED, REM/PI, AVQ

Network model
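[Figure: feedback loop. TCP sources F1, …, FN produce rates x; forward routing Rf(s) maps x to link rates y; AQM links G1, …, GL produce prices p; backward routing Rb'(s) maps p to path prices q.]
$[R_f(s)]_{li} = e^{-\tau^f_{li} s}$ if source $i$ uses link $l$, and 0 otherwise
$[R_b(s)]_{li} = e^{-\tau^b_{li} s}$ if source $i$ uses link $l$, and 0 otherwise
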
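Vegas model
for every RTT
{ if W/RTTmin - W/RTT < α then W ++
  if W/RTTmin - W/RTT > α then W -- }
The difference W/RTTmin - W/RTT measures the source's queue size.
$F_i$: $\dot x_i = \frac{1}{T_i^2(t)}$ if $x_i(t)\,q_i(t) < \alpha_i d_i$
       $\dot x_i = -\frac{1}{T_i^2(t)}$ if $x_i(t)\,q_i(t) > \alpha_i d_i$
       $\dot x_i = 0$ else
$G_l$: $\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$
$p_l$: link queueing delay; $q_i$: E2E queueing delay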
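The per-RTT window rule can be tried out in a toy setting. The sketch below is a minimal illustration, not the Ns-2 setup used in the talk: it assumes a single source on a single link, a queue that equals the window in excess of the bandwidth-delay product, and a threshold `alpha` expressed as a rate (pkts/ms). The function names and the numbers in the usage line are hypothetical.

```python
def vegas_step(w: float, rtt_min: float, rtt: float, alpha: float) -> float:
    """One per-RTT Vegas update: compare expected and actual rate against alpha."""
    diff = w / rtt_min - w / rtt
    if diff < alpha:
        return w + 1
    if diff > alpha:
        return w - 1
    return w


def simulate(capacity: float, rtt_min: float, alpha: float,
             rounds: int = 200, w0: float = 1.0):
    """Single source, single link (assumed model): backlog is the window
    in excess of capacity * rtt_min, and RTT grows with the backlog."""
    w = w0
    for _ in range(rounds):
        backlog = max(w - capacity * rtt_min, 0.0)
        rtt = rtt_min + backlog / capacity
        w = vegas_step(w, rtt_min, rtt, alpha)
    return w, max(w - capacity * rtt_min, 0.0)


window, backlog = simulate(capacity=6.0, rtt_min=10.0, alpha=2.0)
print(window, backlog)
```

In this toy model the window settles within one packet of (capacity + alpha) × RTTmin, so the source keeps roughly alpha × RTTmin packets buffered in the link.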
Vegas model
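[Figure: feedback loop. TCP sources F1, …, FN; forward routing Rf(s); AQM links G1, …, GL; backward routing Rb'(s); signals x, y, p, q.]
$F_i = \frac{1}{T_i^2(t)}\,\mathrm{sgn}\!\left(1 - \frac{x_i(t)\,q_i(t)}{\alpha_i d_i}\right)$
$G_l = \frac{y_l(t)}{c_l} - 1$
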
Methodology
Protocol (Reno, Vegas, RED, REM/PI…)
  $x(t+1) = F(p(t), x(t))$
  $p(t+1) = G(p(t), x(t))$
Equilibrium
  Performance: throughput, loss, delay
  Fairness
  Utility
Dynamics
  Local stability
  Cost of stabilization

Summary: duality model
Flow control problem
  $\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $Rx \le c$
TCP/AQM
  Maximize utility with different utility functions
Primal-dual algorithm
  $x(t+1) = F(p(t), x(t))$   (Reno, Vegas)
  $p(t+1) = G(p(t), x(t))$   (DropTail, RED, REM)
Theorem (Low 00): $(x^*, p^*)$ primal-dual optimal iff $y_l^* \le c_l$, with equality if $p_l^* > 0$
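To make the duality picture concrete, here is a minimal sketch of a textbook dual gradient iteration for the flow control problem. It is not a model of Reno, Vegas or any particular AQM: it assumes log utilities $U_i(x) = w_i \log x$, a fixed step size, and a small price floor to avoid division by zero. The function name, the step size and the example network are all assumptions made for illustration.

```python
def dual_gradient(R, c, weights, step=0.1, iters=2000, floor=1e-9):
    """Sources pick x_i maximizing w_i*log(x_i) - q_i*x_i; links move prices
    in the direction of excess demand y_l - c_l (projected to stay positive)."""
    n_links, n_src = len(R), len(R[0])
    p = [1.0] * n_links
    x = [0.0] * n_src
    for _ in range(iters):
        # Path price seen by each source: q = R^T p
        q = [sum(R[l][i] * p[l] for l in range(n_links)) for i in range(n_src)]
        # Source law for log utility: x_i = w_i / q_i
        x = [weights[i] / q[i] for i in range(n_src)]
        # Aggregate link rates: y = R x
        y = [sum(R[l][i] * x[i] for i in range(n_src)) for l in range(n_links)]
        # Link law: price rises when demand exceeds capacity
        p = [max(p[l] + step * (y[l] - c[l]), floor) for l in range(n_links)]
    return x, p


# Two links in series; source 0 crosses both, sources 1 and 2 use one link each.
R = [[1, 1, 0],
     [1, 0, 1]]
rates, prices = dual_gradient(R, c=[1.0, 1.0], weights=[1.0, 1.0, 1.0])
print(rates, prices)
```

For this example the iteration approaches the proportionally fair allocation (1/3, 2/3, 2/3) with both link prices near 1.5, and both links are fully used while carrying positive prices, which is the complementary slackness condition in the theorem.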
Equilibrium of Vegas
Network
  Link queueing delays: $p_l$
  Queue length: $c_l p_l$
Sources
  Throughput: $x_i$
  E2E queueing delay: $q_i$
  Packets buffered: $x_i q_i = \alpha_i d_i$
  Utility function: $U_i(x) = \alpha_i d_i \log x$
Proportional fairness
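For a single bottleneck the equilibrium relations can be solved in closed form. The sketch below assumes all sources share one fully used link, so every source sees the same queueing delay q; it then applies $x_i q = \alpha_i d_i$ and $\sum_i x_i = c$. The function name and the example numbers are hypothetical.

```python
def vegas_single_link_equilibrium(capacity, alpha, d):
    """Assumes one shared bottleneck: x_i * q = alpha_i * d_i and sum(x_i) = capacity."""
    backlog = [a * di for a, di in zip(alpha, d)]  # packets buffered per source
    queue = sum(backlog)                           # total queue length c * q
    q = queue / capacity                           # common queueing delay
    rates = [b / q for b in backlog]               # x_i = alpha_i * d_i / q
    return rates, q, queue


rates, q, queue = vegas_single_link_equilibrium(
    capacity=6.0, alpha=[2.0, 2.0], d=[10.0, 10.0])
print(rates, q, queue)
```

Rates come out proportional to $\alpha_i d_i$, which is the weighted proportional fairness implied by the log utility.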
Persistent congestion
Vegas exploits buffer process to compute prices (queueing delays)
Persistent congestion due to
  Coupling of buffer & price
  Error in propagation delay estimation
Consequences
  Excessive backlog
  Unfairness to older sources
Theorem (Low, Peterson, Wang ’02)
A relative error of $\epsilon_i$ in propagation delay estimation distorts the utility function to
  $\hat U_i(x_i) = (1+\epsilon_i)\,\alpha_i d_i \log x_i + \epsilon_i d_i x_i$

Validation (L. Wang, Princeton)
Source rates (pkts/ms)
#   src1         src2         src3         src4         src5
1   5.98 (6)
2   2.05 (2)     3.92 (4)
3   0.96 (0.94)  1.46 (1.49)  3.54 (3.57)
4   0.51 (0.50)  0.72 (0.73)  1.34 (1.35)  3.38 (3.39)
5   0.29 (0.29)  0.40 (0.40)  0.68 (0.67)  1.30 (1.30)  3.28 (3.34)

#   queue (pkts)  baseRTT (ms)
1   19.8 (20)     10.18 (10.18)
2   59.0 (60)     13.36 (13.51)
3   127.3 (127)   20.17 (20.28)
4   237.5 (238)   31.50 (31.50)
5   416.3 (416)   49.86 (49.80)

TCP/RED stability
Small effect on queue
  AIMD
  Mice traffic
  Heterogeneity
Big effect on queue
  Stability!

Stable: 20ms delay
[Figure: Window panel: individual window and average window (pkts) vs. time (ms). Queue panel: instantaneous queue (pkts) vs. time (ms).]
Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

Unstable: 200ms delay
[Figure: Window panel: individual window and average window (pkts) vs. time (10ms). Queue panel: instantaneous queue (pkts) vs. time (10ms).]
Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

Other effects on queue
[Figure: instantaneous queue (pkts) vs. time. Panels: 20ms delay, time (ms); 200ms delay, time (10ms); two panels titled "Instantaneous queue (50% noise)" with legend "30% noise", time (sec); avg delay 16ms, time (sec); avg delay 208ms, time (sec).]

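Stability: Reno/RED
Theorem (Low et al, Infocom’02) Reno/RED is stable if
[Equation: stability condition in terms of capacity $c$, number of sources $N$, delay $\tau$ and the RED parameters]
[Figure: feedback loop. TCP sources F1, …, FN; forward routing Rf(s); AQM links G1, …, GL; backward routing Rb'(s); signals x, y, p, q.]
TCP: small $\tau$, small $c$, large $N$
RED: small $\rho$, large delay
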
Stability: scalable control
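[Figure: feedback loop. TCP sources F1, …, FN; forward routing Rf(s); AQM links G1, …, GL; backward routing Rb'(s); signals x, y, p, q.]
AQM: $\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$
TCP: $x_i(t) = \bar x_i\, e^{-\frac{\alpha_i q_i(t)}{M_i \tau_i}}$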
Theorem (Paganini, Doyle, Low, CDC’01) Provided R is full rank, feedback loop is locally stable for arbitrary delay, capacity, load and topology
Stability: Vegas
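TCP: $\dot x_i = \frac{1}{T_i^2(t)}\,\mathrm{sgn}\!\left(1 - \frac{x_i(t)\,q_i(t)}{\alpha_i d_i}\right)$
AQM: $\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$
[Figure: feedback loop. TCP sources F1, …, FN; forward routing Rf(s); AQM links G1, …, GL; backward routing Rb'(s); signals x, y, p, q.]
Theorem (Choe & Low, Infocom’03) Provided R is full rank, feedback loop is locally stable if
[Equation: condition on $\max_i x_i T_i$ involving $M$ and $k$]
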
Stability: Stabilized Vegas
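TCP: [Equation: stabilized Vegas source law for $\dot x_i$, with $\tan^{-1}$ in place of the sign function of the Vegas law]
AQM: $\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$
[Figure: feedback loop. TCP sources F1, …, FN; forward routing Rf(s); AQM links G1, …, GL; backward routing Rb'(s); signals x, y, p, q.]
Theorem (Choe & Low, Infocom’03) Provided R is full rank, feedback loop is locally stable if
[Equation: condition on $\max_i x_i T_i$ involving $a$]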
Application
  Stabilized TCP with current routers
  Queueing delay as congestion measure has right scaling
  Incremental deployment with ECN
Network model
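[Figure: feedback loop. TCP sources F1, …, FN (Reno, Vegas) map path prices q to rates x; routing matrix $R$ maps x to link rates y; AQM links G1, …, GL (DT, RED, …) map y to prices p; $R^T$ maps p to q.]
$x(t+1) = F(R^T p(t),\, x(t))$
$p(t+1) = G(p(t),\, R x(t))$
IP routing: $R_{li} = 1$ if source $i$ uses link $l$
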
Duality model of TCP/AQM
Flow control problem
  $\max_{x \ge 0} \sum_i U_i(x_i)$ subject to $Rx \le c$
TCP/AQM
  Maximize utility with different utility functions
Primal-dual algorithm
  $x(t+1) = F(R^T p(t),\, x(t))$   (Reno, Vegas)
  $p(t+1) = G(p(t),\, R x(t))$   (DT, RED, REM/PI, AVQ)
Theorem (Low 00): $(x^*, p^*)$ primal-dual optimal iff $y_l^* \le c_l$, with equality if $p_l^* > 0$

Motivation
Can TCP/IP maximize utility?
Primal: $\max_{R} \max_{x \ge 0} \sum_i U_i(x_i)$ subject to $Rx \le c$
Dual: $\min_{p \ge 0} \left( \sum_i \max_{x_i \ge 0} \left( U_i(x_i) - x_i \min_{R_i} \sum_l R_{li}\, p_l \right) + \sum_l c_l\, p_l \right)$
The inner minimization over $R_i$ is shortest path routing!

TCP-AQM/IP
Theorem (Wang et al, Infocom’03)
Primal problem is NP-hard
Proof
  Reduce integer partition to primal problem
  Given: integers $\{c_1, \dots, c_n\}$
  Find: set $A$ s.t. $\sum_{i \in A} c_i = \sum_{i \notin A} c_i$
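For readers unfamiliar with the integer partition problem used in the reduction, the sketch below decides small instances with a standard subset-sum dynamic program. It assumes positive integers and illustrates only the partition problem itself, not the reduction to the routing problem; the function name is hypothetical.

```python
def find_partition(c):
    """Return indices of a set A with sum over A equal to sum over the rest,
    or None if no such set exists. Assumes positive integers."""
    total = sum(c)
    if total % 2:
        return None
    target = total // 2
    reach = {0: []}  # reachable sum -> one index set achieving it
    for i, ci in enumerate(c):
        # Snapshot the current sums so each item is used at most once
        for s, idx in list(reach.items()):
            t = s + ci
            if t <= target and t not in reach:
                reach[t] = idx + [i]
    return reach.get(target)


print(find_partition([3, 1, 1, 2, 2, 1]))
print(find_partition([1, 1, 4]))
```

The running time grows with the magnitude of the integers rather than with the length of their encoding, so this pseudo-polynomial method does not conflict with the NP-hardness result.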
TCP-AQM/IP
Achievable utility of TCP/IP?
Stability? Duality gap?
Conclusion: Inevitable tradeoff between
  achievable utility
  routing stability

General network
Conclusion: Inevitable tradeoff between
  achievable utility
  routing stability
[Figure: achievable utility; random graph, 20 nodes, 200 links]