On Horrible TCP Performance over Underwater Links
Abdul Kabbani, Balaji Prabhakar
Stanford University


Page 1

On Horrible TCP Performance over Underwater Links


Abdul Kabbani, Balaji Prabhakar

Stanford University

Page 2

In Defense of TCP


Abdul Kabbani, Balaji Prabhakar

Stanford University

Page 3


Overview

• TCP has done well for the past 20 years
  – It is continuing to do well
• However, some performance problems have been identified
  – TCP needs long buffers to keep links utilized
  – It doesn't perform well in large BWxDelay links: it is oscillatory, and it is sluggish
  – TCP takes too long to process short flows
  – Note: we're not addressing TCP over wireless
• We revisit some of the issues and find
  – Either we have demanded too much
  – Or there are very close relatives of TCP that satisfy our demands

Page 4


Some background

• I've recently become familiar with congestion control in two LAN networks
  – Fibre Channel (for Storage Area Networks)
  – Ethernet (as part of a standardization effort)
• These networks operate under much more severe conditions than the Internet, and yet they function quite well!
• Let's look at them briefly

Page 5


SAN: Fibre Channel

• Fibre Channel: a standardized protocol for SANs; main features
  – Packet switching
  – Typical topology: 3-5 hop networks
  – No packet drops! Buffer-to-buffer credits are used to transport data (see the sketch after this list)
  – No end-to-end congestion control!
• Very wide deployment
  – The dominant technology for host-to-storage array data transfers
• No congestion-related or congestion-collapse problems reported to date!
• Upon investigation, here are some factors helping FC networks
  – Small file sizes (128KB, chopped into 64 2KB pkts, comparable to the buffer size at the switches)
  – Light loads (30-40%)
  – Small topologies
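Buffer-to-buffer credits are the mechanism that lets Fibre Channel avoid drops entirely, so a minimal sketch of credit-based link flow control may help make the idea concrete. The class and method names below are illustrative, not part of the FC standard.

```python
from collections import deque

class CreditLink:
    """Toy model of credit-based (buffer-to-buffer) flow control."""

    def __init__(self, rx_buffers=8):
        self.credits = rx_buffers    # one credit per free receive buffer
        self.rx_queue = deque()

    def send(self, pkt):
        if self.credits == 0:
            return False             # sender must wait: nothing is ever dropped
        self.credits -= 1            # spend a credit on the packet in flight
        self.rx_queue.append(pkt)
        return True

    def receiver_drain(self):
        if self.rx_queue:
            self.rx_queue.popleft()  # receiver frees a buffer...
            self.credits += 1        # ...and returns the credit to the sender
```

Because transmission is gated on credits, the receive buffer can never overflow, which is why FC gets away with no end-to-end congestion control loop.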

Page 6

Data Center Ethernet

• Involved in developing a congestion management algorithm in the IEEE 802.1 Data Center Bridging standards activity, for Data Center Ethernet
  – With Berk Atikoglu, Abdul Kabbani, Rong Pan and Mick Seaman
• Ethernet vs Internet (not an exhaustive list)
  1. There is no end-to-end signaling in the Ethernet a la per-packet acks in the Internet
     • So congestion must be signaled to the source by switches
     • Not possible to know the round-trip time!
     • The algorithm is not automatically self-clocked (like TCP)
  2. Links can be paused; i.e. packets may not be dropped
  3. No sequence numbering of L2 packets
  4. Sources do not start transmission gently (like TCP slow-start); they can potentially come on at the full line rate of 10Gbps
  5. Ethernet switch buffers are much smaller than router buffers (100s of KBs vs 100s of MBs)
  6. Most importantly, the algorithm should be simple enough to be implemented completely in hardware

Page 7

Summary

• We see a "hierarchy of harsh operating environments"
  – Low BWxDelay Internet, high BWxDelay Internet, Ethernet, Wireless, etc.
• Based on experience with Ethernet and Fibre Channel, I'm convinced that
  – TCP is operating in an environment where sufficient information and flexibility exist to obtain good performance, with minimal changes
• In the next part, we will illustrate the above claim
  – Since we're not allowed to mention any schemes out there, we thought we'd invent a new one!
• We will consider high BWxDelay networks
  – With short buffers (10-20% of BWxDelay)
  – Small number of sources (e.g. 1 source)
  – Assume: ECN marking
  – Show that utilization can be as high as 100%
  – Show that short flows need not suffer large transfer times

Page 8


Single Link Topology

• Recall
  – A single TCP source needs a BWxDelay amount of buffering to run at the line rate
  – With shorter buffers, TCP loses throughput dramatically
  – This hurts TCP in very large BWxDelay networks
• We consider a single long link to begin with
  – BWxDelay = 1000 pkts
  – Buffer used = 200 pkts
  – Marking probability: per the figure below (a code sketch follows)

[Figure: a single link from A to B at 400Mbps with a 30msec RTT; the marking/sampling probability rises from 1% at 25% queue occupancy to 100% at 100% occupancy]
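To pin down the marking rule, here is one reading of the figure as code. The slide gives only the endpoints (1% marking at 25% occupancy, 100% at a full buffer), so the linear ramp between them is our assumption.

```python
def mark_probability(queue_len, buffer_size=200):
    """ECN marking probability as a function of queue occupancy.

    Assumption: no marking below 25% occupancy; from 25% to 100%
    occupancy the probability ramps linearly from 1% to 100%.
    """
    frac = queue_len / buffer_size
    if frac < 0.25:
        return 0.0
    if frac >= 1.0:
        return 1.0
    # linear interpolation between (0.25, 0.01) and (1.0, 1.0)
    return 0.01 + (frac - 0.25) * (1.0 - 0.01) / (1.0 - 0.25)
```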

Page 9

TCP throughput with short buffers

[Figure: TCP throughput over time; utilization settles at 78%]

Page 10

TCP throughput with short buffers: varying number of sources

Page 11

Improvement: Take One

• Clearly, cutting the window by a factor of 2 is harmful
  – The source takes a long time to build its window back up
• So, let's consider a "multibit TCP" which allows us to cut the window by smaller factors
• cwnd <-- cwnd (1 - ECN/2^(n+1))
  – E.g. with 6-bit TCP, the smallest cut is by a factor of 127/128 (a code sketch follows the figure below)

[Figure: left, the earlier marking curve (sampling probability from 1% at 25% queue occupancy to 100% at full occupancy); right, the multibit version, in which the mark value carried rises from 1 to 2^n - 1 over the same occupancy range]
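As a quick check on the arithmetic, the multibit cut as code; the function name is ours, and the mark range 1 to 2^n - 1 follows the figure above.

```python
def multibit_decrease(cwnd, mark, n_bits=6):
    """Multibit TCP cut: cwnd <- cwnd * (1 - mark / 2^(n+1)).

    mark ranges over 1 .. 2^n - 1, with a larger mark signaling more
    congestion. With n_bits = 6 the mildest cut (mark = 1) is by
    127/128, and the deepest (mark = 63) is 65/128, i.e. close to
    TCP's classic halving.
    """
    assert 1 <= mark < 2 ** n_bits
    return cwnd * (1 - mark / 2 ** (n_bits + 1))
```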

Page 12

Single-link: Window Size

Page 13

Single-link: Queue Occupancy

[Figure: queue occupancy over time; left panel: 1 source, right panel: 2 sources]

Page 14


Multiple Link Topology

• Want to see if improvements persist as we go to larger networks

• Parking lot topology

[Figure: parking-lot topology; nodes 0, 1, 2, 3 connected in a line by 400Mbps links with 30msec RTT, carrying single-hop flows R1, R2, R3 and multi-hop flows R4, R5]

Page 15

Parking Lot Utilization: Single-hop Flows

[Figure: the parking-lot topology again, with utilization over time for the single-hop flows R1, R2, and R3]

Page 16

Parking Lot Utilization: Multi-hop Flows

[Figure: utilization over time for the multi-hop flows; R5 is the two-hop flow, R4 the three-hop flow]

Page 17

Summary

• In both the single-link and the multiple-link topologies
  – Ensuring that TCP doesn't always cut its window by 2 is a good idea
• Can we have this happen without using multiple bits?

Page 18

Adaptive 1-bit TCP Source

• We came up with this over the weekend, so it really is part of this talk and not “our favorite algorithm”

• Source maintains an “average congestion seen” value AVE

• Updating AVE: simple exponential averaging
  – AVE <-- (AVE + ECN)/2
  – Note: AVE is between 0 and 1
• Using AVE (sketched in code below)
  – cwnd <-- cwnd / (1 + AVE)
  – The decrease factor is between 1 and 2
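Putting the two rules together, a minimal sketch of the source logic. The slide doesn't pin down how often each rule fires, so the per-ACK timing below (average on every ACK, cut only on marked ACKs) is our assumption.

```python
class AdaptiveOneBitSource:
    """Sketch of the adaptive 1-bit source: a stream of 1-bit ECN
    feedback is averaged into AVE, and AVE sets the depth of each cut."""

    def __init__(self):
        self.ave = 0.0    # "average congestion seen", always in [0, 1]

    def on_ack(self, ecn_bit, cwnd):
        # AVE <-- (AVE + ECN)/2: simple exponential averaging, gain 1/2
        self.ave = (self.ave + ecn_bit) / 2.0
        if ecn_bit:
            # cwnd <-- cwnd / (1 + AVE): decrease factor between 1 and 2,
            # so persistent congestion approaches TCP's halving while an
            # isolated mark costs only a small cut
            cwnd = cwnd / (1.0 + self.ave)
        return cwnd
```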

Page 19

Single-link: Window Size

[Figure: window size over time; utilization reaches 100%]

Page 20

Single-link: Queue Occupancy

[Figure: queue occupancy over time; left panel: 1 source, right panel: 2 sources]

Page 21

Single-link: Utilization

Page 22


Improving the transfer time for short flows

• Ran out of time for this one

• But if we just have a starting window size of 10 pkts as opposed to 1 pkt, most short flows will complete within 1 RTT of the slow-start phase (a quick check below)
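A back-of-the-envelope check of that claim, under an idealized loss-free slow start in which the window doubles every RTT (the helper and the model are ours):

```python
def slow_start_rtts(flow_pkts, init_window):
    """RTTs to deliver a flow under idealized slow start: the window
    doubles each RTT and no packets are lost or marked."""
    sent, window, rtts = 0, init_window, 0
    while sent < flow_pkts:
        sent += window   # packets delivered this RTT
        window *= 2      # slow-start doubling
        rtts += 1
    return rtts

# A 10-pkt flow takes 4 RTTs starting from 1 pkt (1+2+4+8 >= 10),
# but only 1 RTT starting from 10 pkts.
assert slow_start_rtts(10, 1) == 4
assert slow_start_rtts(10, 10) == 1
```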

Page 23


Conclusion

• The wide-area Internet is quite a friendly environment compared to Ethernet, Fibre Channel and, certainly, Wireless
• Simple fixes exist (and are well known) for high BWxDelay networks
  – The relationship with buffer size is useful to understand
  – But short buffers are quite adequate
  – A fake watermark on the buffer (e.g. at 50% of buffer size) helps reduce packet drops drastically (see the sketch below)
• Using a fake watermark on router buffers enables smaller buffers and reduces bursty drops
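A minimal sketch of the fake-watermark idea at the enqueue point. Whether marking above the watermark is deterministic or probabilistic isn't specified; the deterministic version below is our simplification, and pkt is assumed to be any object with a settable ecn flag.

```python
def enqueue(queue, pkt, buffer_size=200, watermark=0.5):
    """Fake watermark: ECN-mark once the queue crosses
    watermark * buffer_size, so sources back off early while the
    remaining real buffer space absorbs bursts instead of dropping."""
    if len(queue) >= buffer_size:
        return False               # true overflow: the packet is dropped
    if len(queue) >= watermark * buffer_size:
        pkt.ecn = True             # buffer "looks full" here: mark instead
    queue.append(pkt)
    return True
```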

Page 24


Long live TCP! (with facelifts)