UDT

udt.sourceforge.net

BREAKING THE DATA TRANSFER BOTTLENECK

Yunhong GU

gu@lac.uic.edu

National Center for Data Mining

University of Illinois at Chicago

October 10, 2005

Updated on August 8, 2009

UDT: A High Performance Data Transport Protocol

Outline

INTRODUCTION

PROTOCOL DESIGN & IMPLEMENTATION

CONGESTION CONTROL

PERFORMANCE EVALUATION

COMPOSABLE UDT

CONCLUSIONS

>> INTRODUCTION

Motivations

The widespread use of high-speed networks (1 Gb/s, 10 Gb/s, etc.) has enabled many new distributed data-intensive applications
  - Inexpensive fibers and advanced optical networking technologies (e.g., DWDM, Dense Wavelength Division Multiplexing)
  - 10 Gb/s is common in high-speed network testbeds; 40 Gb/s is emerging

Large volumetric datasets
  - Satellite weather data
  - Astronomy observation
  - Network monitoring

The Internet transport protocol (TCP) does NOT scale well as the network bandwidth-delay product (BDP) increases

A new transport protocol is needed!

Data Transport Protocol

Functionalities
  - Streaming, messaging
  - Reliability
  - Timeliness
  - Unicast vs. multicast

Congestion control
  - Efficiency
  - Fairness
  - Convergence
  - Distributedness

[Figure: protocol stack, from top to bottom: Applications, Transport Layer, Network Layer, Data Link Layer, Physical Layer]

TCP

Reliable, data streaming, unicast

Congestion control
  - Increase the congestion window size (cwnd) by one full-sized packet per RTT
  - Halve the cwnd per loss event

Poor efficiency in high bandwidth-delay product networks

Bias against flows with larger RTT

[Figure label: ½ × Bandwidth × RTT]
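The efficiency problem can be made concrete with a rough calculation. The sketch below assumes the textbook model stated on this slide (cwnd grows by one packet per RTT and is halved on a loss) and a 1500-byte packet; the function name and the example link parameters are illustrative, not taken from the slides.

```python
def tcp_recovery_time(bandwidth_bps, rtt_s, packet_bytes=1500):
    """Seconds for AIMD TCP to climb back to full link rate after one loss."""
    # Window needed to fill the pipe: the bandwidth-delay product, in packets.
    bdp_packets = bandwidth_bps * rtt_s / (8 * packet_bytes)
    # After a loss cwnd is halved, then grows by one packet per RTT.
    rtts_needed = bdp_packets / 2
    return rtts_needed * rtt_s


for gbps, rtt_ms in [(1, 10), (1, 100), (10, 100)]:
    seconds = tcp_recovery_time(gbps * 1e9, rtt_ms / 1000)
    print(f"{gbps} Gb/s, {rtt_ms} ms RTT: about {seconds:.0f} s to recover")
```

Under these assumptions the recovery time grows with bandwidth × RTT², which is why a single loss on a long, fast path can cost minutes of reduced throughput.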

TCP

[Figure: TCP throughput (Mb/s, axis 200 to 1000) versus round trip time (1, 10, 100, 200, 400 ms; labeled LAN, US, US-EU, US-ASIA) and packet loss (0.01%, 0.05%, 0.1%, 0.5%)]

Related Work

TCP variants
  - HighSpeed, Scalable, BiC, FAST, H-TCP, L-TCP

Parallel TCP
  - PSockets, GridFTP

Rate-based reliable UDP
  - RBUDP, Tsunami, FOBS, FRTP (based on SABUL), Hurricane (based on UDT)

XCP

SABUL

Problems of Existing Work

Hard to deploy
  - TCP variants and XCP
  - Need modifications in OS kernel and/or routers

Cannot be used in shared networks
  - Most reliable UDP-based protocols

Poor fairness
  - Intra-protocol fairness
  - RTT fairness

Manual parameter tuning

A New Protocol

[Figure: throughput (Mb/s, axis 200 to 1000) versus round trip time (1, 10, 100, 200, 400 ms; labeled LAN, US, US-EU, US-ASIA) and packet loss (0.01%, 0.05%, 0.1%, 0.5%)]

UDT (UDP-based Data Transfer Protocol)

Application level, UDP-based

Similar functionalities to TCP
  - Connection-oriented, reliable, duplex, unicast data streaming

New protocol design and implementation

New congestion control algorithm

Configurable congestion control framework

Objective & Non-objective

Objective
  - For distributed data-intensive applications in high-speed networks
  - A small number of flows share the abundant bandwidth
  - Efficient, fair, and friendly
  - Configurable
  - Easily deployable and usable

Non-objective
  - Replace TCP on the Internet

UDT Project

Open source (udt.sourceforge.net)

Design and implement the UDT protocol

Design the UDT congestion control algorithm

Experimentally evaluate the performance of UDT

Design and implement a configurable protocol framework based on UDT (Composable UDT)

>> PROTOCOL DESIGN & IMPLEMENTATION

UDT Overview

Two orthogonal elements
  - The UDT protocol
  - The UDT congestion control algorithm

Protocol design & implementation
  - Functionality
  - Efficiency

Congestion control algorithm
  - Efficiency, fairness, friendliness, and stability

UDT Overview

[Figure: applications use TCP and UDP through the socket API; UDT runs on top of UDP, and applications use UDT through the UDT socket API]

Functionality

Reliability
  - Packet-based sequencing
  - Acknowledgment and loss report from receiver
  - ACK sub-sequencing
  - Retransmission (based on loss report and timeout)

Streaming and Messaging
  - Buffer/memory management

Connection maintenance
  - Handshake, keep-alive message, teardown message

Duplex
  - Each UDT instance contains both a sender and a receiver
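As a rough illustration of packet-based sequencing with receiver-side loss reports, the sketch below tracks gaps in the arriving sequence numbers. It is a simplified model, not the UDT implementation: the class and method names are hypothetical, sequence numbers are plain integers, and wraparound is ignored.

```python
class ReceiverLossList:
    """Tracks sequence numbers that were skipped and have not arrived yet."""

    def __init__(self, first_seq=0):
        self.next_expected = first_seq
        self.missing = set()

    def on_data(self, seq):
        """Process one arriving packet; return newly detected losses (the NAK content)."""
        if seq >= self.next_expected:
            # Every number skipped between the expected one and this one is lost.
            newly_lost = list(range(self.next_expected, seq))
            self.missing.update(newly_lost)
            self.next_expected = seq + 1
            return newly_lost
        # A retransmitted or reordered packet fills an earlier gap.
        self.missing.discard(seq)
        return []

    def ack_number(self):
        """First sequence number not yet received in order (everything before it arrived)."""
        return min(self.missing) if self.missing else self.next_expected


rx = ReceiverLossList()
reports = [rx.on_data(seq) for seq in [0, 1, 4, 5, 2]]
print(reports, rx.missing, rx.ack_number())
```

The sender would retransmit whatever a report lists; a timeout covers the case where the report itself is lost.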

Protocol Architecture

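[Figure: hosts A and B each contain a sender and a receiver, connected over a UDP channel. Data packets carry a sequence number (Seq. No), a timestamp (TS), and the payload. The receiver returns ACK packets carrying a sequence number and NAK packets carrying a loss list.]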

Software Architecture

[Figure: software architecture with the following components: API, UDP channel, sender, receiver, sender's buffer, receiver's buffer, sender's loss list, receiver's loss list, congestion control (CC), listener]

Efficiency Consideration

Fewer packets
  - Timer-based acknowledging

Less CPU time
  - Reduce per-packet processing time
  - Reduce memory copy
  - Reduce loss list processing time
  - Light ACK vs. regular ACK

Parallel processing
  - Threading architecture

Less bursty processing
  - Evenly distribute the processing time

Application Programming Interface (API)

Socket API

New API
  - sendfile/recvfile: efficient file transfer
  - sendmsg/recvmsg: messaging with partial reliability
  - selectEx: a more efficient version of “select”

Rendezvous Connect
  - Firewall traversal

>> CONGESTION CONTROL

Overview

Congestion control vs. flow control
  - Congestion control: effectively utilize the network bandwidth
  - Flow control: prevent the receiver from being overwhelmed by incoming packets

Window-based vs. rate-based
  - Window-based: tune the maximum number of in-flight packets (TCP)
  - Rate-based: tune the inter-packet sending time (UDT)

AIMD: additive increase, multiplicative decrease

Feedback
  - Packet loss (most TCP variants, UDT)
  - Delay (Vegas, FAST)
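The two control styles tune different knobs for the same target rate. The following sketch shows the conversion in each direction; the helper names and the 1500-byte packet size are assumptions made for illustration.

```python
def window_for_rate(rate_bps, rtt_s, packet_bytes=1500):
    """Window-based control: packets that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / (8 * packet_bytes)


def send_interval_for_rate(rate_bps, packet_bytes=1500):
    """Rate-based control: seconds between consecutive packet departures."""
    return 8 * packet_bytes / rate_bps


# Example: a 1 Gb/s target over a 100 ms path.
print(window_for_rate(1e9, 0.1))      # in-flight packets a window-based sender needs
print(send_interval_for_rate(1e9))    # spacing a rate-based sender applies, in seconds
```

Note that the window depends on the RTT while the sending interval does not, which is one reason a rate-based scheme can be made less sensitive to path length.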

AIMD with Decreasing Increases

AIMD
  - $x = x + \alpha(x)$, for every constant interval (e.g., RTT)
  - $x = (1 - \beta)\,x$, when there is a packet loss event
  - where $x$ is the packet sending rate

TCP
  - $\alpha(x) \equiv 1$, and the increase interval is RTT
  - $\beta = 0.5$

AIMD with Decreasing Increase
  - $\alpha(x)$ is non-increasing, and $\lim_{x \to +\infty} \alpha(x) = 0$
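The update rules above can be exercised with a toy simulation. This is only a sketch under strong assumptions: a single flow, a loss exactly when the rate exceeds the capacity, and a made-up decreasing increase function α(x) = 10 / (1 + x/100) with β = 1/9; none of these values are the ones UDT uses.

```python
def simulate_aimd(alpha, beta, capacity, steps, x0=1.0):
    """Apply x = x + alpha(x) each interval, and x = (1 - beta) x on a loss event."""
    x, trace = x0, []
    for _ in range(steps):
        x += alpha(x)
        if x > capacity:              # toy loss model: loss iff rate exceeds capacity
            x *= (1 - beta)
        trace.append(x)
    return trace


def mean_utilization(trace, capacity, skip=1000):
    """Average rate / capacity after discarding the first `skip` warm-up intervals."""
    tail = trace[skip:]
    return sum(tail) / len(tail) / capacity


# TCP-like AIMD: constant increase of 1, decrease factor 0.5.
tcp_like = simulate_aimd(lambda x: 1.0, 0.5, capacity=1000, steps=5000)
# Hypothetical decreasing increase: non-increasing in x and tending to 0.
daimd = simulate_aimd(lambda x: 10 / (1 + x / 100), 1 / 9, capacity=1000, steps=5000)

print(mean_utilization(tcp_like, 1000), mean_utilization(daimd, 1000))
```

In this toy setting the second flow ramps up faster at low rates and backs off less at each loss, so it spends more time near the capacity; both the increase function and the decrease factor contribute to that difference.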

AIMD with Decreasing Increases

[Figure: $\alpha(x)$ versus $x$ for AIMD (TCP NewReno), UDT, HighSpeed TCP, and Scalable TCP]

UDT Control Algorithm

Increase
  - $\alpha(x) = f(B - x) \cdot c$, where $B$ is the link capacity (bandwidth) and $c$ is a constant parameter
  - $\alpha(x) = 10^{\lceil \log_{10}(B - x) \rceil} \times c$
  - Constant rate control interval (SYN), independent of RTT
  - SYN = 0.01 seconds

Decrease
  - Randomized decrease factor: $\beta = 1 - (8/9)^n$

[Figure: $\alpha(x)$ versus $x$]

The Increase Formula: an Example

Bandwidth (B) = 10 Gbps, Packet size = 1500 bytes

x (Mbps)          B - x (Mbps)      Increment (pkts/SYN)
[0, 9000)         (1000, 10000]     10
[9000, 9900)      (100, 1000]       1
[9900, 9990)      (10, 100]         0.1
[9990, 9999)      (1, 10]           0.01
[9999, 9999.9)    (0.1, 1]          0.001
9999.9+           <0.1              0.00067
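The table can be reproduced with a few lines of code. In this sketch the spare bandwidth B - x is expressed in bits per second; the scaling constant (1.5e-6 divided by the packet size) and the floor of one packet-size-th of a packet are assumptions chosen so that the output matches the table, and the function name is hypothetical.

```python
import math


def udt_increment(capacity_bps, rate_bps, packet_bytes=1500, c=1.5e-6):
    """Packets added to the sending rate per SYN interval, given spare bandwidth B - x."""
    min_inc = 1.0 / packet_bytes                  # assumed floor (about 0.00067 for 1500 bytes)
    spare = capacity_bps - rate_bps
    if spare <= 0:
        return min_inc
    # Round the spare bandwidth up to the next power of ten, then scale to packets.
    inc = 10 ** math.ceil(math.log10(spare)) * c / packet_bytes
    return max(inc, min_inc)


B = 10e9  # 10 Gbps link, as in the example above
for x_mbps in [5000, 9500, 9950, 9995, 9999.5, 9999.95]:
    print(x_mbps, udt_increment(B, x_mbps * 1e6))
```

Each time the spare bandwidth drops by a factor of ten, so does the increment, which is what makes the increase "decreasing" as the flow approaches the link capacity.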

Dealing with Packet Loss

Loss synchronization
  - Randomization method

Non-congestion loss
  - Do not decrease sending rate for the first packet loss

Packet reordering

[Figure panels: M=5, N=2; M=8, N=3]

Bandwidth Estimation

Packet Pair

Filters
  - Cross traffic
  - Interrupt coalescence
  - Robust to estimation errors

Randomized interval to send packet pair

[Figure: packet pairs (P1, P2) in transit; Packet Size / Space = Bottleneck Bandwidth]
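The packet-pair relation can be sketched as follows. The slide only says that filters are applied; the median used here is an assumption, picked as one simple way to discard gaps distorted by cross traffic or interrupt coalescence, and the function name and sample values are illustrative.

```python
from statistics import median


def estimate_bandwidth(pair_gaps_s, packet_bytes=1500):
    """Bottleneck bandwidth (bits/s) from arrival gaps between the two packets of each pair."""
    # Median filter (an assumption): ignores a minority of stretched or compressed gaps.
    gap = median(pair_gaps_s)
    return 8 * packet_bytes / gap


# Gaps in seconds: mostly 12 microseconds, plus one stretched and one compressed sample.
gaps = [12e-6, 12e-6, 40e-6, 12e-6, 1e-6, 12e-6, 13e-6]
print(estimate_bandwidth(gaps))   # close to 1e9 bits/s
```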

>> PERFORMANCE EVALUATION

Performance Characteristics

Efficiency
  - Higher bandwidth utilization, less CPU usage

Intra-protocol fairness
  - Max-min fairness
  - Jain's fairness index

TCP friendliness
  - Bulk TCP flow vs. bulk UDT flow
  - Short-lived TCP flow (slow start phase) vs. bulk UDT flow

Stability (oscillations)
  - Stability index (standard deviation)
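Two of these metrics are easy to compute from throughput measurements. The sketch below uses the standard definition of Jain's fairness index; the stability index is written as the per-flow sample standard deviation divided by the mean, averaged over flows, which is an assumption about the normalization since the slide only says "standard deviation". The four-flow example values are the ones reported in the results table later in this deck.

```python
from statistics import mean, stdev


def jain_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2); 1.0 when all flows are equal."""
    n = len(throughputs)
    return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))


def stability_index(samples_per_flow):
    """Average over flows of (standard deviation / mean) of each flow's throughput samples."""
    return mean(stdev(samples) / mean(samples) for samples in samples_per_flow)


# Four concurrent flows at 215, 216, 202, and 197 Mb/s.
print(round(jain_index([215, 216, 202, 197]), 3))
```

A lower stability index means smaller oscillations; a fairness index closer to 1 means a more even split.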

Evaluation Strategies

Simulations vs. experiments
  - NS2 network simulator, NCDM teraflow testbed

Setup
  - Network topology, bandwidth, distance, queuing, link error rate, etc.
  - Concurrency (number of parallel flows)

Comparison (against TCP)

Real world applications
  - SDSS data transfer, high performance mining of streaming data, etc.

Independent evaluation
  - SLAC, JGN2, UvA, Unipmn (Italy), etc.

Efficiency, Fairness, & Stability

[Figure: throughput of Flows 1 to 4 over time (0 to 700 s) between StarLight, Chicago (206.220.241.13 to 206.220.241.16) and SARA, Amsterdam (145.146.98.78 to 145.146.98.81); 1 Gb/s bandwidth, 106 ms RTT]

Efficiency, Fairness, & Stability

[Figure: throughput (Mbits/s, 0 to 1000) of each flow versus time (0 to 700 s)]

Per-flow throughput and aggregate efficiency in Mb/s:

Time (s)     0-100    100-200  200-300  300-400  400-500  500-600  600-700
Flow 1       902      466      313      215      301      452      885
Flow 2                446      308      216      310      452
Flow 3                         302      202      307
Flow 4                                  197
Efficiency   902      912      923      830      918      904      885
Fairness     1        0.999    0.999    0.998    0.999    1        1
Stability    0.11     0.11     0.08     0.16     0.04     0.02     0.04

TCP Friendliness

[Figure: TCP throughput (Mb/s, axis 20 to 80) versus number of UDT flows (0 to 10)]

500 1 MB TCP flows vs. 0 to 10 bulk UDT flows; 1 Gb/s between Chicago and Amsterdam

>> COMPOSABLE UDT

Composable UDT - Objectives

Easy implementation and deployment of new control algorithms

Easy evaluation of new control algorithms

Application awareness support and dynamic configuration

Composable UDT - Methodologies

Packet sending control
  - Window-based, rate-based, and hybrid

Control event handling
  - onACK, onLoss, onTimeout, onPktSent, onPktRecved, etc.

Protocol parameters access
  - RTT, loss rate, RTO, etc.

Packet extension
  - User-defined control packets
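The event-handler idea can be sketched in a few lines. The classes below are a hypothetical Python analogue written for illustration, not the UDT library's API: the handler names mirror the events listed above, and the subclass is a deliberately simplified TCP-style controller (congestion avoidance only, no slow start).

```python
class CongestionControl:
    """Hypothetical base class: the transport calls these hooks on protocol events."""

    def __init__(self):
        self.cwnd = 2.0           # window-based control: max packets in flight
        self.send_period = 0.0    # rate-based control: seconds between packets (0 = unpaced)

    def on_ack(self, acked_packets):
        pass

    def on_loss(self, lost_packets):
        pass

    def on_timeout(self):
        pass


class RenoLike(CongestionControl):
    """Simplified TCP-style AIMD built only by overriding event handlers."""

    def on_ack(self, acked_packets):
        self.cwnd += acked_packets / self.cwnd    # roughly one packet per RTT

    def on_loss(self, lost_packets):
        self.cwnd = max(self.cwnd / 2, 2.0)       # halve on a loss event

    def on_timeout(self):
        self.cwnd = 2.0                           # restart from a small window


cc = RenoLike()
cc.cwnd = 10.0
for _ in range(10):
    cc.on_ack(1)
print(cc.cwnd)
```

A rate-based algorithm would override the same handlers but adjust `send_period` instead of `cwnd`, and a hybrid one would adjust both.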

Composable UDT - Evaluation

Simplicity
  - Can it be easily used?

Expressiveness
  - Can it be used to implement most control protocols?

Similarity
  - Can Composable UDT based implementations reproduce the performance of their native implementations?

Overhead
  - Will the overhead added by Composable UDT be too large?

Simplicity & Expressiveness

Eight event handlers, four protocol control functions, and one performance monitoring function.

Support a large variety of protocols
  - Reliable UDP blast
  - TCP and its variants (both loss and delay based)
  - Group transport protocols

Simplicity & Expressiveness

[Figure: congestion control classes]

CCC: Base Congestion Control Class
CTCP: TCP NewReno
CGTP: Group Transport Protocol
CUDPBlast: Reliable UDP Blast
CFAST: FAST TCP
CVegas: TCP Vegas
CScalable: Scalable TCP
CHS: HighSpeed TCP
CBiC: BiC TCP
CWestwood: TCP Westwood

Figure annotations: 28; 73 / +132-6; 11 / +192-29; 8 / +27-1; 11 / +192-29; 27 / +145-2; 37 / +351-2

Similarity and Overhead

Similarity
  - How closely Composable UDT based implementations can simulate their native implementations

CTCP vs. Linux TCP

Flow#   Throughput      Fairness         Stability
        TCP     CTCP    TCP     CTCP     TCP     CTCP
1       112     122     1       1        0.517   0.415
2       191     208     0.997   0.999    0.476   0.426
4       322     323     0.949   0.999    0.484   0.492
8       378     422     0.971   0.999    0.633   0.550
16      672     642     0.958   0.985    0.502   0.482
32      877     799     0.988   0.997    0.491   0.470
64      921     716     0.994   0.996    0.569   0.529

CPU usage
  - Sender: CTCP uses about 100% more CPU than Linux TCP
  - Receiver: CTCP uses about 20% more CPU than Linux TCP

>> CONCLUSIONS

Contributions

A high performance data transport protocol and associated implementation
  - The UDT protocol
  - Open source UDT library (udt.sourceforge.net)
  - Users include research institutes and industry

An efficient and fair congestion control algorithm
  - DAIMD & the UDT control algorithm
  - Packet loss handling techniques
  - Using bandwidth estimation technique in congestion control

A configurable transport protocol framework
  - Composable UDT

Selected Publications

Papers on the UDT Protocol
  - UDT: UDP-based Data Transfer for High-Speed Wide Area Networks, Yunhong Gu and Robert L. Grossman, Computer Networks (Elsevier), Volume 51, Issue 7, May 2007.
  - Supporting Configurable Congestion Control in Data Transport Services, Yunhong Gu and Robert L. Grossman, SC 2005, Nov 12 - 18, Seattle, WA.
  - Experiences in Design and Implementation of a High Performance Transport Protocol, Yunhong Gu, Xinwei Hong, and Robert L. Grossman, SC 2004, Nov 6 - 12, Pittsburgh, PA.
  - An Analysis of AIMD Algorithms with Decreasing Increases, Yunhong Gu, Xinwei Hong, and Robert L. Grossman, First Workshop on Networks for Grid Applications (Gridnets 2004), Oct. 29, San Jose, CA.

Internet Draft
  - UDT: A Transport Protocol for Data Intensive Applications, Yunhong Gu and Robert L. Grossman, draft-gg-udt-02.txt.

Commercialization

  - Baidu Hi Messenger
  - Maidsafe
  - Movie2Me by Broadcasting Center Europe
  - NiFTy TV
  - Sterling File Accelerator (SFA) by Sterling Commerce
  - Tideworks
  - PowerFolder
  - GridFTP
  - etc.

Support and Services

Online Forum
  - https://sourceforge.net/forum/?group_id=115059
  - Supports both English and Chinese

Consulting service
  - Dedicated service to customers’ projects
  - Provided by UDT developers

Achievements

SC 2002 Bandwidth Challenge “Best Use of Emerging Network Infrastructure” Award

SC 2003 Bandwidth Challenge “Application Foundation” Award

SC 2004 Bandwidth Challenge “Best Replacement for FedEx / UDP Fairness” Award

SC 2006 Bandwidth Challenge Winner

SC 2008 Bandwidth Challenge Winner

Vision

Short-term
  - A practical solution to the distributed data intensive applications in high BDP environments

Long-term
  - Evolve with new technologies (open source & open standard)
  - More functionalities and support for more use scenarios
  - Network research platform (e.g., fast prototyping and evaluation of new control algorithms)

The End

Thank You!

Yunhong Gu, October 10, 2005

Updated on August 8, 2009
