TCP, TCP Congestion Control and Common AQM Schemes: Quick Revision
Shivkumar Kalyanaraman
Rensselaer Polytechnic Institute
[email protected]
http://www.ecse.rpi.edu/Homepages/shivkuma
Based in part upon slides of Prof. Raj Jain (OSU), Srini Seshan (CMU), J. Kurose (U Mass), I. Stoica (UCB)



Overview
• Quick introduction to TCP services
• TCP reliability model, mechanisms
• TCP congestion control model and mechanisms
• TCP versions: Reno, NewReno, SACK, Vegas, etc.
• AQM schemes: common goals, RED, …

Multiplexing / demultiplexing
[Figure: application, transport, and network layers at the sending hosts and the receiver; processes P1 to P4 exchange messages M, each carried in a segment with transport header Ht and network header Hn.]
[Figure: segment format, 32 bits wide: source port #, dest port #, other header fields, application-layer data (payload).]

Checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment (i.e., payload + header)
Note: IP only has a header checksum.
Sender:
• Treat segment contents as sequence of 16-bit integers
• Checksum: 1’s complement of the 1’s complement sum of segment contents
• Sender puts checksum value into UDP checksum field
Receiver:
• Compute checksum of received segment
• Check if computed checksum equals checksum field value:
  • NO - error detected
  • YES - no error detected. But maybe errors nonetheless?
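
The sender and receiver steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming the segment is available as a byte string and is zero-padded to an even length; the function names are hypothetical, and the pseudo-header that real UDP/TCP checksums also cover is left out.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit big-endian words, folding carries back in."""
    if len(data) % 2:
        data += b"\x00"  # assumption: pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def internet_checksum(data: bytes) -> int:
    """Sender side: 1's complement of the 1's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def verify(data: bytes, checksum_field: int) -> bool:
    """Receiver side: recompute and compare with the checksum field."""
    return internet_checksum(data) == checksum_field
```

A flipped bit changes the sum and is detected, but swapping two 16-bit words leaves the sum unchanged, which is one concrete case of "maybe errors nonetheless".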

Introduction to TCP
• Communication abstraction: close equivalent to UNIX file-system interface => programmer productivity!
  • Reliable
  • Ordered
  • Point-to-point
  • Byte-stream
  • Full duplex
  • Flow and congestion controlled

TCP Header
[Figure: TCP header layout, one 32-bit row per line]
Source port | Destination port
Sequence number
Acknowledgement
HdrLen | 0 | Flags | Advertised window
Checksum | Urgent pointer
Options (variable)
Data

Flags: SYN, FIN, RESET, PUSH, URG, ACK
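
As an illustration of the layout, the fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. This is a sketch only: the function name and the returned dictionary keys are hypothetical, options are not parsed, and only the six flags listed on the slide are decoded.

```python
import struct

# Bit positions of the six classic flags in the 16-bit HdrLen/Flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (network byte order)."""
    if len(segment) < 20:
        raise ValueError("segment shorter than the fixed TCP header")
    (src, dst, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # HdrLen counts 32-bit words
        "flags": [name for name, bit in FLAG_BITS.items() if off_flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }
```

Any bytes between offset 20 and `header_len` are options; the data starts at `header_len`.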


Principles of Reliable Data Transfer

Characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Temporal Redundancy Model
Packets
• Sequence numbers
• CRC or checksum
Status reports
• ACKs
• NAKs
• SACKs
• Bitmaps
Retransmissions
• Packets
• FEC information
Timeout

Types of errors and effects
• Forward channel bit-errors (garbled packets)
• Forward channel packet-errors (lost packets)
• Reverse channel bit-errors (garbled status reports)
• Reverse channel packet-errors (lost status reports)
Protocol-induced effects:
• Duplicate packets
• Duplicate status reports
• Out-of-order packets
• Out-of-order status reports
• Out-of-range packets/status reports (in window-based transmissions)

Reliability Mechanisms…
Mechanisms:
• Checksum: detects corruption in pkts & acks
• ACK: “packet correctly received”
• Duplicate ACK: “packet incorrectly received”
• Sequence number: identifies packet or ack
  • 1-bit sequence number used both in forward & reverse channel
• Timeout only at sender
Provides reliable transmission over:
• An error-free channel
• A forward & reverse channel with bit-errors
  • Detects duplicates of packets/acks
  • NAKs eliminated
• Forward & reverse channel w/ packet-errors (loss)

Example: Three-Way Handshake
• TCP connection-establishment: 3-way-handshake necessary and sufficient for unambiguous setup/teardown even under conditions of loss, duplication, and delay

Stop-and-Wait (Handshake) Efficiency
[Figure: timeline of Data and Ack exchanges between sender and receiver, marking t_frame and t_prop.]
$U = \frac{t_{frame}}{2 t_{prop} + t_{frame}} = \frac{1}{2\alpha + 1}$
$\alpha = \frac{t_{prop}}{t_{frame}} = \frac{\text{Distance} / \text{Speed of signal}}{\text{Frame size} / \text{Bit rate}} = \frac{\text{Distance} \times \text{Bit rate}}{\text{Frame size} \times \text{Speed of signal}}$
• Light in vacuum = 300 m/µs
• Light in fiber = 200 m/µs
• Electricity = 250 m/µs
No loss or bit-errors!
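
A small numeric sketch of the stop-and-wait utilization formula, assuming a loss-free link as on the slide. The function names and the example link (1000 km of fiber, 1 Mb/s, 1000-bit frames) are illustrative choices, not values from the slides.

```python
def alpha(distance_m: float, bit_rate_bps: float, frame_bits: float,
          signal_speed_mps: float = 2e8) -> float:
    """Ratio of propagation time to frame transmission time.

    The default signal speed is light in fiber (200 m/us = 2e8 m/s).
    """
    t_prop = distance_m / signal_speed_mps
    t_frame = frame_bits / bit_rate_bps
    return t_prop / t_frame


def stop_and_wait_utilization(a: float) -> float:
    """U = t_frame / (2 t_prop + t_frame) = 1 / (2a + 1)."""
    return 1.0 / (2.0 * a + 1.0)


# Example: t_prop = 5 ms and t_frame = 1 ms give a = 5 and U = 1/11,
# so the link is idle about 91% of the time.
a = alpha(distance_m=1_000_000, bit_rate_bps=1e6, frame_bits=1000)
utilization = stop_and_wait_utilization(a)
```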

“Sliding Window” Protocols
[Figure: timeline of N back-to-back Data frames and the returning Ack, marking t_frame and t_prop.]
$U = \frac{N t_{frame}}{2 t_{prop} + t_{frame}} = \frac{N}{2\alpha + 1}$, where $\alpha = t_{prop} / t_{frame}$
$U = 1$ if $N > 2\alpha + 1$
Note: no loss or bit-errors!
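
The sliding-window utilization can be sketched the same way. The helper names are hypothetical, and the model assumes no loss or bit-errors, as the slide does.

```python
import math


def sliding_window_utilization(n: int, a: float) -> float:
    """U = N / (2a + 1), capped at 1 once the window covers the round trip."""
    return min(1.0, n / (2.0 * a + 1.0))


def min_window_for_full_utilization(a: float) -> int:
    """Smallest integer window N with N >= 2a + 1."""
    return math.ceil(2.0 * a + 1.0)
```

With a = 5, a window of 1 gives U = 1/11 (the stop-and-wait case) and any window of at least 11 frames keeps the link fully utilized.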

Sliding Window: Details
[Figure: sequence-number space at the sender and at the receiver.]
Sender window
• Regions: Sent & Acked | Sent, Not Acked | OK to Send | Not Usable
• Markers: Max ACK received, Next seqnum
Receiver window
• Regions: Received & Acked | Acceptable Packet | Not Usable
• Markers: Next expected, Max acceptable

Window Flow Control: Header
[Figure: TCP headers of the packet sent and the packet received (Source Port, Dest. Port, Sequence Number, Acknowledgment, HL/Flags, Window, D. Checksum, Urgent Pointer, Options) mapped onto the sender's byte stream, which is divided into acknowledged | sent | to be sent | outside window, with the App write position marked.]

Go-Back-N Retransmission
• If you hear of packet loss, retransmit the whole window!
• k-bit seq # in pkt header
  • Allows up to N = 2^k – 1 packets in-flight, unacked
• ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”
  • Sender may receive duplicate ACKs
  • Robust to ack losses on the reverse channel
  • Can pinpoint the first packet lost, but cannot identify blocks of lost packets in window
• One timer for oldest-in-flight pkt


Selective Repeat: Sender, Receiver Windows

Timeout and RTT Estimation
Problem:
• Unlike a physical link, the RTT of a logical link can vary, quite substantially
• How long should timeout be?
  • Too long => underutilization
  • Too short => wasteful retransmissions
Solution: adaptive timeout, based on a good estimate of maximum current value of RTT

How to estimate max RTT?
• RTT = prop + queuing delay
  • Queuing delay highly variable
  • So, different samples of RTTs will give different random values of queuing delay
• Chebyshev’s Theorem:
  • MaxRTT = Avg RTT + k*Deviation
  • Error probability is less than 1/(k**2)
  • Result true for ANY distribution of samples
• In TCP: Timeout = AverageRTT + 4*Deviation
  • Rounded up to timer granularity (50-500 ms)
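
A minimal sketch of an adaptive timeout of the form AverageRTT + 4*Deviation. The smoothing gains (1/8 for the mean, 1/4 for the deviation) and the first-sample initialization follow RFC 6298 and are assumptions here, since the slides do not state them; the class name is hypothetical, and rounding to timer granularity and any minimum RTO are omitted.

```python
class RttEstimator:
    """Running mean and mean deviation of RTT samples (seconds)."""

    def __init__(self, alpha: float = 0.125, beta: float = 0.25, k: float = 4.0):
        self.alpha = alpha  # gain for the smoothed RTT
        self.beta = beta    # gain for the RTT deviation
        self.k = k          # deviation multiplier in the timeout
        self.srtt = None
        self.rttvar = None

    def update(self, sample: float) -> float:
        """Fold in one RTT sample and return the new timeout."""
        if self.srtt is None:
            self.srtt = sample
            self.rttvar = sample / 2.0
        else:
            # The deviation is updated first, using the old smoothed RTT.
            self.rttvar = ((1 - self.beta) * self.rttvar
                           + self.beta * abs(self.srtt - sample))
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        return self.timeout()

    def timeout(self) -> float:
        return self.srtt + self.k * self.rttvar
```

Karn's algorithm would additionally skip samples taken from retransmitted segments; that filtering is left to the caller in this sketch.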

Recap: Stability of a Multiplexed System
Average Input Rate > Average Output Rate => system is unstable!
How to ensure stability?
1. Reserve enough capacity so that demand is less than reserved capacity
2. Dynamically detect overload and adapt either the demand or capacity to resolve overload

Congestion Problem in Packet Switching
[Figure: hosts A, B, C, D, and E connected through packet switches by a 10 Mb/s Ethernet, a 1.5 Mb/s link, and a 45 Mb/s link; statistical multiplexing builds a queue of packets waiting for the output link.]
If capacity is sized to be less than peak demand (statistical muxing!), need to either reserve resources or dynamically detect/adapt to overload for stability

Congestion: A Close-up View
• knee – point after which
  • throughput increases very slowly
  • delay increases fast
• cliff – point after which
  • throughput starts to decrease very fast to zero (congestion collapse)
  • delay approaches infinity
• Note (in an M/M/1 queue): delay = 1/(1 – utilization), measured in units of the mean service time
[Figure: throughput vs. load and delay vs. load, marking the knee, the cliff, packet loss, and congestion collapse.]

Congestion Control vs. Congestion Avoidance
• Congestion control goal: stay left of cliff
• Congestion avoidance goal: stay left of knee
• Right of cliff: congestion collapse
  • Increase in network load results in decrease of useful work done
[Figure: throughput vs. load, marking the knee, the cliff, and congestion collapse.]

End-to-End Congestion Control
1. End-to-end model:
• End-systems estimate the timing and degree of congestion and reduce their demand appropriately
• Intermediate nodes relied upon to send timely and appropriate penalty indications (e.g., packet loss rate) during congestion
• Enhanced routers could send more accurate congestion signals (e.g., explicit congestion notifications, i.e., ECNs)
• Key: trust and complexity reside at end-systems
• Issue: What about misbehaving flows?

Packet Conservation: Self-clocking
[Figure: ack-clocking loop between sender and receiver, with packet spacings Pb and Pr and ack spacings Ar, Ab, and As.]
Implications of ack-clocking:
• More batching of acks => bursty traffic
• Less batching leads to a large fraction of Internet traffic being just acks (overhead)

Additive Increase/Multiplicative Decrease (AIMD) Policy
• Assumption: decrease policy must (at minimum) reverse the load increase over-and-above efficiency line
• Implication: decrease factor should be conservatively set to account for any congestion detection lags etc.
[Figure: phase plot of User 1’s allocation x1 against User 2’s allocation x2, showing the efficiency line, the fairness line, and the trajectory through points x0, x1, x2.]
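
The phase-plot behaviour can be reproduced with a toy two-user simulation. This is a sketch under simplifying assumptions that are not in the slides: both users see the same binary congestion signal at the same time, the increase step is 1, and the decrease factor is 0.5; the function name is hypothetical.

```python
def aimd(x1: float, x2: float, capacity: float, rounds: int,
         increase: float = 1.0, decrease: float = 0.5):
    """Return the list of (x1, x2) allocations over the given rounds."""
    history = [(x1, x2)]
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Above the efficiency line: multiplicative decrease.
            x1, x2 = x1 * decrease, x2 * decrease
        else:
            # Below the efficiency line: additive increase.
            x1, x2 = x1 + increase, x2 + increase
        history.append((x1, x2))
    return history
```

Additive increase keeps the gap between the two users constant while each multiplicative decrease halves it, so the allocations drift toward the fairness line while oscillating around the efficiency line.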

TCP Congestion Control
• Maintains three variables:
  • cwnd – congestion window
  • rcv_win – receiver advertised window
  • ssthresh – threshold size (used to update cwnd)
    • Rough estimate of knee point…
• For sending use: win = min(rcv_win, cwnd)

TCP: Slow Start
• Whenever starting traffic on a new connection, or whenever increasing traffic after congestion was experienced:
  • Set cwnd = 1
  • Each time a segment is acknowledged increment cwnd by one (cwnd++).
• Does Slow Start increment slowly? Not really. In fact, the increase of cwnd is exponential!!
  • Window increases to W in RTT * log2(W)
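
A short sketch that counts round trips under the per-ACK rule above. It assumes an idealized sender: one ACK per segment (no delayed ACKs), a full window sent every round, and no loss; the function name is hypothetical.

```python
def rtts_to_reach(target_window: int) -> int:
    """Round trips slow start needs to grow cwnd from 1 to the target."""
    cwnd = 1
    rtts = 0
    while cwnd < target_window:
        acks_this_round = cwnd       # every segment sent is acknowledged
        for _ in range(acks_this_round):
            cwnd += 1                # cwnd++ per acknowledged segment
        rtts += 1
    return rtts
```

Since each round doubles cwnd, the count equals log2(W) rounded up: 3 round trips to reach 8 segments, 10 to reach 1000.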

Slow Start Example
• The congestion window size grows very rapidly
• TCP slows down the increase of cwnd when cwnd >= ssthresh
[Figure: sender/receiver timeline]
• cwnd = 1: segment 1; ACK for segment 1
• cwnd = 2: segment 2, segment 3; ACK for segments 2 + 3
• cwnd = 4: segment 4, segment 5, segment 6, segment 7; ACK for segments 4+5+6+7
• cwnd = 8

Slow Start Sequence Plot
[Figure: sequence number vs. time; the window doubles every round.]

Congestion Avoidance
• Goal: maintain operating point at the left of the cliff
• How?
  • additive increase: starting from the rough estimate (ssthresh), slowly increase cwnd to probe for additional available bandwidth
  • multiplicative decrease: cut congestion window size aggressively if a loss is detected.
• If cwnd >= ssthresh then each time a segment is acknowledged increment cwnd by 1/cwnd, i.e. (cwnd += 1/cwnd).

Congestion Avoidance Sequence Plot
[Figure: sequence number vs. time; the window grows by 1 every round.]

Slow Start/Congestion Avoidance Eg.
Assume that ssthresh = 8
• Slow start: cwnd = 1, 2, 4, 8
• Congestion avoidance: cwnd = 9, 10
[Figure: cwnd (in segments, axis 0 to 14) vs. roundtrip times (t = 0 to t = 6), with ssthresh marked at 8.]
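
The example can be reproduced with a per-round model: double cwnd while it is below ssthresh, then add one segment per round trip. This is an idealization of the per-ACK rules (no loss, one full window per round), and capping the doubling at ssthresh is an assumption of this sketch; the function name is hypothetical.

```python
def cwnd_trace(ssthresh: int, rounds: int, initial_cwnd: int = 1):
    """cwnd at the start of each round trip, in segments."""
    cwnd = initial_cwnd
    trace = [cwnd]
    for _ in range(rounds - 1):
        if cwnd < ssthresh:
            cwnd = min(2 * cwnd, ssthresh)  # slow start: double per round
        else:
            cwnd += 1                       # congestion avoidance: +1 per round
        trace.append(cwnd)
    return trace


trace = cwnd_trace(ssthresh=8, rounds=6)
```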

Putting Everything Together: TCP Pseudo-code

Initially:
    cwnd = 1;
    ssthresh = infinite;

New ack received:
    if (cwnd < ssthresh)
        /* Slow Start */
        cwnd = cwnd + 1;
    else
        /* Congestion Avoidance */
        cwnd = cwnd + 1/cwnd;

Timeout: (loss detection)
    /* Multiplicative decrease */
    ssthresh = win/2;
    cwnd = 1;

while (next < unack + win)
    transmit next packet;

where win = min(cwnd, flow_win);

[Figure: sequence-number axis (seq #) with unack and next marked; win extends from unack.]
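
The pseudo-code translates almost line for line into a small Python class. This is a sketch for experimenting with the window dynamics only: the class and method names are hypothetical, windows are counted in segments, and no packets are actually sent.

```python
import math


class SimpleTcpSender:
    """Window bookkeeping from the pseudo-code: slow start, congestion
    avoidance, and multiplicative decrease on timeout."""

    def __init__(self, flow_win: float = math.inf):
        self.cwnd = 1.0
        self.ssthresh = math.inf
        self.flow_win = flow_win  # receiver-advertised (flow control) window

    @property
    def win(self) -> float:
        return min(self.cwnd, self.flow_win)

    def on_new_ack(self) -> None:
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance

    def on_timeout(self) -> None:
        self.ssthresh = self.win / 2.0    # multiplicative decrease
        self.cwnd = 1.0

    def can_send(self, next_seq: int, unack: int) -> bool:
        """True while the next sequence number falls inside the window."""
        return next_seq < unack + self.win
```

For example, seven ACKs on a new connection raise cwnd to 8; a timeout then sets ssthresh to 4 and cwnd to 1, and the sender switches to congestion avoidance once cwnd climbs back to 4.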

The big picture
[Figure: cwnd vs. time, showing slow start, congestion avoidance, and timeouts.]

Packet Loss Detection: Timeout Avoidance
• Wait for Retransmission Time Out (RTO)
• What’s the problem with this?
  • RTO is a performance killer
• In BSD TCP implementation, RTO is usually more than 1 second
  • the granularity of RTT estimate is 500 ms
  • retransmission timeout is at least two times the RTT
• Solution: Don’t wait for RTO to expire
  • Use alternate mechanism for loss detection
  • Fall back to RTO only if these alternate mechanisms fail.

Fast Retransmit
• Resend a segment after 3 duplicate ACKs
  • Recall: a duplicate ACK means that an out-of-sequence segment was received
• Notes:
  • duplicate ACKs can also be due to packet reordering!
  • if window is small, don’t get duplicate ACKs!
[Figure: sender/receiver timeline with cwnd = 1 (segment 1), cwnd = 2 (segments 2, 3), and cwnd = 4 (segments 4, 5, 6, 7); the receiver repeats ACK 4, giving 3 duplicate ACKs.]
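
A sketch of the duplicate-ACK counting that triggers fast retransmit, assuming the sender sees a plain sequence of cumulative ACK numbers. The function name is hypothetical, and real stacks apply further checks (for example, that the ACK carries no data and no window update) before counting a duplicate.

```python
DUP_ACK_THRESHOLD = 3


def fast_retransmit_triggers(acks):
    """Return the ACK numbers at which a fast retransmit would fire."""
    triggers = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                triggers.append(ack)  # third duplicate: resend right away
        else:
            last_ack = ack            # new cumulative ACK resets the count
            dup_count = 0
    return triggers
```

A reordered packet typically produces only one or two duplicates, which is why the threshold is 3 rather than 1.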

Fast Recovery (Simplified)
• After a fast-retransmit, set ssthresh to cwnd/2 and cwnd to the new ssthresh
  • i.e., halve cwnd; don’t reset cwnd to 1
• But when RTO expires still do cwnd = 1
• Fast Retransmit and Fast Recovery implemented by TCP Reno; most widely used version of TCP today

Fast Retransmit and Fast Recovery
• Retransmit after 3 duplicated acks
  • prevent expensive timeouts
• No need to slow start again
• At steady state, cwnd oscillates around the optimal window size.
[Figure: cwnd vs. time; slow start followed by a congestion avoidance sawtooth.]

Fast Retransmit
[Figure: sequence number vs. time; a lost packet (X) produces duplicate acks, followed by the retransmission.]

Multiple Losses
[Figure: sequence number vs. time; several packets in one window are lost (X); duplicate acks lead to a retransmission.]
Now what? (TCP Versions)

TCP Versions: Tahoe
[Figure: sequence number vs. time with multiple losses (X) in one window.]
Tahoe: set window to 1, and do slow start! No timeout…

TCP Versions: Reno
[Figure: sequence number vs. time with multiple losses (X) in one window.]
Now what? - timeout
Reno: Recover 1 packet loss ok, but multiple loss => timeout

TCP Reno (Jacobson 1990)
[Figure: window vs. time, showing SS followed by a CA sawtooth.]
• SS: Slow Start
• CA: Congestion Avoidance
• Fast retransmission/fast recovery

NewReno
[Figure: sequence number vs. time with multiple losses (X) in one window.]
Now what? – partial ack recovery
NewReno: Recover multiple losses in successive RTTs using the notion of a “partial ack”. No timeout.

SACK
• Basic problem is that cumulative acks provide only little information
  • Alt: Selective Ack for just the packet received
  • What if selective acks are lost? carry cumulative ack also!
• Implementation: Bitmask of packets received
  • Selective acknowledgement (SACK)
• Only provided as an optimization for retransmission
  • Fall back to cumulative acks to guarantee correctness and window updates

SACK
[Figure: sequence number vs. time with multiple losses (X) in one window.]
Now what? – send retransmissions as soon as detected

TCP Congestion Control Summary
• Sliding window limited by receiver window.
• Dynamic windows: slow start (exponential rise), congestion avoidance (additive rise), multiplicative decrease.
  • Ack clocking
• Adaptive timeout: need mean RTT & deviation
  • Timer backoff and Karn’s algo during retransmission
• Go-back-N or Selective retransmission
• Cumulative and Selective acknowledgements
• Timeout avoidance: Fast Retransmit

Queuing Disciplines
• Each router must implement some queuing discipline
• Queuing allocates bandwidth and buffer space:
  • Bandwidth: which packet to serve next (scheduling)
  • Buffer space: which packet to drop next (buff mgmt)
• Queuing also affects latency
[Figure: traffic sources feed traffic classes A, B, and C; buffer management decides which packets to drop and scheduling decides which to serve.]

Typical Internet Queuing
• FIFO + drop-tail
  • Simplest choice
  • Used widely in the Internet
• FIFO (first-in-first-out)
  • Implies single class of traffic
• Drop-tail
  • Arriving packets get dropped when queue is full regardless of flow or importance
• Important distinction:
  • FIFO: scheduling discipline
  • Drop-tail: drop (buffer management) policy

FIFO + Drop-tail Problems
• FIFO issues: in a FIFO discipline, the service seen by a flow is convoluted with the arrivals of packets from all other flows!
  • No isolation between flows: full burden on e2e control
  • No policing: send more packets, get more service
• Drop-tail issues:
  • Routers are forced to have large queues to maintain high utilizations
  • Larger buffers => larger steady state queues/delays
  • Synchronization: end hosts react to same events because packets tend to be lost in bursts
  • Lock-out: a side effect of burstiness and synchronization is that a few flows can monopolize queue space

Queue Management Ideas
• Synchronization, lock-out:
  • Random drop: drop a randomly chosen packet
  • Drop front: drop packet from head of queue
• High steady-state queuing vs burstiness:
  • Early drop: drop packets before queue full
  • Do not drop packets “too early” because queue may reflect only burstiness and not true overload
• Misbehaving vs fragile flows:
  • Drop packets proportional to queue occupancy of flow
  • Try to protect fragile flows from packet loss (e.g., color them or classify them on the fly)
• Drop packets vs mark packets:
  • Dropping packets interacts w/ reliability mechanisms
  • Mark packets: need to trust end-systems to respond!

Packet Drop Dimensions
• Aggregation: per-connection state, class-based queuing, single class
• Drop position: head, random location, tail
• Early drop vs. overflow drop

Random Early Detection (RED)
[Figure: a queue with Min thresh and Max thresh marked against the average queue length.]
[Figure: P(drop) vs. avg queue length, with minth and maxth on the horizontal axis and maxP and 1.0 on the vertical axis.]

Random Early Detection (RED)
• Maintain running average of queue length
  • Low pass filtering
• If avg Q < minth do nothing
  • Low queuing, send packets through
• If avg Q > maxth, drop packet
  • Protection from misbehaving sources
• Else mark (or drop) packet in a manner proportional to queue length & bias to protect against synchronization
  • $P_b = max_p \cdot (avg - min_{th}) / (max_{th} - min_{th})$
  • Further, bias $P_b$ by history of unmarked packets
  • $P_a = P_b / (1 - count \cdot P_b)$
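
The marking rule can be sketched as two small functions. The formulas for Pb and Pa are the ones on the slide; the averaging weight of 0.002, the function names, and the clamp to 1.0 when the denominator reaches zero are assumptions of this sketch.

```python
def update_average(avg: float, queue_len: float, weight: float = 0.002) -> float:
    """Low-pass filter (exponentially weighted moving average) of queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def red_drop_probability(avg: float, min_th: float, max_th: float,
                         max_p: float, count: int = 0) -> float:
    """Probability of marking/dropping the arriving packet.

    count is the number of packets let through since the last mark.
    """
    if avg < min_th:
        return 0.0                    # low queuing: send packets through
    if avg > max_th:
        return 1.0                    # persistent overload: drop
    pb = max_p * (avg - min_th) / (max_th - min_th)
    denom = 1.0 - count * pb
    if denom <= 0.0:
        return 1.0                    # a mark is overdue
    return min(1.0, pb / denom)
```

With min_th = 10, max_th = 40, and max_p = 0.1, an average queue of 25 packets gives Pb = 0.05, rising to Pa = 0.1 after 10 unmarked packets; spreading marks out this way is what counters synchronization.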

RED Issues
• Issues:
  • Breaks synchronization well
  • Extremely sensitive to parameter settings
  • Wild queue oscillations upon load changes
  • Fails to prevent buffer overflow as #sources increases
  • Does not help fragile flows (e.g., small window flows or retransmitted packets)
  • Does not adequately isolate cooperative flows from non-cooperative flows
• Isolation:
  • Fair queuing achieves isolation using per-flow state
  • RED penalty box: monitor history for packet drops, identify flows that use disproportionate bandwidth

REM (Athuraliya & Low 2000)
Main ideas:
• Decouple congestion & performance measure
• “Price” adjusted to match rate and clear buffer
• Marking probability exponential in “price”
[Figure: link marking probability (0 to 1) vs. link congestion measure (0 to 20) for REM, alongside the RED marking probability vs. avg queue.]

Comparison of AQM Performance
• DropTail: queue = 94%
• RED: min_th = 10 pkts, max_th = 40 pkts, max_p = 0.1
• REM: queue = 1.5 pkts, utilization = 92%; parameter values 0.05, 0.4, 1.15

What is TCP Throughput?
[Figure: sawtooth of window vs. t, oscillating between 2w/3 and 4w/3 with mean w = (4w/3 + 2w/3)/2; the area under one cycle is $2w^2/3$.]
• Each cycle delivers $2w^2/3$ packets
• Assume: each cycle delivers $1/p$ packets $= 2w^2/3$
  • Delivers 1/p packets followed by a drop => loss probability = p/(1+p) ~ p if p is small.
• Hence $w = \sqrt{\frac{3}{2p}}$

$1/\sqrt{p}$ Law
• Equilibrium window size: $w_s = \frac{a}{\sqrt{p}}$
• Equilibrium rate: $x_s = \frac{a}{D_s \sqrt{p}}$, with $D_s$ the round-trip delay of source $s$
• Empirically constant a ~ 1
• Verified extensively through simulations and on Internet
• References
  • T.J. Ott, J.H.B. Kemperman and M. Mathis (1996)
  • M. Mathis, J. Semke, J. Mahdavi, T. Ott (1997)
  • T.V. Lakshman and U. Madhow (1997)
  • J. Padhye, V. Firoiu, D. Towsley, J. Kurose (1998)
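
A numeric sketch of the square-root law. It assumes the constant from the sawtooth model, a = sqrt(3/2) (about 1.22, consistent with "a ~ 1"), and measures the rate in packets per second; the function names are hypothetical.

```python
import math

A_SAWTOOTH = math.sqrt(1.5)  # from 1/p = 2 w^2 / 3, i.e. w = sqrt(3 / (2p))


def equilibrium_window(p: float, a: float = A_SAWTOOTH) -> float:
    """Average window in packets for loss probability p."""
    return a / math.sqrt(p)


def equilibrium_rate(p: float, rtt_s: float, a: float = A_SAWTOOTH) -> float:
    """Average rate in packets per second: window divided by round-trip delay."""
    return equilibrium_window(p, a) / rtt_s
```

For p = 0.01 the window is about 12.2 packets, or roughly 122 packets per second at a 100 ms round-trip delay; quadrupling the loss probability halves both.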