67
1 © From Computer Networking, by Kurose&Ross Transport Layer 3-1 Introduction to Computer Networking Guy Leduc Chapter 3 Transport Layer Computer Networking: A Top Down Approach, 7 th edition. Jim Kurose, Keith Ross Addison-Wesley, April 2016 © From Computer Networking, by Kurose&Ross Transport Layer 3-2 Chapter 3: Transport Layer Our goals: understand principles behind transport layer services: multiplexing/ demultiplexing reliable data transfer flow control congestion control learn about Internet transport layer protocols: UDP: connectionless transport TCP: connection-oriented reliable transport TCP congestion control

Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

  • Upload
    others

  • View
    15

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

1

© From Computer Networking, by Kurose&Ross Transport Layer 3-1

Introduction to Computer Networking

Guy Leduc

Chapter 3 Transport Layer Computer Networking: A

Top Down Approach, 7th edition.

Jim Kurose, Keith RossAddison-Wesley, April

2016

© From Computer Networking, by Kurose&Ross Transport Layer 3-2

Chapter 3: Transport LayerOur goals: ❒  understand principles

behind transport layer services:❍  multiplexing/

demultiplexing❍  reliable data transfer❍  flow control❍  congestion control

❒  learn about Internet transport layer protocols:❍  UDP: connectionless

transport❍  TCP: connection-oriented

reliable transport❍  TCP congestion control

Page 2: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

2

© From Computer Networking, by Kurose&Ross Transport Layer 3-3

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

© From Computer Networking, by Kurose&Ross Transport Layer 3-4

Transport services and protocols❒  provide logical communication

between app processes running on different hosts

❒  transport protocols run in end systems ❍  send side: breaks app

messages into segments, passes to network layer

❍  rcv side: reassembles segments into messages, passes to app layer

❒  more than one transport protocol available to apps❍  Internet: mainly TCP and

UDP

applicationtransportnetworkdata linkphysical

applicationtransportnetworkdata linkphysical

Page 3: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

3

© From Computer Networking, by Kurose&Ross Transport Layer 3-5

Transport vs. network layer

❒  network layer: logical communication between hosts

❒  transport layer: logical communication between processes ❍  relies on, enhances,

network layer services

Analogy:❒  host = institute❒  process = office❒  app messages = letters

in envelopes❒  transport protocol =

janitor who demux incoming letters to offices

❒  network-layer protocol = postal service

© From Computer Networking, by Kurose&Ross Transport Layer 3-6

Internet transport-layer protocols❒  reliable, in-order delivery

(TCP)❍  congestion control ❍  flow control❍  connection setup

❒  unreliable, unordered delivery: UDP❍  no-frills extension of “best-

effort” IP❒  services not available:

❍  delay guarantees❍  bandwidth guarantees

applicationtransportnetworkdata linkphysical

applicationtransportnetworkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

Page 4: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

4

© From Computer Networking, by Kurose&Ross Transport Layer 3-7

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

© From Computer Networking, by Kurose&Ross Transport Layer 3-8

Multiplexing/demultiplexing

process

socket

use header info to deliverreceived segments to correct socket

demultiplexing at receiver:handle data from multiplesockets, add transport header (later used for demultiplexing)

multiplexing at sender:

transport

application

physicallink

network

P2P1

transport

application

physicallink

network

P4

transport

application

physicallink

network

P3

Page 5: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

5

© From Computer Networking, by Kurose&Ross Transport Layer 3-9

How demultiplexing works❒  host receives IP datagrams

❍  each datagram has source and destination IP addresses in its header

❍  each datagram carries one transport-layer segment

❍  each segment has source and destination port numbers in its header

❒  host uses IP addresses (in network header) & port numbers (in transport header) to direct segment to appropriate socket

source port # dest port #

32 bits

applicationdata

(message)

other header fields

Transport (TCP/UDP) segment format(which is the payload of an IP datagram)

© From Computer Networking, by Kurose&Ross Transport Layer 3-10

Connectionless (UDP) demultiplexingrecall: socket created by app

has host-local port #: DatagramSocket mySocket1

= new DatagramSocket(12534);

❒  when UDP receives segment from below:❍  checks destination port # in

segment❍  directs UDP segment to

socket with that port #

recall: when creating datagram to send into UDP socket, app must specify “remote process id”, namely

!  destination IP address!  destination port #

IP datagrams with same dest. port #, but different source IP addresses and/or source port numbers will be directed to same socket at destination

Socket interface

Page 6: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

6

© From Computer Networking, by Kurose&Ross Transport Layer 3-11

Connectionless demux: exampleDatagramSocket serverSocket = new DatagramSocket (6428);

transport

application

physicallink

network

P3transport

application

physicallink

network

P1

transport

application

physicallink

network

P4

DatagramSocket mySocket1 = new DatagramSocket (5775);

DatagramSocket mySocket2 = new DatagramSocket (9157);

source port: 9157 (set by transport)dest port: 6428 (set by P3)

source port: 6428dest port: 9157

source port: 6428dest port: 5775

source port: 5775dest port: 6428

© From Computer Networking, by Kurose&Ross Transport Layer 3-12

Connection-oriented (TCP) demux❒  Recall: Web servers have

different TCP sockets for each connecting client❍  non-persistent HTTP will

even have different sockets for each request

❒  These many simultaneous TCP sockets are all associated with the same server port #:❍  destination port # is not

enough to direct segment to appropriate socket!

❒  TCP socket must be associated with a TCP connection

❒  TCP socket thus identified by a 4-tuple: ❍  source IP address❍  source port number❍  dest IP address❍  dest port number

❒  Demux: receiver TCP uses all four values to direct segment to appropriate socket

Page 7: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

7

© From Computer Networking, by Kurose&Ross Transport Layer 3-13

Connection-oriented demux: example

transport

application

physicallink

network

P1transport

application

physicallink

P4

transport

application

physicallink

network

P2

source IP,port: A,9157dest IP,port: B,80

network

P6P5P3

source IP,port: C,5775dest IP,port: B,80

source IP,port: C,9157dest IP,port: B,80

Three segments destined for IP address B and dest port 80 are demultiplexed

to different sockets thanks to source IP and port #

client: IP address

A

client: IP address C

server: IP address B

© From Computer Networking, by Kurose&Ross Transport Layer 3-14

Connection-oriented demux: example

transport

application

physicallink

network

P1transport

application

physicallink

transport

application

physicallink

network

P2

network

P3

source IP,port: C,5775dest IP,port: B,80

source IP,port: C,9157dest IP,port: B,80

P4

threaded serverclient: IP address A

client: IP address C

server: IP address B

source IP,port: A,9157dest IP,port: B,80

Page 8: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

8

© From Computer Networking, by Kurose&Ross Transport Layer 3-15

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

© From Computer Networking, by Kurose&Ross Transport Layer 3-16

UDP: User Datagram Protocol [RFC 768]

❒  “no frills,” “bare bones” Internet transport protocol

❒  “best effort” service, UDP segments may be:❍  lost❍  delivered out of order to

app❒  connectionless:

❍  no handshaking between UDP sender, receiver

❍  each UDP segment handled independently of others

"  UDP use:o  multimedia apps (loss

tolerant, rate sensitive)o  DNSo  SNMP

"  reliable transfer over UDP: o  add reliability at

application layero  application-specific error

recovery!

Page 9: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

9

© From Computer Networking, by Kurose&Ross Transport Layer 3-17

UDP: segment header

source port # dest port #

32 bits

applicationdata

(payload)

UDP segment format

length checksum

length, in bytes, of UDP segment, including header

❒  no connection establishment (which can add delay)

❒  simple: no connection state at sender, receiver

❒  small header size❒  no congestion control: UDP

can blast away as fast as desired

why is there a UDP?

© From Computer Networking, by Kurose&Ross Transport Layer 3-18

UDP checksum

Sender:❒  treat segment contents,

including header fields, as sequence of 16-bit integers

❒  checksum: addition (one’s complement sum) of segment contents

❒  sender puts checksum value into UDP checksum field

Receiver:❒  compute checksum of received

segment❒  check if computed checksum

equals checksum field value:❍  NO - error detected❍  YES - no error detected. But

maybe errors nonetheless?

More later ….

Goal: detect “errors” (e.g., flipped bits) in transmitted segment

Page 10: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

10

© From Computer Networking, by Kurose&Ross Transport Layer 3-19

Internet checksum: example

example: add two 16-bit integers

1 1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 1 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1

1 1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1

wraparound

sumchecksum

Note: when adding numbers, a carryout from the most significant bit needs to be added to the result

© From Computer Networking, by Kurose&Ross Transport Layer 3-20

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer❍  Alternating bit protocol❍  Pipelined protocols

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

Page 11: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

11

© From Computer Networking, by Kurose&Ross Transport Layer 3-21

Principles of Reliable data transfer❒  important in application, transport, link layers❒  top-10 list of important networking topics!

❒  characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

© From Computer Networking, by Kurose&Ross Transport Layer 3-22

Principles of Reliable data transfer❒  important in application, transport, link layers❒  top-10 list of important networking topics!

❒  characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Page 12: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

12

© From Computer Networking, by Kurose&Ross Transport Layer 3-23

Principles of Reliable data transfer❒  important in application, transport, link layers❒  top-10 list of important networking topics!

❒  characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

© From Computer Networking, by Kurose&Ross Transport Layer 3-24

Reliable data transfer: getting started

sendside

receiveside

rdt_send(): called from above, (e.g., by app.). Passed data to

deliver to receiver upper layer

udt_send(): called by rdt,to transfer packet over

unreliable channel to receiver

rdt_rcv(): called when packet arrives on rcv-side of channel

deliver_data(): called by rdt to deliver data to upper

Page 13: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

13

© From Computer Networking, by Kurose&Ross Transport Layer 3-25

Reliable data transfer: getting startedWe’ll:❒  incrementally develop sender, receiver sides of

reliable data transfer protocol (rdt)❒  consider only unidirectional data transfer

❍  but control info will flow on both directions!❒  use finite state machines (FSM) to specify

sender, receiver

state1

state2

event causing state transitionactions taken on state transition

state: when in this “state” next state uniquely determined by next

eventeventactions

© From Computer Networking, by Kurose&Ross Transport Layer 3-26

Rdt1.0: reliable transfer over a reliable channel

❒  assume: underlying channel perfectly reliable❍  no bit errors❍  no loss of packets

❒  separate FSMs for sender, receiver:❍  sender sends data into underlying channel❍  receiver reads data from underlying channel

Wait for call from

above packet = make_pkt(data)udt_send(packet)

rdt_send(data)data = extract (packet)deliver_data(data)

Wait for call from

below

rdt_rcv(packet)

sender receiver

Page 14: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

14

© From Computer Networking, by Kurose&Ross Transport Layer 3-27

❒  assume: underlying channel may flip bits in packet❍  checksum to detect bit errors

❒  the question: how to recover from errors?

rdt2.0: channel with bit errors

How do humans recover from “errors”during conversation?

© From Computer Networking, by Kurose&Ross Transport Layer 3-28

❍ acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK

❍ negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors

❍  sender retransmits pkt on receipt of NAK

❒  new mechanisms in rdt2.0 (beyond rdt1.0):❍ error detection❍  feedback: control msgs (ACK,NAK)

from receiver to sender

rdt2.0: channel with bit errors

pkt A

pkt B

pkt B

ACK

NAK

ACK OK

Error

OK

❒  assume: underlying channel may flip bits in packet❍  checksum to detect bit errors

❒  the question: how to recover from errors:

Page 15: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

15

© From Computer Networking, by Kurose&Ross Transport Layer 3-29

rdt2.0: FSM specification

Wait for call from

above

sndpkt = make_pkt(data, checksum)udt_send(sndpkt)

data = extract(rcvpkt)deliver_data(data)udt_send(ACK)

rdt_rcv(rcvpkt) && not corrupt(rcvpkt)

rdt_rcv(rcvpkt) && isACK(rcvpkt)

udt_send(sndpkt)

rdt_rcv(rcvpkt) && isNAK(rcvpkt)

udt_send(NAK)

rdt_rcv(rcvpkt) && corrupt(rcvpkt)

Wait for ACK or

NAK

Wait for call from

belowsender

receiverrdt_send(data)

Λ

© From Computer Networking, by Kurose&Ross Transport Layer 3-30

rdt2.0: operation with no errors

Wait for call from

above

snkpkt = make_pkt(data, checksum)udt_send(sndpkt)

extract(rcvpkt,data)deliver_data(data)udt_send(ACK)

rdt_rcv(rcvpkt) && not corrupt(rcvpkt)

rdt_rcv(rcvpkt) && isACK(rcvpkt)

udt_send(sndpkt)

rdt_rcv(rcvpkt) && isNAK(rcvpkt)

udt_send(NAK)

rdt_rcv(rcvpkt) && corrupt(rcvpkt)

Wait for ACK or

NAK

Wait for call from

below

rdt_send(data)

Λ

Page 16: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

16

© From Computer Networking, by Kurose&Ross Transport Layer 3-31

rdt2.0: error scenario

Wait for call from

above

snkpkt = make_pkt(data, checksum)udt_send(sndpkt)

extract(rcvpkt,data)deliver_data(data)udt_send(ACK)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)

rdt_rcv(rcvpkt) && isACK(rcvpkt)

udt_send(sndpkt)

rdt_rcv(rcvpkt) && isNAK(rcvpkt)

udt_send(NAK)

rdt_rcv(rcvpkt) && corrupt(rcvpkt)

Wait for ACK or

NAK

Wait for call from

below

rdt_send(data)

Λ

© From Computer Networking, by Kurose&Ross Transport Layer 3-32

rdt2.0 has a fatal flaw!What happens if ACK/NAK

corrupted?❒  garbled ACK/NAK detected by

checksum too❒  garbled ACK/NAK discarded❒  but sender doesn’t know what

happened at receiver!❒  can’t just retransmit: possible

duplicate

Handling duplicates: ❒  sender retransmits current

pkt if ACK/NAK garbled❒  sender adds sequence

number to each pkt❒  receiver discards (doesn’t

deliver up) duplicate pkt

Sender sends one packet, then waits for receiver response

stop and waitpkt

same pkt

garbled ACK

Duplicate!

OK

Page 17: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

17

© From Computer Networking, by Kurose&Ross Transport Layer 3-33

rdt2.1: sender, handles garbled ACK/NAKs

Wait for call 0 from

above

sndpkt = make_pkt(0, data, checksum)udt_send(sndpkt)

rdt_send(data)

Wait for ACK or NAK 0 udt_send(sndpkt)

rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )

sndpkt = make_pkt(1, data, checksum)udt_send(sndpkt)

rdt_send(data)

rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && isACK(rcvpkt)

udt_send(sndpkt)

rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) ||isNAK(rcvpkt) )

rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && isACK(rcvpkt)

Wait for call 1 from

above

Wait for ACK or NAK 1

ΛΛ

© From Computer Networking, by Kurose&Ross Transport Layer 3-34

rdt2.1: receiver, handles garbled ACK/NAKs

Wait for 0 from below

sndpkt = make_pkt(NAK, chksum)udt_send(sndpkt)

rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq0(rcvpkt)

rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq1(rcvpkt)

data = extract(rcvpkt)deliver_data(data)sndpkt = make_pkt(ACK, chksum)udt_send(sndpkt)

Wait for 1 from below

rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq0(rcvpkt) data = extract(rcvpkt)deliver_data(data)sndpkt = make_pkt(ACK, chksum)udt_send(sndpkt)

rdt_rcv(rcvpkt) && corrupt(rcvpkt)

sndpkt = make_pkt(ACK, chksum)udt_send(sndpkt)

rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq1(rcvpkt)

rdt_rcv(rcvpkt) && corrupt(rcvpkt)

sndpkt = make_pkt(ACK, chksum)udt_send(sndpkt)

sndpkt = make_pkt(NAK, chksum)udt_send(sndpkt)

Note: receiver sends ACK in this case.

Why?

Page 18: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

18

© From Computer Networking, by Kurose&Ross Transport Layer 3-35

rdt2.1: discussion

Sender:❒  seq # added to pkt❒  two seq. #’s (0,1) will

suffice. Why?❒  must check if received

ACK/NAK corrupted ❒  twice as many states

❍  state must “remember” whether “current” pkt has seq. # of 0 or 1

Receiver:❒  must check if received

packet is duplicate❍  state indicates whether 0

or 1 is expected pkt seq #

© From Computer Networking, by Kurose&Ross Transport Layer 3-36

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)❍  checksum, seq. #, ACKs,

retransmissions will be of help… but not enough

Approach: sender waits “reasonable” amount of time for ACK

❒  retransmits if no ACK received in this time

❒  requires countdown timer

pkt(0)

pkt(0)ACK

Timeout

pkt(0)

pkt(0)ACK

ACKTimeout Discard but ACK!

No duplicate

X X

Page 19: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

19

Transport Layer 3-37

rdt3.1: actually no need for NAKs!❒  Up to now:

❍  Timer for packet loss❍  NAK for packet errors

❒  Simpler:❍  Timer for both packet loss

and errors!❒  NAK would improve recovery

time, but it’s not our concern here!

pkt(0)

pkt(1)

pkt(1)

ACK

NAK

ACK

rdt3.0: With ACKs & NAKs

pkt(0)

pkt(1)

pkt(1)

ACK

ACK

rdt3.1: With ACKs only

Timeout

OK

OK

Error

OK

OK

Error

Transport Layer 3-38

Flaw in rdt3.1: Delayed packets or ACKs

❒  If pkt (or ACK) just delayed (not lost):❍  retransmission will be duplicate, but use

of seq. #’s already handles this

pkt(0)

pkt(0)

ACK

pkt(0)

pkt(1)ACK

ACKDiscard (no dupl)Resend ACK

❒  However, race conditions are possible between the received ACK and the retransmitted packet!

Discard again,Still expects #1!

B ACKedbut lost!

C ACKedbut lost!

X

Timeout

A

B

C

A received

B not recovered

C not recovered

Note: such race conditions are only possible over full-duplex channels

Page 20: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

20

Transport Layer 3-39

rdt3.2: adding seq # in ACKs

❒  receiver must specify seq # of pkt being ACKed

pkt(0)

pkt(0)

ACK(1)

pkt(1)

pkt(1)ACK(0)

ACK(0)Discard (no dupl)Resend ACK

B not ACKed

X

Timeout

A

B

A received

B recoveredTimeout

© From Computer Networking, by Kurose&Ross Transport Layer 3-40

rdt3.2 sendersndpkt = make_pkt(0, data, chksum)udt_send(sndpkt)start_timer

rdt_send(data)

Wait for

ACK0

rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )

Wait for call 1 from

above

sndpkt = make_pkt(1, data, chksum)udt_send(sndpkt)start_timer

rdt_send(data)

rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && isACK(rcvpkt,0)

rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )

rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && isACK(rcvpkt,1)

stop_timerstop_timer

udt_send(sndpkt)start_timer

timeout

udt_send(sndpkt)start_timer

timeout

rdt_rcv(rcvpkt)

Wait for call 0from

above

Wait for

ACK1

Λrdt_rcv(rcvpkt)

ΛΛ

Λ

Page 21: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

21

© From Computer Networking, by Kurose&Ross Transport Layer 3-41

sender receiver

rcv pkt(1)

rcv pkt(0)

send ack(0)

send ack(1)

send ack(0)

rcv ack(0)

send pkt(0)

send pkt(1)

rcv ack(1)

send pkt(0)rcv pkt(0)

pkt(0)

pkt(0)

pkt(1)

ack(1)

ack(0)

ack(0)

(a) no loss

sender receiver

rcv pkt(1)

rcv pkt(0)

send ack(0)

send ack(1)

send ack(0)

rcv ack(0)

send pkt(0)

send pkt(1)

rcv ack(1)

send pkt(0)rcv pkt(0)

pkt(0)

pkt(0)

ack(1)

ack(0)

ack(0)

(b) packet loss

pkt(1)X

loss

pkt(1)timeoutresend pkt(1)

rdt3.2 in action: Alternating-Bit Protocol (1969)

© From Computer Networking, by Kurose&Ross Transport Layer 3-42

rdt3.2 in action

rcv pkt(1)send ack(1)

(detect duplicate)

pkt(1)

sender receiver

rcv pkt(1)

rcv pkt(0)

send ack(0)

send ack(1)

send ack(0)

rcv ack(0)

send pkt(0)

send pkt(1)

rcv ack(1)

send pkt(0)rcv pkt(0)

pkt(0)

pkt(0)

ack(1)

ack(0)

ack(0)

(c) ACK loss

ack(1)X

loss

pkt(1)timeout

resend pkt(1)

rcv pkt(1)send ack(1)

(detect duplicate)

pkt(1)

sender receiver

rcv pkt(1)

send ack(0)rcv ack(0)

send pkt(1)

send pkt(0)rcv pkt(0)

pkt(0)

ack(0)

(d) premature timeout/ delayed ACK

pkt(1)timeout

resend pkt(1)

ack(1)

send ack(1)

send pkt(0)rcv ack(1)

ack(1)send pkt(0)rcv ack(1) pkt(0)

rcv pkt(0)send ack(0)ack(0)

Page 22: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

22

Transport Layer 3-43

rdt3.2 still incorrect if pkt or ACK reordering is possible

pkt(0)

pkt(0)

ACK(0)

pkt(1)

ACK(0)

ACK(1)

Timeouton pkt(0)

Duplicate!But consideredas new pkt(0)!pkt(0)pkt(0)

ACKedbut lost!

pkt reordering

Solution: Choose timeout so large that when a pkt is retransmitted the sender is sure that the previous copy of this pkt and its ACK have disappeared from the network.Better solution: Use a much larger seq# space (see later).

X

Can be arbitrarily small

© From Computer Networking, by Kurose&Ross Transport Layer 3-44

Performance of rdt3.2: stop-and-wait operation

first packet bit transmitted, t = 0sender receiver

RTT

last packet bit transmitted, t = L / R

first packet bit arriveslast packet bit arrives, send ACK

ACK arrives, send next packet, t = RTT + L / R

Usender =LR

RTT + LR=

1

1+R ⋅ RTTL

R . RTT/2 = Bandwidth-delay product= “in-flight” bits

U sender : utilization = fraction of time sender busy sending

Page 23: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

23

© From Computer Networking, by Kurose&Ross Transport Layer 3-45

Performance of rdt3.2❒  Was OK over local low-speed networks, but…❒  Example: 1 Gbps link, 15 ms end-to-end prop. delay, 1KB packet:

T transmit = 8 103 bits109 bps = 8 µsec

U sender = .008

30.008 = 0.00027

microseconds

L / R RTT + L / R

=

L (packet length in bits)R (transmission rate, bps) =

❍  1 KByte packet every 30 msec -> 266.7 kbps throughput over 1 Gbps link

❍  network protocol limits use of physical resources!

© From Computer Networking, by Kurose&Ross Transport Layer 3-46

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer❍  Alternating bit protocol❍  Pipelined protocols

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

Page 24: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

24

© From Computer Networking, by Kurose&Ross Transport Layer 3-47

Pipelined protocolsPipelining: sender allows multiple, “in-flight”, yet-to-be-

acknowledged pkts❍  range of sequence numbers must be increased❍  buffering at sender and/or receiver

❒  Two generic forms of pipelined protocols: go-Back-N, selective repeat

© From Computer Networking, by Kurose&Ross Transport Layer 3-48

Pipelining: increased utilization

first packet bit transmitted, t = 0sender receiver

RTT

last bit transmitted, t = L / R

first packet bit arriveslast packet bit arrives, send ACK

ACK arrives, send next packet, t = RTT + L / R

last bit of 2nd packet arrives, send ACKlast bit of 3rd packet arrives, send ACK

U sender = .024

30.008 = 0.0008

microseconds

3 * L / R RTT + L / R

=

3-packet pipelining increases utilization by a factor of 3!

Page 25: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

25

© From Computer Networking, by Kurose&Ross Transport Layer 3-49

Pipelined protocols: overview

Go-back-N:❒  Sender can have up to

N unacked packets in pipeline

❒  Receiver only sends cumulative ACKs❍  doesn’t ACK pkt if there’s

a gap❒  Sender has timer for

oldest unacked pkt❍  when timer expires,

retransmit all unacked packets

Selective Repeat:❒  Sender can have up to N

unacked packets in pipeline❒  Receiver sends individual

ACKs for each packet

❒  Sender maintains timer for each unacked packet❍  when timer expires, retransmit

only that unacked packet

© From Computer Networking, by Kurose&Ross Transport Layer 3-50

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer❍  Alternating bit protocol❍  Pipelined protocols

•  GBN•  SR

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

Page 26: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

26

© From Computer Networking, by Kurose&Ross Transport Layer 3-51

Go-Back-N (GBN): sender❒  k-bit seq # in pkt header❒  “window” of up to N, consecutive unacked packets allowed

❒  ACK(n): ACKs all packets up to, including seq # n - “cumulative ACK”❍  may receive duplicate ACKs (see receiver)

❒  single timer: conceptually for the oldest in-flight packet (i.e., sent but unacked packet)

❒  Timeout(n): retransmit packet n and all higher seq # packets in window

© From Computer Networking, by Kurose&Ross Transport Layer 3-52

GBN: sender extended FSM

Wait start_timerudt_send(sndpkt[base])udt_send(sndpkt[base+1])…udt_send(sndpkt[nextseqnum-1])

timeout

rdt_send(data)

if (nextseqnum < base+N) { sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum) udt_send(sndpkt[nextseqnum]) if (base == nextseqnum) start_timer nextseqnum++ }else refuse_data(data)

base = getacknum(rcvpkt)+1If (base == nextseqnum) stop_timer else start_timer

rdt_rcv(rcvpkt) && not corrupt(rcvpkt)

base=1nextseqnum=1

rdt_rcv(rcvpkt) && corrupt(rcvpkt)

Λ

Λ

Page 27: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

27

© From Computer Networking, by Kurose&Ross Transport Layer 3-53

GBN: receiver extended FSM

❒  ACK-only: always send ACK for correctly-received packet with highest in-order seq #❍  need only remember expectedseqnum

❒  out-of-order packet: ❍  discard (don’t buffer) -> no receiver buffering!❍  re-ACK packet with highest in-order seq #

•  will generate duplicate ACKs

Wait

udt_send(sndpkt)default

rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum) data = extract(rcvpkt)deliver_data(data)sndpkt = make_pkt(expectedseqnum,ACK,chksum)udt_send(sndpkt)expectedseqnum++

expectedseqnum=1sndpkt = make_pkt(0,ACK,chksum)

Λ

© From Computer Networking, by Kurose&Ross Transport Layer 3-54

GBN in action

send pkt0send pkt1send pkt2send pkt3

(wait)

sender receiver

receive pkt0, send ack0receive pkt1, send ack1

receive pkt3, discard, (re)send ack1rcv ack0, send pkt4

rcv ack1, send pkt5

pkt 2 timeoutsend pkt2send pkt3send pkt4send pkt5

Xloss

receive pkt4, discard, (re)send ack1receive pkt5, discard, (re)send ack1

rcv pkt2, deliver, send ack2rcv pkt3, deliver, send ack3rcv pkt4, deliver, send ack4rcv pkt5, deliver, send ack5

ignore duplicate ACK

0 1 2 3 4 5 6 7 8

sender window (N=4)

0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8

0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8

0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8

Page 28: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

28

Transport Layer 3-55

Maximum window size with GBN❒  If seq# size = K (“The modulo”)❒  Q: What is the maximum window size N to preserve correctness?

❍  Obviously N cannot exceed K, why?❍  Can N be equal to K?

pkt0pkt1pkt2pkt3

Suppose K = 4, N =4

W=[0..3]

XXXX

Duplicates!

All ACKs lost

W=[0]W=[1]W=[2]W=[3]

W=[0]

All packets received in order, and delivered again to app!

resend pkt0resend pkt1resend pkt2resend pkt3

Transport Layer 3-56

GBN: maximum window size

❒  Constraint: window size (N) ≤ seq# size (K) – 1❒  It is necessary and sufficient to achieve correctness

when there is no packet reordering in the network

pkt0pkt1pkt2

Now with K = 4, N =3

W=[0..2]

XXX

resend pkt0resend pkt1resend pkt2

All ACKs lost

W=[0]W=[1]W=[2]

W=[3]

All packets are now discarded!And ACKs sent again

Page 29: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

29

© From Computer Networking, by Kurose&Ross Transport Layer 3-57

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer❍  Alternating bit protocol❍  Pipelined protocols

•  GBN•  SR

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

© From Computer Networking, by Kurose&Ross Transport Layer 3-58

Selective Repeat (SR)

❒  receiver individually acknowledges all correctly received packets❍  buffers packets, as needed, for eventual in-order delivery

to upper layer❒  sender only resends packets for which ACK not

received❍  sender timer for each unacked packet

❒  sender window❍  N consecutive seq #’s❍  again limits seq #s of sent, unacked packets

Page 30: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

30

© From Computer Networking, by Kurose&Ross Transport Layer 3-59

SR: sender, receiver windows

Relative positions of both windows

Transport Layer 3-60

Sender:

Receiver:

Initially:

Sender:

Receiver:

When the first packets have been received,but their ACKs are not received (yet):

Are the following relative positions possible? Why?

Sender:

Receiver:

Sender:

Receiver:

Sender:

Receiver:

Page 31: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

31

© From Computer Networking, by Kurose&Ross Transport Layer 3-61

Selective repeat (SR)

data from above :❒  if next available seq # in

window, send packettimeout(n):❒  resend packet n, ❒  restart timer (n)ACK(n) in [sendbase,sendbase+N-1]:

❒  mark packet n as received❒  if n is smallest unacked

packet, advance window base to next unacked seq #

senderpkt n in [rcvbase, rcvbase+N-1]

❒  send ACK(n)❒  out-of-order: buffer❒  in-order: deliver (also deliver

buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]

❒  ACK(n)otherwise: ❒  ignore

receiver

Sender can lag behind receiver, but at most N packets

© From Computer Networking, by Kurose&Ross Transport Layer 3-62

SR in action

send pkt0send pkt1send pkt2send pkt3

(wait)

sender receiver

receive pkt0, send ack0, deliver pkt0receive pkt1, send ack1 deliver pkt1receive pkt3, buffer, send ack3rcv ack0, send pkt4

rcv ack1, send pkt5

pkt 2 timeoutsend pkt2

Xloss

receive pkt4, buffer, send ack4receive pkt5, buffer, send ack5

rcv pkt2; deliver pkt2,pkt3, pkt4, pkt5; send ack2

record ack3 arrived

0 1 2 3 4 5 6 7 8

sender window (N=4)

0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8

0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8

0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8

record ack4 arrivedrecord ack5 arrived

Q: what happens when ack2 arrives?

Page 32: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

32

© From Computer Networking, by Kurose&Ross Transport Layer 3-63

SR: dilemma

assume: ❒  seq #’s: 0, 1, 2, 3❒  window size=3

receiver window(after receipt)

sender window(after ack)

0 1 2 3 0 1 2

0 1 2 3 0 1 20 1 2 3 0 1 2

pkt0pkt1pkt2

0 1 2 3 0 1 2 pkt0

timeoutretransmit pkt0

0 1 2 3 0 1 2

0 1 2 3 0 1 20 1 2 3 0 1 2X

XX

will accept packetwith seq number 0(b) oops!

0 1 2 3 0 1 2

0 1 2 3 0 1 20 1 2 3 0 1 2

pkt0pkt1pkt2

0 1 2 3 0 1 2pkt0

0 1 2 3 0 1 2

0 1 2 3 0 1 20 1 2 3 0 1 2

Xwill accept packetwith seq number 0

0 1 2 3 0 1 2 pkt3

(a) no problem

receiver can’t see sender side.receiver behavior identical in both cases!

something’s (very) wrong!

"  receiver sees no difference in two scenarios!

"  duplicate data accepted as new in (b)

Q: what relationship between seq # size and window size to avoid problem in (b)?

Transport Layer 3-64

Maximum window size with SR❒  If seq# size = K (“The modulo”)❒  Q: What is the maximum window size N?❒  A: N ≤ K/2

❍  is proven correct when there is no packet reordering

(a) Initial situation with a window of size N=3. (b) After 3 frames have been sent and received but not ACK’ed.

The upper side of the rec window falls in the sending window(c) Initial situation with a window of size N=2. (d) After 2 frames sent and received but not ACK’ed.

The upper side of the rec window does not fall in the sending window

Sender 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3

Receiver 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3

(a) (b) (c) (d)

Suppose K = 4, N =3 K = 4, but N =2

Retransmissions of received pkts fall

outside rec window

Page 33: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

33

Transport Layer 3-65

Maximum window size: general principle❒  Notations:

❍  seq# space = [0..K-1], i.e. modulo K❍  Maximum sender window size = Ns❍  Maximum receiver window size = Nr

❒  Without packet reordering in network, protocol is correct if and only if:❍  Ns + Nr ≤ K

❒  Particular cases:❍  GBN: Nr = 1, Ns ≤ K - 1❍  SR: Ns = Nr ≤ K/2❍  Alternating-bit: K = 2, Ns = Nr = 1

•  Special case of both GBN and Selective Repeat

Transport Layer 3-66

GBN, when pkt or ACK reordering possible

pkt0pkt1pkt2

resend pkt0

Duplicate!but consideredas new pkt(0)!

pkt reorderingAck0

pkt3

W=[0..2] W=[0]

W=[1]W=[2]W=[3]

W=[0]

W=[1..3]

With K = 4, N = 3

Solution: Choose timeout so large that when a pkt is retransmitted the sender is sure that the previous copy of this pkt and its ACK have disappeared from the network.Better solution: Use a huge seq# space (K >>), keep N much smaller than K-1, and rely on the underlying network to ensure packets and ACKs do not live too long

Page 34: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

34

Transport Layer 3-67

SR, when pkt or ACK reordering possible

pkt0pkt1

resend pkt0

Duplicate!but consideredas new pkt(0)!

pkt reordering

Solution: Choose timeout so large that when a pkt is retransmitted the sender is sure that the previous copy of this pkt and its ACK have disappeared from the network.Better solution: Use a huge seq# space (K >>), keep N much smaller than K/2, and rely on the underlying network to ensure packets and ACKs do not live too long

Ack0

pkt2

W=[0..1] W=[0..1]

W=[1..2]W=[2..3]

W=[3..0]

W=[1..2]

K = 4, N = 2

© From Computer Networking, by Kurose&Ross Transport Layer 3-68

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

Page 35: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

35

© From Computer Networking, by Kurose&Ross Transport Layer 3-69

TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581

❒  full duplex data:❍  bi-directional data flow in

same connection❍  MSS: maximum segment

size❒  connection-oriented:

❍  handshaking (exchange of control msgs) inits sender, receiver state before data exchange

❒  flow controlled:❍  sender will not overwhelm

receiver

❒  point-to-point:❍  one sender, one

receiver ❒  reliable, in-order byte

stream:❍  no “message

boundaries”❒  pipelined:

❍ TCP congestion and flow control set window size

© From Computer Networking, by Kurose&Ross Transport Layer 3-70

TCP segment structure

source port # dest port #

32 bits

applicationdata

(variable length)

sequence numberacknowledgement number

Receive windowUrg data pnterchecksum

FSRPAUheadlen

notused

Options (variable length)

URG: urgent data (generally not used)

ACK: ACK #valid

PSH: push data now(generally not used)

RST, SYN, FIN:connection estab(setup, teardown

commands)

# bytes rcvr willingto accept

countingby bytes of data(not segments!)

Internetchecksum

(as in UDP)

Page 36: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

36

© From Computer Networking, by Kurose&Ross Transport Layer 3-71

TCP seq. numbers, ACKssequence numbers:

❍  byte stream “number” of first byte in segment’s data

acknowledgements:❍  seq # of next byte expected

from other side❍  cumulative ACK❍  also the SACK option for

selective ACKsQ: how receiver handles out-of-order segments❍  A: TCP spec doesn’t say,

- up to implementorsource port # dest port #

sequence numberacknowledgement number

checksumrwnd

urg pointer

incoming segment to sender

A

sent ACKed

sent, not-yet ACKed(“in-flight”)

usablebut not yet sent

not usable

window size N

sender sequence number space

source port # dest port #

sequence numberacknowledgement number

checksumrwnd

urg pointer

outgoing segment from sender

© From Computer Networking, by Kurose&Ross Transport Layer 3-72

TCP seq. numbers, ACKs

Usertypes‘C’

host ACKs receipt of echoed‘C’

host ACKs receipt of ‘C’, echoes back ‘C’

simple telnet app scenario

Host BHost A

Seq=42, ACK=79, data = ‘C’

Seq=79, ACK=43, data = ‘C’

Seq=43, ACK=80

Page 37: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

37

© From Computer Networking, by Kurose&Ross Transport Layer 3-73

TCP Round Trip Time and TimeoutQ: how to set TCP timeout value?❒  longer than RTT

❍  but RTT varies❒  too short: premature timeout

❍  unnecessary retransmissions❒  too long: slow reaction to segment loss

0 1 2 3 4 5RTT (msec)

T

0

0.1

0.2

0.3

Prob.density

0

0.1

0.2

0.3

Prob.density

0 10 20 30 40 50RTT (msec)

T1 T2

Over a link

Over a network

From Computer Networks, by Tanenbaum © Prentice Hall

© From Computer Networking, by Kurose&Ross Transport Layer 3-74

TCP Round Trip Time and Timeout

EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT

❒  Exponential weighted moving averageo  Influence of past samples decreases exponentially fasto  Weights of samples (backward): α, α(1-α), α(1-α)2, α(1-α)3…

❒  Typical value: α = 0.125

Q: how to estimate RTT?❒  SampleRTT: measured time from segment transmission until ACK receipt

❍  ignore retransmissions❒  SampleRTT will vary, want estimated RTT “smoother”

❍  average several recent measurements, not just current SampleRTT

Page 38: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

38

© From Computer Networking, by Kurose&Ross Transport Layer 3-75

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr

100

150

200

250

300

350

1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106

time (seconnds)

RTT

(mill

isec

onds

)

SampleRTT Estimated RTT

RTT

(milli

seco

nds)

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr

sampleRTTEstimatedRTT

time (seconds)

Example RTT estimation:

© From Computer Networking, by Kurose&Ross Transport Layer 3-76

TCP Round Trip Time, timeout

Timeout interval: EstimatedRTT plus “safety margin”❍  large variation in EstimatedRTT -> larger safety margin

Estimate SampleRTT deviation from EstimatedRTT:

DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|

(typically, β = 0.25)

TimeoutInterval = EstimatedRTT + 4*DevRTT

estimated average RTT

“safety margin”

Page 39: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

39

© From Computer Networking, by Kurose&Ross Transport Layer 3-77

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

© From Computer Networking, by Kurose&Ross Transport Layer 3-78

TCP reliable data transfer

❒  TCP creates rdt service on top of IP’s unreliable service❍  pipelined segments❍  cumulative acks❍  single retransmission

timer❒  Retransmissions

triggered by:❍  timeout events❍  duplicate acks

❒  Let’s initially consider simplified TCP sender:❍  ignore duplicate acks❍  ignore flow control,

congestion control

Page 40: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

40

© From Computer Networking, by Kurose&Ross Transport Layer 3-79

TCP sender events:data rcvd from app:❒  create segment with

seq #❒  seq # is byte-stream

number of first data byte in segment

❒  start timer if not already running❍  think of timer as for

oldest unacked segment❍  expiration interval: TimeOutInterval

timeout:❒  retransmit (one)

segment that caused timeout

❒  restart timer Ack rcvd:❒  if ack acknowledges

previously unacked segments❍  update what is known to

be acked❍  start timer if there are

still unacked segments

© From Computer Networking, by Kurose&Ross Transport Layer 3-80

TCP sender (simplified)

waitfor

event

NextSeqNum = InitialSeqNumSendBase = InitialSeqNum

Λ

create segment, seq. #: NextSeqNumpass segment to IP (i.e., “send”)NextSeqNum = NextSeqNum + length(data) if (timer currently not running) start timer

data received from application above

retransmit not-yet-acked segment with smallest seq. #

start timer

timeout

if (y > SendBase) { SendBase = y /* SendBase–1: last cumulatively ACKed byte */ if (there are currently not-yet-acked segments) start timer else stop timer }

ACK received, with ACK field value y

Page 41: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

41

© From Computer Networking, by Kurose&Ross Transport Layer 3-81

TCP: retransmission scenarios

lost ACK scenario

Host BHost A

Seq=92, 8 bytes of data

ACK=100

Seq=92, 8 bytes of data

Xtimeo

ut

ACK=100

premature timeout

Host BHost A

Seq=92, 8 bytes of data

ACK=100

Seq=92, 8bytes of data

timeo

utACK=120

Seq=100, 20 bytes of data

ACK=120

SendBase=100

SendBase=120

SendBase=120

SendBase=92

© From Computer Networking, by Kurose&Ross Transport Layer 3-82

TCP: retransmission scenarios

X

cumulative ACK

Host BHost A

Seq=92, 8 bytes of data

ACK=100

Seq=120, 15 bytes of data

timeo

ut

Seq=100, 20 bytes of data

ACK=120

Page 42: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

42

© From Computer Networking, by Kurose&Ross Transport Layer 3-83

TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver

Arrival of in-order segment withexpected seq #. All data up toexpected seq # already ACKed

Arrival of in-order segment withexpected seq #. One other segment has ACK pending

Arrival of out-of-order segmenthigher-than-expect seq. # .Gap detected

Arrival of segment that partially or completely fills gap

TCP Receiver action

Delayed ACK. Wait up to 200msfor next segment. If no next segment,send ACK

Immediately send single cumulative ACK, ACKing both in-order segments

Immediately send duplicate ACK, indicating seq. # of next expected byte(if SACK option, also send selective ACK)

Immediately send ACK, provided thatsegment starts at lower end of gap

© From Computer Networking, by Kurose&Ross Transport Layer 3-84

TCP fast retransmit❒  time-out period often

relatively long:❍  long delay before

resending lost packet❒  detect lost segments

via duplicate ACKs❍  sender often sends

many segments back-to-back

❍  if segment is lost, there will likely be many duplicate ACKs

if sender receives 4 ACKs for same data(“triple duplicate ACKs”), resend unacked segment with smallest seq #!  likely that unacked

segment lost, so don’t wait for timeout

TCP fast retransmit

Page 43: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

43

© From Computer Networking, by Kurose&Ross Transport Layer 3-85

X

fast retransmit after sender receipt of triple duplicate ACK

Host BHost A

Seq=92, 8 bytes of data

ACK=100tim

eout

ACK=100

ACK=100ACK=100

TCP fast retransmit

Seq=100, 20 bytes of data

Seq=100, 20 bytes of data

Transport Layer 3-86

Error recovery: TCP vs GBN vs SR

❒  Like GBN❍  TCP has cumulative ACKs❍  TCP has a single retransmission timer

❒  Like SR❍  TCP has receiver buffer❍  TCP only retransmits oldest unacked packet on time-out❍  (TCP has a SACK option, similar to Selective Repeat)

❒  In addition❍  TCP has a Fast retransmit mechanism❍  TCP has delayed ACKs

Page 44: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

44

© From Computer Networking, by Kurose&Ross Transport Layer 3-87

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

© From Computer Networking, by Kurose&Ross Transport Layer 3-88

TCP flow controlapplication

process

TCP socketreceiver buffers

TCPcode

IPcode

application

OS

receiver protocol stack

application may remove data from

TCP socket buffers ….

… slower than TCP receiver is delivering

(sender is sending)

from sender

receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast

flow control

Page 45: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

45

© From Computer Networking, by Kurose&Ross Transport Layer 3-89

TCP flow control

buffered data

free buffer space rwnd

RcvBuffer

TCP segment payloads

to application process ❒  receiver “advertises” free

buffer space by including rwnd value in TCP header of receiver-to-sender segments❍  RcvBuffer size set via socket

options (typical default is 4096 bytes)

❍  many operating systems autoadjust RcvBuffer

❒  sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value

❒  guarantees receive buffer will not overflow

receiver-side buffering

Transport Layer 3-90

Flow control: improvementsWays to improve performance:

1. Use larger windows:❍  TCP throughput cannot exceed rcwd / RTT❍  16 bits in TCP header to encode rcwd❍  Max rcwd = 216 bytes, thus max TCP throughput = 64 Kbytes per RTT❍  A TCP option (negotiated during TCP connection establishment) allows to

scale the window size by a factor 2k, with k ≤ 14, thus leading to a max rcwd = 230 bytes (while there are 232 byte numbers)

2. Nagle: Reducing overhead by grouping bytes when sending app writes one byte at a time in socket❍  See next slide

3. Silly window syndrome: reducing overhead when receiving app reads one byte at a time from socket❍  See next slides

Page 46: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

46

Transport Layer 3-91

Nagle algorithm❒  When TCP receives data from socket

one byte at the time, or more generally in units much smaller than the TCP Maximum Segment Size (MSS)❍  send small packet immediately, and buffer

all the rest until the outstanding bytes are acknowledged

❍  send other segments only when all previous bytes are acked (or MSS bytes have been buffered)

❒  Useful for Telnet for exampleOtherwise: ❍  41-byte segments containing 1 byte of

data❍  40 bytes of TCP/IP header overhead

❒  Naggle can be disabled if needed❍  See Socket options: TCP_NoDelay

Host BHost A

small packet with few bytes received

packet with all buffered bytes

Naggle does not allowto send other bytes(unless MSS is reached)

a few bytes

Possible interference between Nagle and delayed ACKs

❒  Issue: additional delay at end of connection Transport Layer 3-92

Host BHost A

MSS packet, odd #

Assume app in B does not write on socket (which would trigger a pkt transmission with a piggybacked ACK)

Send ACK only after 200ms(delayed ACK)

Assume TCP buffer containsslightly more than 1 MSS

(e.g. at end of connection)

Small packet, even #

Naggle does not allow TCPto send the final bytes

(previous bytes not acked)

“Odd numbered” segment,so delayed ack active

Page 47: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

47

Relaxed Naggle❒  To avoid interference with delayed ACK at receiver,

relaxed Naggle will allow to send a packet smaller than MSS if the previous packet was full-size (MSS)

Transport Layer 3-93

Host BHost A

MSS packet, odd #

Delayed ACK enabled,but no consequence asnext segment is receivedimmediately

Assume TCP buffer containsslightly more than 1 MSS

(e.g. at end of connection)

Small packet, even #

Relaxed Naggle allows TCPto send the final bytes

(previous segment was full-size)

© From Computer Networking, by Kurose&Ross Transport Layer 3-94

Silly window syndrome

Receiver’s buffer is full

Application reads 1 byte

Window update segment sent

New byte arrives

Receiver’s buffer is full

Header

Header

Room for one more byte

1 Byte

Still a problemwith applicationsreading one byteat a time

Clark's solution: receiver sends a windowupdate only if the buffer is half empty, or if a full segment can be received

From Computer Networks, by Tanenbaum © Prentice Hall

Page 48: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

48

© From Computer Networking, by Kurose&Ross Transport Layer 3-95

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

© From Computer Networking, by Kurose&Ross Transport Layer 3-96

Connection Managementbefore exchanging data, sender/receiver “handshake”:❒  agree to establish connection (each knowing the other willing to

establish connection)❒  agree on connection parameters (MSS, options)

connection state: ESTAB connection variables:

seq # client-to-server server-to-client rcvBuffer size at server,client

application

network

connection state: ESTAB connection Variables:

seq # client-to-server server-to-client rcvBuffer size at server,client

application

network

Socket clientSocket = newSocket("hostname","port

number");

Socket connectionSocket = welcomeSocket.accept();

Page 49: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

49

© From Computer Networking, by Kurose&Ross Transport Layer 3-97

Q: will 2-way handshake always work in network?

❒  variable delays❒  retransmitted messages (e.g.

req_conn(x)) due to message loss❒  message reordering❒  can’t “see” other side

2-way handshake:

Let’s talk

OKESTAB

ESTAB

choose x req_conn(x)ESTAB

ESTABacc_conn(x)

Agreeing to establish a connection

© From Computer Networking, by Kurose&Ross Transport Layer 3-98

Agreeing to establish a connection2-way handshake failure scenarios:

retransmitreq_conn(x)

ESTAB

req_conn(x)

half open connection!(no client!)

client terminates

serverforgets x

connection x completes

retransmitreq_conn(x)

ESTAB

req_conn(x)

data(x+1)

retransmitdata(x+1)

acceptdata(x+1)

choose xreq_conn(x)

ESTAB

ESTAB

acc_conn(x)

client terminates

ESTAB

choose xreq_conn(x)

ESTABacc_conn(x)

data(x+1) acceptdata(x+1)

connection x completes server

forgets x

Page 50: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

50

© From Computer Networking, by Kurose&Ross Transport Layer 3-99

TCP 3-way handshake

ESTAB

SYNbit=1, Seq=yACKbit=1; ACKnum=x+1

choose init seq num, ysend TCP SYNACKmsg, acking SYN

ACKbit=1, ACKnum=y+1

received SYNACK(x+1) indicates server is live;

send ACK for SYNACK;this segment may contain

client-to-server data with Seq=x+1 received ACK(y+1)

indicates client is live;allocate buffer

SYNSENT

ESTAB

SYN RCVD

client state

CLOSED

server state

LISTEN

If server port not open, TCP server sends back a RST segment

SYNbit=1, Seq=x

choose init seq num, xsend TCP SYN msg

without data;start timer for

SYN retransmission

SYN has a seq number: kind of fictitious first byte

Transport Layer 3-100

TCP 3-way handshake is robust

RetransmitSYN(x)

SYN(x)

retransmitdata(x+1)

Reject ACK: y ≠ y’

client terminates

ESTAB

choose x SYN(x)

ESTAB

SYNACK(y, x+1)

ACK(y+1) and data(x+1)

acceptdata(x+1)

connection completes server

forgets x

SYNACK(y’, x+1)

ACK(y+1) and data(x+1)

choose y

choose y’, must not have been used in recent past!(Here “recent” means a maximum packet life time)

Notations:

SYN(x) means SYNbit = 1 and Seq = x

SYNACK(y,x) means SYNbit = 1 and Seq = y and ACKbit = 1 and ACKnum = x

ACK(y) means ACKbit = 1 and ACKnum = y

Page 51: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

51

© From Computer Networking, by Kurose&Ross Transport Layer 3-101

TCP: choosing initial 32-bit seq#Choosing x and y:❒  Constraint: x and y not used

recently in a former connection between this client (i.e. same IP and port#) and this server (same IP and port#)

❒  Recently = less than TCP maximum segment lifetime (MSL)

❒  Otherwise a still alive duplicate segment (of SYN, SYNACK, ACK) from a former connection could be confused with this connection

❒  Too difficult to keep track of seq# used in each recent connection

❒  In practice, x and y are picked at random by TCP

client

SYN (seq=x)

server

SYNACK (seq=y, ack=x+1)

ACK (seq=x+1, ack=y+1)

connect listen+ accept

connected

connected

© From Computer Networking, by Kurose&Ross Transport Layer 3-102

TCP 3-way handshake: FSM

closed

L

listen

SYNrcvd

SYNsent

ESTAB

Socket clientSocket = newSocket("hostname","port

number");

SYN(seq=x)

Socket connectionSocket = welcomeSocket.accept();

SYN(x)SYNACK(seq=y,ACKnum=x+1)

create new socket for communication back to client

SYNACK(seq=y,ACKnum=x+1)ACK(ACKnum=y+1)ACK(ACKnum=y+1)

Λ

client sideserver side

Page 52: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

52

© From Computer Networking, by Kurose&Ross Transport Layer 3-103

TCP: closing a connection❒  client, server each close their “sending” side of

connection❍  send TCP segment with FIN bit = 1❍  symmetric/graceful release

❒  respond to received FIN with ACK❍  on receiving FIN, ACK can be combined with own FIN

❒  simultaneous FIN exchanges can be handled

© From Computer Networking, by Kurose&Ross Transport Layer 3-104

FIN_WAIT_2

CLOSE_WAIT

FINbit=1, seq=y

ACKbit=1; ACKnum=y+1

ACKbit=1; ACKnum=x+1 wait for server

close

can stillsend data

can no longersend data

LAST_ACK

CLOSED

TIMED_WAIT

timed wait for 2 * max segment lifetime (MSL);

will respond with ACKto received FINs

CLOSED

TCP: closing a connection

FIN_WAIT_1 FINbit=1, seq=xcan no longersend but can receive data

clientSocket.close()

client state server state

ESTABESTAB

FINs have seq# to recover loss of last data bytesACKs are sent when all data received. So there are also retransmissions when FINs are not acked

Last data from client (possibly empty)

Page 53: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

53

Transport Layer 3-105

TCP: closing a connection (flaws)❒  Q: What if last FIN and its retransmissions are all lost?

❒  A: Need to disconnect after k FIN transmission attempts❒  Q: But what about the other side then?

❒  A: Need to add a heart beat mechanism to detect other side has closed “without” notification

❒  Q: But what if the heart beat messages are lost?❒  A: Argh, the other side will close, while it shouldn’t!

❒  Last word: no perfect solution (similar to the Two-army problem)

Transport Layer 3-106

Closing a TCP connection: FSM

ESTABFIN

TIMEWAIT

LASTACK

CLOSED

ACKconnectionSocket.close()

ACKclose socket

Wait 2 * MSL

passive closeactive close

CLOSEWAIT

FIN

FINconnectionSocket.close()

ACK

FIN+ACKFINWAIT

2CLOSING

ACKFIN

ACKFIN

Λ

ACK

Λ

ACK

close socket

simultaneousclose

FINWAIT

1

Side that actively closes socket waits a double MSL (safety margin). Therefore same TCP 4-tuple cannot be reused sooner (e.g. with same client port#). It ensures that no duplicate segment from an earlier connection with the same 4-tuple can jump in the current connection. But this may lead to some starvation of resources.

All pending data are still sent reliably

After a socket “close”, TCP continues to send

the previous bytes reliably

Page 54: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

54

© From Computer Networking, by Kurose&Ross Transport Layer 3-107

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

Transport Layer 3-108

Principles of Congestion Control

❒  informally: “too many sources sending too much data too fast for network to handle”; different from flow control!

❒  manifestations:❍  lost packets (buffer overflow at routers)❍  long delays (queuing in router buffers)

❒  a top-10 problem!

Transmissionrate adjustment

Transmissionnetwork

Small-capacityreceiver

Large-capacityreceiver

InternalcongestionNeed

flowcontrol

Needcongestion

control

From Computer Networks, by Tanenbaum © Prentice Hall

Page 55: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

55

© From Computer Networking, by Kurose&Ross Transport Layer 3-109

Causes/costs of congestion: scenario 1

❒  two senders, two receivers

❒  one router, infinite buffers

❒  output link capacity: R❒  no retransmission

maximum per-connection throughput: R/2

unlimited shared output link buffers

Host A

original data: λin

Host B

throughput: λout

R/2

R/2

λ out

λin R/2 de

lay

λin large delays as arrival rate, λin,

approaches capacity

R

© From Computer Networking, by Kurose&Ross Transport Layer 3-110

❒  one router, finite buffers ❒  sender retransmission of timed-out packet

❍  application-layer input = application-layer output: λin = λout •  “goodput”

❍  transport-layer input includes retransmissions : λ’in ≥ λin

finite shared output link buffers

Host A

λin : original data

Host B

λout λ'in: original data, plus retransmitted data

Causes/costs of congestion: scenario 2

Page 56: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

56

© From Computer Networking, by Kurose&Ross Transport Layer 3-111

Idealization 1: perfect knowledge= sender sends only when router buffers available

finite shared output link buffers

λin : original data λout λ'in: original data, plus

retransmitted data copy

free buffer space!

R/2

R/2

λ out

λin

Causes/costs of congestion: scenario 2

Host B

Host A R

© From Computer Networking, by Kurose&Ross Transport Layer 3-112

λin : original data λout λ'in: original data, plus

retransmitted data copy

no buffer space!

Idealization 2: known loss - packets can be lost, dropped at router due to full buffers- sender only resends if packet known to be lost

Causes/costs of congestion: scenario 2

Host A

Host B

Page 57: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

57

© From Computer Networking, by Kurose&Ross Transport Layer 3-113

λin : original data λout λ'in: original data, plus

retransmitted data

free buffer space!

Causes/costs of congestion: scenario 2 Idealization 2:

known lossR/2

R/2 λin

λ out

when sending at R/2, some packets are retransmissions but asymptotic goodput is still R/2 (why?)

Host A

Host B

R

© From Computer Networking, by Kurose&Ross Transport Layer 3-114

Host A

λin λout λ'in copy

free buffer space!

timeout

R/2

R/2 λin

λ out

When sending at R/2, some packets are retransmissions including duplicates that are delivered! Asymptotic goodput now < R/2

Host B

Realistic: duplicates #  packets can be lost, dropped at

router due to full buffers#  sender times out prematurely,

sending two copies, both of which are delivered

Causes/costs of congestion: scenario 2

Page 58: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

58

© From Computer Networking, by Kurose&Ross Transport Layer 3-115

R/2

λ out

when sending at R/2, some packets are retransmissions including duplicates that are delivered!

“costs” of congestion: #  more work (retransmissions) for given “goodput”#  unneeded retransmissions: link carries multiple copies of

packet! decreasing goodput

R/2 λin

Causes/costs of congestion: scenario 2 Realistic: duplicates #  packets can be lost, dropped

at router due to full buffers#  sender times out

prematurely, sending two copies, both of which are delivered

© From Computer Networking, by Kurose&Ross Transport Layer 3-116

❒  four senders❒  multihop paths❒  timeout/retransmit

Q: what happens as λin and λ’in increase ?

finite shared output link buffers

Host A λout

Causes/costs of congestion: scenario 3

Host B

Host C Host D

λin : original data λ'in: original data, plus

retransmitted data

A: as red λ’in increases, all arriving blue packets at upper queue are dropped, blue throughput ! 0

Page 59: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

59

© From Computer Networking, by Kurose&Ross Transport Layer 3-117

another “cost” of congestion: when packet dropped, any “upstream” transmission capacity used for that packet was wasted!

Causes/costs of congestion: scenario 3

R/2

R/2

λ out

λ’in

© From Computer Networking, by Kurose&Ross Transport Layer 3-118

Approaches towards congestion control

End-end congestion control:

❒  no explicit feedback from network

❒  congestion inferred from end-system observed loss, delay

❒  approach taken by TCP

Network-assisted congestion control:

❒  routers provide feedback to end-systems❍  single bit indicating

congestion (SNA, DECbit, TCP/IP ECN, ATM)

❍  explicit rate for sender to send at (ATM ABR)

❒  not studied in this course

Two broad approaches towards congestion control:

Page 60: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

60

© From Computer Networking, by Kurose&Ross Transport Layer 3-119

Chapter 3 outline

❒  3.1 Transport-layer services

❒  3.2 Multiplexing and demultiplexing

❒  3.3 Connectionless transport: UDP

❒  3.4 Principles of reliable data transfer

❒  3.5 Connection-oriented transport: TCP❍  segment structure❍  reliable data transfer❍  flow control❍  connection management

❒  3.6 Principles of congestion control

❒  3.7 TCP congestion control

© From Computer Networking, by Kurose&Ross Transport Layer 3-120

TCP congestion control:❒  goal: TCP sender should transmit as fast as possible, but

without congesting network❍  Q: how to find rate just below congestion level

❒  decentralized: each TCP sender sets its own rate, based on implicit feedback: ❍ ACK: segment received (a good thing!), network not

congested, so increase sending rate❍  lost segment: assume loss due to congested network,

so decrease sending rate❍  note: loss may also be due to bit errors (e.g. on wireless

links)

Page 61: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

61

© From Computer Networking, by Kurose&Ross Transport Layer 3-121

TCP congestion control: additive increase multiplicative decrease

#  approach: sender increases transmission rate (congestion window size, cwnd), probing for usable bandwidth, until loss occurs!  additive increase: increase cwnd by 1 MSS (Maximum Segment

Size, not counting header) every RTT until loss detected!  multiplicative decrease: cut cwnd in half after loss

cwnd

: TC

P se

nder

co

nges

tion

win

dow

siz

e

AIMD saw toothbehavior: probing

for bandwidth

additively increase window size ……. until loss occurs (then cut window in half)

time

© From Computer Networking, by Kurose&Ross Transport Layer 3-122

TCP Congestion Control: details

❒  sender limits transmission:

❒  cwnd is dynamic, function of perceived network congestion

❒  sender limited to min (cwnd, rwnd)

TCP sending rate:❒  roughly: send cwnd

bytes, wait RTT for ACKs, then send more bytes

last byte ACKed sent, not-

yet ACKed (“in-flight”)

last byte sent

cwnd

LastByteSent - LastByteAcked

< cwnd

sender sequence number space

rate ~ ~ cwnd

RTT bytes/sec

cwndbytes

RTT

ACK(s)

Page 62: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

62

© From Computer Networking, by Kurose&Ross Transport Layer 3-123

TCP Slow Start ❒  when connection begins,

increase rate exponentially until first loss event:❍  initially cwnd = 1 MSS❍  double cwnd every RTT❍  done by incrementing cwnd

for every ACK received❒  summary: initial rate is slow

but ramps up exponentially fast

Host A

one segment

RTT

Host B

time

two segments

four segments

1

2

3 4

cwnd

© From Computer Networking, by Kurose&Ross Transport Layer 3-124

TCP: detecting, reacting to loss❒  TCP RENO: ❒  If loss indicated by 3 duplicate ACKs (mild congestion):

❍  dup ACKs indicate network capable of delivering some segments ❍  Note: can only work if cwnd ≥ 4 MSS❍  cwnd is cut in half, window then grows linearly

❒  Otherwise, loss indicated by timeout (severe congestion):❍  cwnd set to 1 MSS; ❍  window then grows exponentially (as in slow start) to threshold, then

grows linearly

❒  TCP Tahoe (earlier TCP version) sets cwnd to 1 in both cases (timeout or 3 duplicate acks)

Page 63: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

63

© From Computer Networking, by Kurose&Ross Transport Layer 3-125

Q: when should the exponential increase switch to linear?

A: when cwnd gets to 1/2 of its value before timeout

Implementation:❒  variable ssthresh ❒  on loss event, ssthresh is set to 1/2 of cwnd just before loss event❒  linear growth:

❍  cwnd = cwnd + (MSS/cwnd)MSS, for each ACK received

TCP: switching from slow start to CA

© From Computer Networking, by Kurose&Ross Transport Layer 3-126

Summary: TCP Congestion Control

timeout ssthresh = cwnd/2

cwnd = 1 MSS dupACKcount = 0

retransmit missing segment

Λcwnd > ssthresh

congestion avoidance

cwnd = cwnd + MSS (MSS/cwnd) dupACKcount = 0

transmit new segment(s), as allowed

new ACK .

dupACKcount++ duplicate ACK

fast recovery

cwnd = cwnd + MSS transmit new segment(s), as allowed

duplicate ACK

ssthresh= cwnd/2 cwnd = ssthresh + 3 MSS

retransmit missing segment

dupACKcount == 3

timeout ssthresh = cwnd/2 cwnd = 1 MSS dupACKcount = 0 retransmit missing segment

ssthresh= cwnd/2 cwnd = ssthresh + 3 MSS retransmit missing segment

dupACKcount == 3 cwnd = ssthresh dupACKcount = 0

New ACK

slow start

timeout ssthresh = cwnd/2

cwnd = 1 MSS dupACKcount = 0

retransmit missing segment

cwnd = cwnd+MSS dupACKcount = 0 transmit new segment(s), as allowed

new ACK dupACKcount++ duplicate ACK

Λcwnd = 1 MSS

ssthresh = 64 KB dupACKcount = 0

New ACK!

New ACK!

New ACK!

Page 64: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

64

Transport Layer 3-127

TCP “goodput” in steady state❒  Goodput = not counting overhead and retransmitted bytes❒  What’s the average goodput of TCP as a function of window size and RTT?

❍  Ignoring slow start❒  Let W be the window size (measured in MSS) when losses occur❒  When window is W, goodput is W/RTT❒  Just after loss, window drops to W/2, goodput to 0.5 W/RTT ❒  Average goodput: 0.75 W/RTT

Win

dow

siz

e in

MSS W

W/2

W/2 . RTT

3W/4 . W/2MSS percycle

Transport Layer 3-128

TCP goodput in steady state (2)

❒  Average window size (in MSS) = 3W/4❒  Number of MSS per cycle = 3W/4 . W/2 = 3W2/8 = 1/p

❍  Where p is the packet loss ratio•  One packet loss per cycle! (if p small enough)

❍  So

❒  Average goodput (in MSS/sec) = aver. window / RTT = 3W / 4RTT

❒  Average TCP goodput (in bps)

W = 83p

= 32

1RTT p

= 32

MSSRTT p

Page 65: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

65

© From Computer Networking, by Kurose&Ross Transport Layer 3-129

TCP Futures: TCP over “long, fat pipes”

❒  Example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput

❒  Requires average window size W = 83333 in-flight segments!

❒  Throughput in terms of loss rate:

❒  Needs packet loss rate p = 2 . 10-10 Very small !❒  New versions of TCP for high-speed needed!

1.22 ⋅ MSSRTT p

© From Computer Networking, by Kurose&Ross Transport Layer 3-130

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

TCP Fairness

TCP connection 1

bottleneckrouter

capacity RTCP connection 2

Page 66: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

66

© From Computer Networking, by Kurose&Ross Transport Layer 3-131

Why is TCP fair?Two competing sessions having the same RTT:❒  Additive increase gives slope of 1, as throughout increases❒  Multiplicative decrease decreases throughput proportionally

R

R

equal bandwidth share

Connection 1 throughput

Con

nect

ion

2 th

roug

hput

1. congestion avoidance: additive increase2. loss: decrease window by factor of 2

3. congestion avoidance: additive increase4. loss: decrease window by factor of 2

Transport Layer 3-132

And when RTTs are different?❒  If RTT of connection 2 = 2 x RTT of connection 1❒  Connection 1 ramps up twice more quickly

R

R

bandwidth share inversely

proportional to RTT

Connection 1 throughput

Con

nect

ion

2 th

roug

hput

Page 67: Introduction to Computer Networking Guy Leduc Chapter 3 ... · Chapter 3 Transport Layer ... protocol available to apps ... when creating datagram to send into UDP socket, app must

67

© From Computer Networking, by Kurose&Ross Transport Layer 3-133

Fairness (more)Fairness and UDP❒  Some multimedia apps

do not use TCP❍  do not want rate throttled

by congestion control❒  Instead use UDP:

❍  pump audio/video at constant rate, tolerate packet loss

❒  UDP is not “TCP friendly”

Fairness and parallel TCP connections

❒  Application can open multiple parallel connections between 2 hosts❍  e.g. Web browsers may do this

❒  Example: link of rate R supporting 9 connections; ❍  if new app opens 1 TCP,

it gets rate R/10❍  if new app opens 10 // TCPs,

it gets more than R/2 !

So ensuring fairness among TCP flows is not enough

Transport Layer 3-134

Chapter 3: Summary❒  principles behind transport layer

services:❍ multiplexing, demultiplexing❍  reliable data transfer❍  flow control❍  congestion control❍  connection management

❒  instantiation and implementation in the Internet❍ UDP❍ TCP

To probe further:❒  other congestion control

algorithms (e.g. CUBIC, BBR)

❒  MPTCP (Multipath TCP)❒  Google’s application

layer transport (QUIC) over UDP

Next:❒  leaving the network

“edge” (application, transport layers)

❒  into the network “core”