CSC 600 Internetworking with TCP/IP Unit 3: Transport Layer (Ch. 13, 12) Dr. Cheer-Sun Yang Spring 2001


Page 1: CSC 600 Internetworking  with  TCP/IP

CSC 600 Internetworking with TCP/IP

Unit 3: Transport Layer (Ch. 13, 12)

Dr. Cheer-Sun Yang
Spring 2001

Page 2: CSC 600 Internetworking  with  TCP/IP

Introduction

• Transmission Control Protocol (TCP) provides connection-oriented reliable transport services.

• User Datagram Protocol (UDP) provides connectionless unreliable transport services.

Page 3: CSC 600 Internetworking  with  TCP/IP

TCP & UDP

• Transmission Control Protocol (TCP)
  – Connection oriented
  – RFC 793

• User Datagram Protocol (UDP)
  – Connectionless
  – RFC 768

Page 5: CSC 600 Internetworking  with  TCP/IP

Reliable vs. Unreliable

• Reliable transport service handles error recovery at the transport level.

• Unreliable transport service does not provide error recovery at the transport level.

Page 6: CSC 600 Internetworking  with  TCP/IP

Connection-oriented vs. Connection-less

• Connection-oriented service must establish a connection between the source and the destination before any data is sent.

• Connection-less service does not establish a connection first. Each datagram is sent on its own and is simply stored and forwarded.

Page 7: CSC 600 Internetworking  with  TCP/IP

Properties of the Reliable Delivery Service

• Stream orientation – ordered delivery
• Virtual circuit connection – connection establishment is required prior to segment delivery
• Buffered transfer – data buffering is needed
• Unstructured stream – TCP segments may not be as big as a record in a payroll application.
• Full duplex connection – connections provided by the TCP/IP stream service allow concurrent transfer in both directions.

Page 8: CSC 600 Internetworking  with  TCP/IP

Properties of the Reliable Delivery Service

• TCP provides reliable transport service using a sliding window protocol, like the one introduced for data link layer protocols.

Page 9: CSC 600 Internetworking  with  TCP/IP

Transmission Control Protocol

TCP is a communication protocol, not a piece of software.

Page 10: CSC 600 Internetworking  with  TCP/IP

TCP vs. the Implementation

• TCP is the communication protocol.
• TCP is implemented by many vendors in software as part of the Operating System.
• The difference between a protocol and the software that implements it is analogous to the difference between the definition of a programming language and a compiler.

Page 11: CSC 600 Internetworking  with  TCP/IP

What does TCP Specify?

• Data segment format
• Timing
• Meanings of header fields
• Functions of TCP – also referred to as services provided by TCP

Page 12: CSC 600 Internetworking  with  TCP/IP

What does TCP not specify?

• The user interface is not specified.
• The underlying communication system can be a dialup telephone line, a local area network, a high speed fiber-optic network, or a lower speed long haul network.

Page 13: CSC 600 Internetworking  with  TCP/IP

TCP Services

• Reliable communication between pairs of processes
• Across variety of reliable and unreliable networks and internets
• Two labeling facilities
  – Data stream push
    • TCP user can require transmission of all data up to push flag
    • Receiver will deliver in same manner
    • Avoids waiting for full buffers
  – Urgent data signal
    • Indicates urgent data is upcoming in stream
    • User decides how to handle it

Page 15: CSC 600 Internetworking  with  TCP/IP

TCP Header

Page 17: CSC 600 Internetworking  with  TCP/IP

Items Passed to IP

• TCP passes some parameters down to IP
  – Precedence
  – Normal delay/low delay
  – Normal throughput/high throughput
  – Normal reliability/high reliability
  – Security

Page 18: CSC 600 Internetworking  with  TCP/IP

TCP Header Field

• Port Number
  – source and destination port numbers (why source port number?)
  – why not IP addresses?
  – Identifies an application
  – Together with IP address to form an end point

Page 19: CSC 600 Internetworking  with  TCP/IP

TCP Header Field

• Sequence Number
  – 32 bits long
  – the range of sequence numbers is $0 \le seq \le 2^{32} - 1$
  – Each sequence number identifies the position, in the stream of data from the sending TCP to the receiving TCP, of the first byte of data in the segment
  – Initial Sequence Number (ISN) of a connection is set during connection management

Example: a 600-byte stream sent as three 200-byte segments

bytes 1 to 200: segment 1 (seq = 1)
bytes 201 to 400: segment 2 (seq = 201)
bytes 401 to 600: segment 3 (seq = 401)
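The numbering rule in the example can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are made up for this sketch, and the stream is assumed to start at sequence number 1 as on the slide.

```python
def segment_sequence_numbers(first_seq, total_bytes, segment_size):
    """Return (seq, length) for each segment of a byte stream.

    The sequence number of a segment is the number of its first data byte.
    Numbers wrap modulo 2**32 because the field is 32 bits long.
    """
    segments = []
    offset = 0
    while offset < total_bytes:
        length = min(segment_size, total_bytes - offset)
        seq = (first_seq + offset) % 2**32
        segments.append((seq, length))
        offset += length
    return segments


print(segment_sequence_numbers(1, 600, 200))
# [(1, 200), (201, 200), (401, 200)]
```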

Page 20: CSC 600 Internetworking  with  TCP/IP

TCP Header Field

• Acknowledgement Number
  – Acknowledgements are piggybacked if there is a segment ready to be sent from the receiver to the sender
  – The acknowledgement number field contains the next sequence number expected

Page 21: CSC 600 Internetworking  with  TCP/IP

TCP Header Field

• Header Length
  – Why is this needed?

Page 22: CSC 600 Internetworking  with  TCP/IP

TCP Header Field

Page 23: CSC 600 Internetworking  with  TCP/IP

TCP Header Field

• Flags
  – URG: if URG = 1, the following bytes contain an urgent message: seq <= urgent message <= seq + urgent pointer
  – ACK: acknowledgement number is valid
  – PSH:
    • notification from sender to receiver to force the TCP on the receiver side to pass all data received to the application layer
    • Normally sent by the sender when the sender’s buffer is empty so the sender does not wait for more data
  – RST: Reset the connection
  – SYN: synchronization request for the sequence number
  – FIN: Finish flag

Page 24: CSC 600 Internetworking  with  TCP/IP

TCP Header Field

• Options:
  – End of options: 1 byte
  – NOP: 1 byte
  – Maximum segment size: 4 bytes
  – Window scale factor: 3 bytes
    • increases the TCP window size from 16 bits to 32 bits: the 16-bit window field is shifted left by the shift count
    • 1-byte shift count is between 0 and 14
    • used in the connection establishment for window size negotiation
  – Timestamp: 10 bytes
    • sender places a timestamp in a segment
    • receiver places an echo reply
    • this allows the sender to calculate the Round-Trip Time per window
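As a rough illustration of the window scale factor, the sketch below shifts the 16-bit window field left by the shift count. The function name is hypothetical; the limits (16-bit field, shift count 0 to 14) are the ones stated above.

```python
def effective_window(advertised, shift_count):
    """Scale a 16-bit advertised window by the negotiated shift count."""
    if not 0 <= advertised <= 0xFFFF:
        raise ValueError("window field is 16 bits")
    if not 0 <= shift_count <= 14:
        raise ValueError("shift count must be between 0 and 14")
    return advertised << shift_count


print(effective_window(65535, 0))   # 65535
print(effective_window(65535, 14))  # 1073725440
```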

Page 25: CSC 600 Internetworking  with  TCP/IP

TCP Header Field (Options)

Option               kind  length  contents
End of options       0     (none)  kind byte only
NOP                  1     (none)  kind byte only
MSS                  2     4       maximum segment size
Window scale factor  3     3       S (S: shift count)
Timestamp            8     10      timestamp, timestamp echo reply

Page 26: CSC 600 Internetworking  with  TCP/IP

Transport Layer Issues

• Addressing
• Connection establishment
• Connection termination
• Flow Control
• Timeout and retransmission
• Congestion Control
• Multiplexing
• Duplication detection
• Crash recovery

Page 27: CSC 600 Internetworking  with  TCP/IP

TCP Mechanisms

• Connection establishment
• Data transfer
• Send policy
• Deliver policy
• Accept policy: in-order, in-window
• Retransmission policy: first-only, batch, individual
• Acknowledgement Policy

Page 28: CSC 600 Internetworking  with  TCP/IP

Addressing

• Target user specified by:
  – User identification
    • Usually host, port
      – Called a socket in TCP
    • Port represents a particular transport service (TS) user
  – Transport entity identification
    • Generally only one per host
    • If more than one, then usually one of each type
      – Specify transport protocol (TCP, UDP)
  – Host address
    • An attached network device
    • In an internet, a global internet address
  – Network number

Page 29: CSC 600 Internetworking  with  TCP/IP

Finding Addresses

• Four methods
  – Know address ahead of time
    • e.g. collection of network device stats
  – Well known addresses
  – Name server
  – Sending process request to well known address

Page 30: CSC 600 Internetworking  with  TCP/IP

Ports, Connections, and Endpoints

• TCP uses the connection, not the protocol port, as its fundamental abstraction; connections are identified by a pair of endpoints, e.g., (18.26.0.36, 1069) and (128.10.2.3, 25).
• An endpoint is a pair of integers (host, port), where host is the IP address of a host and port is a TCP port on that host.
• Because TCP identifies a connection by a pair of endpoints, a given TCP port number can be shared by multiple connections on the same machine.
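A minimal sketch of why one port can be shared: if connections are looked up by the pair of endpoints rather than by the local port alone, two clients talking to port 25 on the same server never collide. The table and function names are made up for this illustration, and the second client address is hypothetical.

```python
# Connection table keyed by (local endpoint, remote endpoint).
connections = {}


def register(local, remote, name):
    """Record a connection under its pair of endpoints."""
    connections[(local, remote)] = name


def demultiplex(local, remote):
    """Find the connection an arriving segment belongs to, or None."""
    return connections.get((local, remote))


server = ("128.10.2.3", 25)
register(server, ("18.26.0.36", 1069), "connection A")
register(server, ("192.0.2.7", 1184), "connection B")  # hypothetical second client

print(demultiplex(server, ("18.26.0.36", 1069)))  # connection A
print(demultiplex(server, ("192.0.2.7", 1184)))   # connection B
```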

Page 31: CSC 600 Internetworking  with  TCP/IP

Connection Establishment

• Connection establishment
  – Three way handshake
  – Between pairs of ports
  – One port can connect to multiple destinations

Page 33: CSC 600 Internetworking  with  TCP/IP

Passive and Active Opens

• A client requests a connection – an active open request.

• A server must be waiting for the connection request – a passive open.

Page 34: CSC 600 Internetworking  with  TCP/IP

Connection Establishment

• Two way handshake
  – A sends SYN, B replies with SYN
  – Lost SYN handled by re-transmission
    • Can lead to duplicate SYNs
  – Ignore duplicate SYNs once connected
• Lost or delayed data segments can cause connection problems
  – Segment from old connections
  – Start segment numbers far removed from previous connection
    • Use SYN i
    • Need ACK to include i
    • Three Way Handshake

Page 35: CSC 600 Internetworking  with  TCP/IP

Two Way Handshake: Obsolete Data Segment

Page 36: CSC 600 Internetworking  with  TCP/IP

Two Way Handshake: Obsolete SYN Segment

Page 37: CSC 600 Internetworking  with  TCP/IP

Three Way Handshake: Examples

Page 38: CSC 600 Internetworking  with  TCP/IP

Connection Establishment

Page 39: CSC 600 Internetworking  with  TCP/IP

Three Way Handshake: State Diagram

Page 41: CSC 600 Internetworking  with  TCP/IP

Initial Sequence Number

• When a new connection is being established, the SYN flag is turned on. The sequence number field contains the ISN chosen by the host for this connection.
• The sequence number of the first byte of data sent by the host will be the ISN plus one because the SYN flag consumes a sequence number.
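The "ISN plus one" rule can be traced through the three segments of a handshake. The sketch below is illustrative only: the function, the dictionary layout, and the ISN values are assumptions, and a real TCP chooses its ISNs itself.

```python
MOD = 2**32  # sequence numbers are 32 bits wide


def three_way_handshake(isn_a, isn_b):
    """Return the three handshake segments for initiator A and responder B."""
    syn = {"flags": "SYN", "seq": isn_a}
    # B acknowledges A's SYN: the SYN consumed one sequence number.
    syn_ack = {"flags": "SYN+ACK", "seq": isn_b, "ack": (isn_a + 1) % MOD}
    # A acknowledges B's SYN; A's next byte is numbered ISN + 1.
    ack = {"flags": "ACK", "seq": (isn_a + 1) % MOD, "ack": (isn_b + 1) % MOD}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(100, 300):
    print(segment)
# {'flags': 'SYN', 'seq': 100}
# {'flags': 'SYN+ACK', 'seq': 300, 'ack': 101}
# {'flags': 'ACK', 'seq': 101, 'ack': 301}
```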

Page 42: CSC 600 Internetworking  with  TCP/IP

Connection Termination

• Entity in CLOSE WAIT state sends last data segment, followed by FIN
• FIN arrives before last data segment
• Receiver accepts FIN
  – Closes connection
  – Loses last data segment
• Associate sequence number with FIN
• Receiver waits for all segments before FIN sequence number
• Loss of segments and obsolete segments
  – Must explicitly ACK FIN

Page 43: CSC 600 Internetworking  with  TCP/IP

Data Transfer

• Data transfer
  – Logical stream of octets
  – Octets numbered modulo $2^{32}$
  – Flow control by credit allocation of number of octets
  – Data buffered at transmitter and receiver

Page 44: CSC 600 Internetworking  with  TCP/IP

Send Policy

• If no push or close, TCP entity transmits at its own convenience
• Data buffered at transmit buffer
• May construct segment per data batch
• May wait for certain amount of data

Page 45: CSC 600 Internetworking  with  TCP/IP

Deliver Policy

• In absence of push, deliver data at own convenience

• May deliver as each in order segment received

• May buffer data from more than one segment

Page 46: CSC 600 Internetworking  with  TCP/IP

Accept Policy

• Segments may arrive out of order
• In order
  – Only accept segments in order
  – Discard out of order segments
• In window
  – Accept all segments within receive window

Page 47: CSC 600 Internetworking  with  TCP/IP

Not Listening

• Reject with RST (Reset)
• Queue request until matching open issued
• Signal TS user to notify of pending request
  – May replace passive open with accept

Page 48: CSC 600 Internetworking  with  TCP/IP

Connection Termination

• Connection termination
  – Graceful close
  – TCP user issues CLOSE primitive
  – Transport entity sets FIN flag on last segment sent
  – Abrupt termination by ABORT primitive
    • Entity abandons all attempts to send or receive data
    • RST segment transmitted

Page 49: CSC 600 Internetworking  with  TCP/IP

Termination

• Either or both sides
• By mutual agreement
• Abrupt termination
• Or graceful termination
  – Close wait state must accept incoming data until FIN received

Page 51: CSC 600 Internetworking  with  TCP/IP

Side Initiating Termination

• TS user Close request
• Transport entity sends FIN, requesting termination
• Connection placed in FIN WAIT state
  – Continue to accept data and deliver data to user
  – Not send any more data
• When FIN received, inform user and close connection

Page 52: CSC 600 Internetworking  with  TCP/IP

Side Not Initiating Termination

• FIN received
• Inform TS user; place connection in CLOSE WAIT state
  – Continue to accept data from TS user and transmit it
• TS user issues CLOSE primitive
• Transport entity sends FIN
• Connection closed
• All outstanding data is transmitted from both sides
• Both sides agree to terminate

Page 54: CSC 600 Internetworking  with  TCP/IP

Usage of tcpdump

• A program called tcpdump has been installed on taz.cs.wcupa.edu for monitoring TCP mechanisms.

• It requires root privilege. So Dr. Kline set up a script called TCPDUMP for us to run tcpdump.

• For details, see homework sheet.

Page 55: CSC 600 Internetworking  with  TCP/IP

Output of tcpdump

• On taz.cs.wcupa.edu, each segment sent is printed out twice. It looks odd.
• TCPDUMP prints out each segment in the following format: source > destination: flags, where the flags are S (SYN), F (FIN), R (RST), P (PSH), or a dot (.) when none of these four flags is set.
• The sequence numbers are followed by the number of data bytes in parentheses. For example, 1415531521:1415531521(0) is a segment without data.
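The sequence field in the example can be pulled apart with a short regular expression. This sketch assumes the field always has the shape first:last(nbytes) shown above; the function name is made up, and other tcpdump versions may print sequence numbers differently.

```python
import re

# Matches fields such as 1415531521:1415531521(0).
SEQ_FIELD = re.compile(r"^(\d+):(\d+)\((\d+)\)$")


def parse_seq_field(field):
    """Return (first, last, nbytes) from a first:last(nbytes) field."""
    match = SEQ_FIELD.match(field)
    if match is None:
        raise ValueError("not a sequence field: %r" % field)
    first, last, nbytes = (int(group) for group in match.groups())
    return first, last, nbytes


print(parse_seq_field("1415531521:1415531521(0)"))
# (1415531521, 1415531521, 0)
```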

Page 56: CSC 600 Internetworking  with  TCP/IP

Output of tcpdump

• Option fields are printed out.
• MSS: maximum segment size
• WSCALE: window scale
• NOP: no operation (used for padding a field length to a multiple of four bytes).
• <mss 512,nop,wscale 0,nop,nop,timestamp 146647 0>

Page 57: CSC 600 Internetworking  with  TCP/IP

Flow Control

• Longer transmission delay between transport entities compared with actual transmission time
  – Delay in communication of flow control info
• Variable transmission delay
  – Difficult to use timeouts
• Flow may be controlled because:
  – The receiving user can not keep up
  – The receiving transport entity can not keep up
• Results in buffer filling up

Page 58: CSC 600 Internetworking  with  TCP/IP

The Idea Behind Sliding Windows

• A simple positive acknowledgement protocol wastes a substantial amount of network bandwidth because it must delay sending a new packet until it receives an acknowledgement for the previous packet.

Page 62: CSC 600 Internetworking  with  TCP/IP

Window Size and Flow Control

• TCP allows the window size to be changed over time.

• Each ACK, which specifies how many octets have been received, contains a window advertisement that specifies how many additional octets of data the receiver is prepared to receive.

• The advertisement reflects the buffer space currently available at the receiver.

Page 63: CSC 600 Internetworking  with  TCP/IP

Coping with Flow Control Requirements (1)

• Do nothing
  – Segments that overflow are discarded
  – Sending transport entity will fail to get ACK and will retransmit
    • Thus further adding to incoming data
• Refuse further segments
  – Clumsy
  – Multiplexed connections are controlled on aggregate flow

Page 64: CSC 600 Internetworking  with  TCP/IP

Coping with Flow Control Requirements (2)

• Use fixed sliding window protocol
  – See chapter 7 for operational details
  – Works well on reliable network
    • Failure to receive ACK is taken as flow control indication
  – Does not work well on unreliable network
    • Can not distinguish between lost segment and flow control
• Use credit scheme

Page 65: CSC 600 Internetworking  with  TCP/IP

Credit Scheme

• Greater control on reliable network
• More effective on unreliable network
• Decouples flow control from ACK
  – May ACK without granting credit and vice versa
• Each octet has sequence number
• Each transport segment has seq number, ack number and window size in header

Page 66: CSC 600 Internetworking  with  TCP/IP

Use of Header Fields

• When sending, seq number is that of first octet in segment
• ACK includes AN=i, W=j
• All octets through SN=i-1 acknowledged
  – Next expected octet is i
• Permission to send additional window of W=j octets
  – i.e. octets through i+j-1
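The arithmetic on AN=i and W=j can be checked with a small helper. The function name and the next_seq parameter are assumptions made for this sketch, and sequence number wraparound is ignored to keep it short.

```python
def send_window(an, w, next_seq):
    """Work out how much the sender may still transmit.

    an       -- acknowledgement number i (next octet the receiver expects)
    w        -- window (credit) j granted by the receiver
    next_seq -- number of the next octet the sender would transmit
    Returns (last permitted octet, octets still allowed).
    """
    last_permitted = an + w - 1
    remaining = max(0, last_permitted - next_seq + 1)
    return last_permitted, remaining


# Receiver says AN=1001, W=1400; sender has already sent through octet 1600.
print(send_window(1001, 1400, 1601))  # (2400, 800)
```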

Page 67: CSC 600 Internetworking  with  TCP/IP

Credit Allocation

Page 68: CSC 600 Internetworking  with  TCP/IP

Sending and Receiving Perspectives

Page 69: CSC 600 Internetworking  with  TCP/IP

Unreliable Network Service

• E.g.
  – internet using IP
  – frame relay using LAPF
  – IEEE 802.3 using unacknowledged connectionless LLC
• Segments may get lost
• Segments may arrive out of order

Page 70: CSC 600 Internetworking  with  TCP/IP

Ordered Delivery

• Segments may arrive out of order
• Number segments sequentially
• TCP numbers each octet sequentially
• Segments are numbered by the first octet number in the segment

Page 71: CSC 600 Internetworking  with  TCP/IP

Retransmission Strategy

• Segment damaged in transit
• Segment fails to arrive
• Transmitter does not know of failure
• Receiver must acknowledge successful receipt
• Use cumulative acknowledgement
• Time out waiting for ACK triggers re-transmission

Page 72: CSC 600 Internetworking  with  TCP/IP

Timer Value

• Fixed timer
  – Based on understanding of network behavior
  – Can not adapt to changing network conditions
  – Too small leads to unnecessary re-transmissions
  – Too large and response to lost segments is slow
  – Should be a bit longer than round trip time
• Adaptive scheme
  – May not ACK immediately
  – Can not distinguish between ACK of original segment and re-transmitted segment
  – Conditions may change suddenly

Page 73: CSC 600 Internetworking  with  TCP/IP

TCP Timers

• Retransmission Timer: started during a transmission. A timeout causes a retransmission.

• Persist Timer: ensures that window size information is transmitted even if no data is transmitted.

• Keepalive Timer: detects crashes on the other end of connection.

• Other Timers: delayed ACK timer, timeout of connection setup, abort timeout, 2MSL (Maximum Segment Lifetime) timeout (closing timeout).

Page 74: CSC 600 Internetworking  with  TCP/IP

Acknowledgement Policy

• Immediate
• Cumulative

Page 75: CSC 600 Internetworking  with  TCP/IP

Congestion Control

• RFC 1122, Requirements for Internet hosts
• Retransmission timer management
  – Estimate round trip delay by observing pattern of delay
  – Set time to value somewhat greater than estimate
  – Simple average
  – Exponential average
  – RTT Variance Estimation (Jacobson’s algorithm)

Page 76: CSC 600 Internetworking  with  TCP/IP

Congestion Control Avoidance

• TCP must remember the size of the receiver’s window. To control congestion, TCP maintains a second limit, called the congestion window limit, that is used to restrict data flow to less than the receiver’s buffer size.

• Multiplicative Decrease Congestion Avoidance: To estimate congestion window size, TCP assumes that most datagram loss comes from congestion and uses the following strategy:

Page 77: CSC 600 Internetworking  with  TCP/IP

Congestion Control Avoidance

• Upon loss of a segment, the sender reduces the congestion window by half. For those segments that remain in the allowed window, it backs off the retransmission timer exponentially.

• Slow Start: When starting a connection or when congestion ends, start the congestion window at one segment and increase it by one segment for each ACK, so that it grows exponentially until it reaches the receiver’s window limit.

• The term slow start is a misnomer since the congestion window grows exponentially.

Page 78: CSC 600 Internetworking  with  TCP/IP

Congestion Control

• Slow start
  – awnd = MIN[credit, cwnd]
  – Start connection with cwnd=1
  – Increment cwnd at each ACK, to some max
• Dynamic window sizing on congestion
  – When a timeout occurs
  – Set slow start threshold to half current congestion window
    • ssthresh=cwnd/2
  – Set cwnd = 1 and slow start until cwnd=ssthresh
    • Increasing cwnd by 1 for every ACK
  – For cwnd >=ssthresh, increase cwnd by 1 for each RTT
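The rules above can be traced with a coarse, per-round-trip model. This is a sketch under simplifying assumptions, not a TCP implementation: cwnd is counted in segments, "plus one per ACK" is modelled as doubling once per round trip, and flooring ssthresh at 1 is a choice made only for this example.

```python
def allowed_window(credit, cwnd):
    """awnd = MIN[credit, cwnd]."""
    return min(credit, cwnd)


def next_cwnd(cwnd, ssthresh, timeout=False):
    """Advance cwnd by one round trip; returns (cwnd, ssthresh)."""
    if timeout:
        # ssthresh = cwnd / 2, then restart slow start from cwnd = 1.
        return 1, max(cwnd // 2, 1)
    if cwnd < ssthresh:
        # Slow start: +1 per ACK doubles cwnd each round trip.
        return min(cwnd * 2, ssthresh), ssthresh
    # cwnd >= ssthresh: increase by 1 for each RTT.
    return cwnd + 1, ssthresh


def simulate(rounds, ssthresh, timeout_rounds=()):
    """Return the cwnd history, starting from cwnd = 1."""
    cwnd = 1
    history = [cwnd]
    for r in range(1, rounds + 1):
        cwnd, ssthresh = next_cwnd(cwnd, ssthresh, timeout=r in timeout_rounds)
        history.append(cwnd)
    return history


print(simulate(9, 8, timeout_rounds={6}))
# [1, 2, 4, 8, 9, 10, 1, 2, 4, 5]
```

In the trace, the timeout in round 6 halves the threshold from the current window of 10 to 5, so the second slow start stops at 5 instead of 8.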

Page 79: CSC 600 Internetworking  with  TCP/IP

Response to Congestion

• How can a router avoid global congestion?
  – Tail drop: if the input queue is full when a datagram arrives, discard the datagram.
  – Random Early Discard (RED)

Page 80: CSC 600 Internetworking  with  TCP/IP

RED

• A router uses two threshold values to mark positions in the queue: Tmin and Tmax. The general operation of RED can be described by three rules that determine the disposition of each arriving datagram:
  – If the queue currently contains fewer than Tmin datagrams, add the new datagram to the queue.
  – If the queue contains more than Tmax datagrams, discard the new datagram.
  – If the queue contains between Tmin and Tmax datagrams, randomly discard the datagram according to a probability p.
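The three rules translate almost directly into a decision function. The sketch below is simplified: it takes p as a fixed input and looks at the instantaneous queue length, whereas RED as deployed derives p from an averaged queue size. The function name and the injectable rng parameter are assumptions made so the example is testable.

```python
import random


def red_accept(queue_len, t_min, t_max, p, rng=random.random):
    """Return True to enqueue the arriving datagram, False to discard it."""
    if queue_len < t_min:
        return True           # rule 1: short queue, always accept
    if queue_len > t_max:
        return False          # rule 2: long queue, always discard
    return rng() >= p         # rule 3: discard with probability p


print(red_accept(3, 5, 15, 0.5))   # True
print(red_accept(20, 5, 15, 0.5))  # False
```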

Page 81: CSC 600 Internetworking  with  TCP/IP

Timeout and Retransmit

• TCP maintains queue of segments transmitted but not acknowledged

• TCP will retransmit if not ACKed in given time

• Measurements of round trip times vary dramatically over time.

Page 83: CSC 600 Internetworking  with  TCP/IP

Exponential RTO Backoff

• Since timeout is probably due to congestion (dropped packet or long round trip), maintaining a constant RTO is not a good idea

• RTO increased each time a segment is re-transmitted

Page 85: CSC 600 Internetworking  with  TCP/IP

Timeout and Retransmit

• Adaptive retransmission algorithm(RFC 793)

$RTT = \alpha \cdot \text{old RTT} + (1 - \alpha) \cdot \text{new Round Trip Sample}$

$RTO = \beta \cdot RTT$ (usually $\beta = 2$)
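A minimal sketch of the adaptive retransmission formulas, with the two weights written as alpha and beta. The default alpha of 0.875 is only an assumed example value, and the function names are made up for this illustration.

```python
def update_rtt(old_rtt, sample, alpha=0.875):
    """Weighted average of the old RTT estimate and the new sample."""
    return alpha * old_rtt + (1 - alpha) * sample


def retransmission_timeout(rtt, beta=2.0):
    """RTO is a multiple of the RTT estimate."""
    return beta * rtt


rtt = update_rtt(100.0, 200.0)
print(rtt)                          # 112.5
print(retransmission_timeout(rtt))  # 225.0
```

An alpha close to 1 makes the estimate slow to follow a single unusual sample; an alpha close to 0 makes it track each new sample quickly.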

Page 86: CSC 600 Internetworking  with  TCP/IP

Use of Exponential Averaging

Page 87: CSC 600 Internetworking  with  TCP/IP

Responding to High Variance in Delay - Jacobson’s Algorithm

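$DIFF = \text{Sample} - \text{old RTT}$

$\text{smoothed RTT} = \text{old RTT} + \delta \cdot DIFF$

$\text{mean deviation} = \text{old mean deviation} + \rho \cdot (|DIFF| - \text{old mean deviation})$

$RTO = \text{smoothed RTT} + \eta \cdot \text{mean deviation}$

$\delta$, $\rho$: between 0 and 1, each chosen as an inverse of a power of 2

$\eta$: a factor that controls how much the mean deviation affects the timeout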
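The four formulas can be written as one update step. In this sketch the three weights are named delta, rho, and eta; those names and the default values (1/8, 1/4, and 4) are assumptions chosen for illustration, not values taken from the slide.

```python
def jacobson_update(old_rtt, old_dev, sample, delta=1 / 8, rho=1 / 4, eta=4):
    """One step of the RTT and mean deviation estimator.

    Returns (smoothed RTT, mean deviation, RTO).
    """
    diff = sample - old_rtt
    smoothed_rtt = old_rtt + delta * diff
    mean_dev = old_dev + rho * (abs(diff) - old_dev)
    rto = smoothed_rtt + eta * mean_dev
    return smoothed_rtt, mean_dev, rto


print(jacobson_update(100.0, 10.0, 180.0))
# (110.0, 27.5, 220.0)
```

A single sample far from the estimate moves the smoothed RTT only slightly but raises the mean deviation, so the timeout widens when delay becomes more variable.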

Page 89: CSC 600 Internetworking  with  TCP/IP

Karn’s Algorithm

• If a segment is re-transmitted, the ACK arriving may be:
  – for the first copy of the segment, then RTT longer than expected
  – for second copy, then RTT shorter than expected
• No way to tell
• Do not measure RTT for re-transmitted segments
• Calculate backoff when re-transmission occurs
• Use backoff RTO until ACK arrives for segment that has not been re-transmitted
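These rules can be sketched as a small timer object. The class name, the doubling backoff, the 64-second cap, and the alpha and beta defaults are all assumptions made for this example.

```python
class KarnTimer:
    """Retransmission timer that ignores ambiguous RTT samples."""

    def __init__(self, rtt, alpha=0.875, beta=2.0, max_rto=64.0):
        self.rtt = rtt
        self.alpha = alpha
        self.beta = beta
        self.max_rto = max_rto
        self.rto = beta * rtt

    def on_timeout(self):
        """A segment is re-transmitted: back off the RTO."""
        self.rto = min(self.rto * 2, self.max_rto)

    def on_ack(self, sample, was_retransmitted):
        """Update the estimate only from unambiguous samples."""
        if was_retransmitted:
            return  # cannot tell which copy was ACKed; keep backed-off RTO
        self.rtt = self.alpha * self.rtt + (1 - self.alpha) * sample
        self.rto = self.beta * self.rtt


timer = KarnTimer(rtt=1.0)
timer.on_timeout()
timer.on_ack(0.1, was_retransmitted=True)
print(timer.rtt, timer.rto)  # 1.0 4.0
timer.on_ack(1.0, was_retransmitted=False)
print(timer.rtt, timer.rto)  # 1.0 2.0
```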

Page 90: CSC 600 Internetworking  with  TCP/IP

Silly Window Syndrome

• A problem occurs when the sender and the receiver operate at different speeds.
• When the receiving application reads an octet of data from a full buffer, one octet of space becomes available. The TCP on the receiver generates a segment to inform the sender that 1 octet is available.
• The sender sends out a segment of one byte.
• This results in a series of small data segments: silly window syndrome (SWS).

Page 91: CSC 600 Internetworking  with  TCP/IP

Silly Window Syndrome Avoidance-Nagle Algorithm

• Receiver-side avoidance: delay acknowledgement
• Sender-side avoidance: delay transmission adaptively.
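Sender-side avoidance can be sketched as a single decision: send a full segment at once, and send a small one only when nothing is waiting for an ACK. The function and parameter names are assumptions, and the receiver's advertised window is ignored to keep the example short.

```python
def nagle_can_send(buffered_bytes, mss, unacked_bytes):
    """Decide whether the sender should transmit now."""
    if buffered_bytes <= 0:
        return False          # nothing to send
    if buffered_bytes >= mss:
        return True           # a full segment is ready
    # Small amount of data: send only if no earlier data is unacknowledged,
    # otherwise keep buffering until an ACK arrives.
    return unacked_bytes == 0


print(nagle_can_send(1, 536, 0))      # True
print(nagle_can_send(1, 536, 100))    # False
print(nagle_can_send(536, 536, 100))  # True
```

The delay is adaptive because the wait ends when the ACK arrives: on a fast path small segments go out almost immediately, while on a slow path more data accumulates per segment.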

Page 92: CSC 600 Internetworking  with  TCP/IP

Multiplexing

• Multiple users employ same transport protocol
• User identified by port number or service access point (SAP)
• May also multiplex with respect to network services used
  – e.g. multiplexing a single virtual X.25 circuit to a number of transport service users
    • X.25 charges per virtual circuit connection time

Page 93: CSC 600 Internetworking  with  TCP/IP

Duplication Detection

• If a segment is lost and retransmitted, no confusion will result.

• If, however, an ACK is lost, one or more segments will be retransmitted and, if they arrive successfully, the receiver must be able to recognize duplicates.

Page 94: CSC 600 Internetworking  with  TCP/IP

Duplication Detection

• Duplicate received prior to closing connection
  – Receiver assumes ACK lost and ACKs duplicate
  – Sender must not get confused with multiple ACKs
  – Sequence number space large enough to not cycle within maximum life of segment
• Duplicate received after closing connection

Page 95: CSC 600 Internetworking  with  TCP/IP

Crash Recovery

• After restart all state info is lost
• Connection is half open
  – Side that did not crash still thinks it is connected
• Close connection using persistence timer
  – Wait for ACK for (time out) * (number of retries)
  – When expired, close connection and inform user
• Send RST i in response to any i segment arriving
• User must decide whether to reconnect
  – Problems with lost or duplicate data

Page 96: CSC 600 Internetworking  with  TCP/IP

UDP

• User datagram protocol
• RFC 768
• Connectionless service for application level procedures
  – Unreliable
  – Delivery and duplication control not guaranteed
• Reduced overhead
• e.g. network management (Chapter 19)

Page 97: CSC 600 Internetworking  with  TCP/IP

UDP Uses

• Inward data collection
• Outward data dissemination
• Request-Response
• Real time application

Page 98: CSC 600 Internetworking  with  TCP/IP

UDP Header

Page 99: CSC 600 Internetworking  with  TCP/IP

Recommended Reading

• Comer: chapter 12, chapter 13
• Stallings: chapter 17
• RFCs