Upload
jerome-hudson
View
216
Download
1
Tags:
Embed Size (px)
Citation preview
05. Apr. 2006 1 INF-3190: Transport Layer
Transport Layer
Foreleser: Carsten GriwodzEmail: [email protected]
05. Apr. 2006 2 INF-3190: Transport Layer
Transport Layer Function Formal tasks of the transport layer
Addressing Machines and processes
Transparent data transfer between end points Hiding network topology Hiding specific network peculiarities De-/multiplexing
Quality of service Error recovery Reliability Flow control
Hiding end-system speed differences Link layer flow control between direct neighbors
Congestion control Adapt speed to end-to-end network capabilities
… End-to-end connection management
05. Apr. 2006 3 INF-3190: Transport Layer
Transport Service: Terminology Entities exchanged
TCP/IP: Message = ISO: TPDU - Transport Protocol Data Unit
Nesting of messages, packets, and frames
LayerTransportNetworkData linkPhysical
Data UnitMessagePacketFrame
Bit/byte (bitstream)
Message Payload
Packet Payload
Frame Payload
Message headerPacket header
Frame header
05. Apr. 2006 4 INF-3190: Transport Layer
Transport Service Connection oriented
service 3 phases
connection set-up data transfer disconnect
Connectionless service Transfer of isolated units
Realization: transport entity Software and/or hardware Software part usually
contained within the kernel (process, library)
Layer 4 hardware support: spurious
1-2
3
4
5Application
Layer
TransportEntity
NetworkLayer
ApplicationLayer
TransportEntity
NetworkLayer
Service Interface
TransportProtocolPort
IP: Message
TCP/IP: Port =ISO: TSAP - Transport Service Access Point
05. Apr. 2006 5 INF-3190: Transport Layer
Transport Service Similar services of
Network layer and transport layer Why 2 Layers?
Network service Not to be self-governed or influenced by the user Independent from application & user
enables compatibility between applications Provides for example
“only” connection oriented communications or “only” unreliable data transfer
Transport service To improve the network services that
users and higher layers want to get from the network layer, e.g.
reliable service necessary time guarantees
End system End system
4
3
2
1
5
3
2
1
Intermediate system
05. Apr. 2006 6 INF-3190: Transport Layer
Transport Service Transport layer
Isolates upper layers from technology, design and imperfections of subnet
Traditionally distinction made in TCP/IP between Layers 1 – 4
transport service provider Layers above 4
transport service user
Transport layer has key role Major boundary between provider and user of reliable
data transmission service
05. Apr. 2006 7 INF-3190: Transport Layer
Transport Service Transport protocols of TCP/IP protocols
Services provided implicitly (ISO protocols offer more choice)
TCPUDP SCTPDCCPConnection-oriented serviceConnectionless serviceOrdered
ReliableUnordered
UnreliableWith congestion controlWithout congestion controlMulticast supportMultihoming support
X X XX
X X
X X XX X
X X XX X X
XX X
X
Partially Ordered X
Partially Reliable X
05. Apr. 2006 8 INF-3190: Transport Layer
Addressing at the Transport Layer
Applications … … require communication … communicate
locally by interprocess communication between system via transport services
Transport layer Interprocess communication via communication networks
Internet Protocol IP Enables endsystem-to-endsystem communication Not application to application
Telnetclient
Telnetserver
FTPclient
FTPserver
Webclient
Webserver
Transport
Network
Data link
Physical
05. Apr. 2006 9 INF-3190: Transport Layer
Addressing at the Transport Layer
Transport address different from network address Sender process must address receiver process Receiver process can be approached by the sender process
1-2
3
4
5
TransportEntity
NetworkLayer
TransportEntity
NetworkLayer
Processes
Transport addresses
Network addresses
05. Apr. 2006 10 INF-3190: Transport Layer
Addressing at the Transport Layer
TCP and UDP have their own assignments this table shows some examples for TCP (read /etc/services for more)
Decimal
Keyword UNIX keyword Description
20 FTP-DATA ftp-data File transfer protocol (data)21 FTP ftp File transfer protocol (control)22 SSH ssh Secure shell23 TELNET telnet Terminal Connections25 SMTP smtp Simple mail transfer protocol37 TIME time Time42 NAMESERVE
Rname Host name server
53 DOMAIN nameserver Domain Name Server80 HTTP HTTP World Wide Web110 POP3 pop3 Remote Email Access111 SUN RPC sunrpc SUN Remote Procedure Call119 NNTP nntp USENET News Transfer Protocol
05. Apr. 2006 11 INF-3190: Transport Layer
Multiplexing task of the Transport Layer
Multiplexing and demultiplexing task of the transport layer Example: accessing a web page with video element
Three protocols used (minor simplification) HTTP for web page RTSP for video control RTP for video data
1-2
3
4
5
TCP
NetworkLayer
Webserver
Videoserver
UDP
Webbrowser
Videoplugin
TCP
NetworkLayer
UDP
port 80
port 554
portsdynamically
chosen
05. Apr. 2006 12 INF-3190: Transport Layer
Multiplexing task of the Transport Layer
Multiplexing and demultiplexing task of the transport layer Example: accessing a web page with video element
Three protocols used (minor simplification) HTTP for web page RTSP for video control RTP for video data
1-2
3
4
5
TCP
IP addr 1
Webserver
Videoserver
UDP
Webbrowser
Videoplugin
TCP
IP addr 2
UDPmultiplexing demultiplexing
same IP address forall services
05. Apr. 2006 13 INF-3190: Transport Layer
Transport Service Transport protocols of TCP/IP protocols
Services provided implicitly (ISO protocols offer more choice)
TCPUDP SCTPDCCPConnection-oriented serviceConnectionless serviceOrdered
ReliableUnordered
UnreliableWith congestion controlWithout congestion controlMulticast supportMultihoming support
X X XX
X X
X X XX X
X X XX X X
XX X
X
Partially Ordered X
Partially Reliable X
05. Apr. 2006 14 INF-3190: Transport Layer
UDP: Message Format
• Source port• Optional• 16 bit sender identification• Response may be sent there
• Destination port• Receiver identification
• Packet length• In byte (including UDP header)• Minimum: 8 (byte)
i.e. header without data
• Checksum• Optional• Checksum of header and data
for error detection
Destination AddressSource address
Time to live Protocol Header checksumIdentification DM Fragment offset
Version IHL Type of service Total lengthPRE ToS
Data
OptionsSource port Destination port
IP header
UDP headerPacket length ChecksumUsed for demultiplexing:
service address
05. Apr. 2006 15 INF-3190: Transport Layer
TCP: Message Format• TCP/IP Header Format
Destination AddressSource address
Time to live Protocol Header checksumIdentification DM Fragment offset
Version IHL Type of service Total lengthPRE ToS
Data
OptionsSource port Destination port
Sequence numberPiggyback acknowledgement
THL F WindowSRPAUunusedChecksum Urgent pointer
Options (0 or more 32 bit words)
IP header
TCP header
Used for demultiplexing:identifies connection
Used for demultiplexing:service address for connection setup
05. Apr. 2006 17 INF-3190: Transport Layer
UDP - User Datagram Protocol• Specification
• RFC 768
• UDP is a simple transport protocol• Unreliable• Connectionless• Message-oriented
• UDP is mostly IP with short transport header• Source and destination port• Ports allow for dispatching of messages to receiver process
05. Apr. 2006 18 INF-3190: Transport Layer
UDP Characteristics• No flow control
• Application may transmit as fast as it can / want and the network permits
• No error control or retransmission• No guarantee about packet sequencing• Packet delivery to receiver not ensured• Possibility of duplicated packets
• May be used with broadcast / multicasting and streaming
05. Apr. 2006 19 INF-3190: Transport Layer
UDP: Message Format
• Source port• Optional• 16 bit sender identification• Response may be sent there
• Destination port• Receiver identification
• Packet length• In byte (including UDP header)• Minimum: 8 (byte)
i.e. header without data
• Checksum• Optional• Checksum of header and data
for error detection
Destination AddressSource address
Time to live Protocol Header checksumIdentification DM Fragment offset
Version IHL Type of service Total lengthPRE ToS
Data
OptionsSource port Destination port
IP header
UDP headerPacket length Checksum
05. Apr. 2006 20 INF-3190: Transport Layer
UDP: Message Format – Checksum
• Purpose• Error detection (header and data)
• Algorithm• One’s complement of sum of 16-bit halfwords in one’s
complement arithmetic• UDP checksum includes
• UDP header (checksum field initially set to 0)• Data• Pseudoheader
• Part of IP header• source IP address• destination IP address• Protocol• length of (UDP) data
• Allows to detect misdelivered UDP messages• Use of checksum optional
• i.e., if checksum contains only "0"s, it is not used• Transmit 0xFFFF if calculated checksum is 0
Destination AddressSource address
00000000 Protocol=17 UDP segment length
05. Apr. 2006 21 INF-3190: Transport Layer
UDP: Ranges of Application• Benefits of UDP
• Needs very few resources (in endsystems, in each data unit)• No connection establishment (hence, no overhead, faster
transmission)• Simple implementation possible (e.g. suitable for boot PROM)• Applications can precisely control
• Packet flow• Error handling / reliability• Timing
• Problems of UDP• No flow control
• Packets can be lost because receiving application is too slow• No congestion control
• No reaction to network overload
05. Apr. 2006 22 INF-3190: Transport Layer
UDP: Ranges of Application• Suitable
• For simple client-server interactions, i.e. typically• 1 request packet from client to server• 1 response packet from server to client
• When delay is worse than packet loss and duplication• Video conferencing• IP telephony• Gaming
• Used by e.g.• DNS: Domain Name Service ¹• SNMP: Simple Network Management Protocol• BOOTP: Bootstrap protocol• TFTP: Trivial File Transfer Protocol• NFS: Network File System ¹• NTP: Network Time Protocol ¹• RTP: Real-time Transport Protocol ¹
¹ can also be used with TCP
05. Apr. 2006 24 INF-3190: Transport Layer
TCP - Transmission Control Protocol TCP is the main transport protocol of the Internet
History RFC 793: originally RFC 1122 and RFC 1323: errors corrected, enhancements
implemented
Motivation: network with connectionless service Packets and messages may be
duplicated, in wrong order, faulty i.e., with such service only, each application would have to provide recovery
error detection and correction Network or service can
impose packet length define additional requirements to optimize data transmission i.e., application would have to be adapted
TCP provides Reliable end-to-end byte stream over an unreliable network service
05. Apr. 2006 25 INF-3190: Transport Layer
What is TCP? TCP is
A transport protocol specification Not a piece of software
TCP specifies Data and control information formats Procedures for
flow control error detection and correction connect and disconnect
As a primary abstraction a connection not just the relationships of ports (as a queue, like UDP)
TCP does not specify The interface to the application (sockets, streams) Interfaces are specified separately: e.g. Berkeley Socket API,
WINSOCK
05. Apr. 2006 26 INF-3190: Transport Layer
TCP Characteristics Data stream oriented
TCP transfers serial byte stream Maintains sequential order
Unstructured byte stream Application often has to transmit more structured data TCP does not support such groupings into (higher) structures
within byte stream Buffered data transmission
Byte stream not message stream: message boundaries are not preserved
no way for receiver to detect the unit(s) in which data were written
For transmission the sequential data stream is Divided into segments Delayed if necessary (to collect data)
D A B C D
IP header TCP header
data from / to TCP applicationdata sent via IP
A B C
WRITE / READ call
05. Apr. 2006 27 INF-3190: Transport Layer
TCP Characteristics Virtual connection
Connection established between communication parties before data transmission
Two-way communications (fully duplex) Data may be transmitted simultaneously in both directions
over a TCP connection
Point-to-point Each connection has exactly two endpoints
Reliable Fully ordered, fully reliable
Sequence maintained No data loss, no duplicates, no modified data
05. Apr. 2006 28 INF-3190: Transport Layer
TCP Characteristics• Error detection
• Through checksum• Piggybacking
• Control information and data can be transmitted within the same segment
• Out-of-band data: expedited data• Important information sent to receiver• i.e. should get to receiver’s application before data that was
sent earlier• OOB operation
• Data is not stored in a buffer• Sent immediately and immediately made available to application
on receiver’s end• example <CR> end of line at terminal emulation
• Urgent flag• Send and transfer data to application immediately
• example <Crtl C>arrival interrupts receiver’s application
05. Apr. 2006 29 INF-3190: Transport Layer
TCP Characteristics• No broadcast
• No possibility to address all applications• With connect, however, not necessarily sensible
• No multicasting• Group addressing not possible
• No QoS parameters• Not suited for different media characteristics
• No real-time support• No correct treatment / communications of audio or video
possible• E.g. no forward error correction
05. Apr. 2006 30 INF-3190: Transport Layer
TCP in Use & Application AreasBenefits of TCP Reliable data transmission
Efficient data transmission despite complexity Can be used with LAN and WAN for
low data rates (e.g. interactive terminal) and high data rates (e.g. file transfer)
Disadvantages when compared with UDP Higher resource requirements
buffering, status information, timer usage Connection set-up and disconnect necessary
even with short data transmissions
Applications File transfer (FTP) Interactive terminal (Telnet) E-mail (SMTP) X-Windows
05. Apr. 2006 31 INF-3190: Transport Layer
Connection – Addressing
Passive open Process indicates that it would accept connect request
Active open Process requests a connection
Addressing Port number + protocol identification
Identifies entity in the end system IP address
Identifies end system
ProcessProcess PortPort
Connection
05. Apr. 2006 32 INF-3190: Transport Layer
Connection – Addressing• TCP service obtained via service endpoints on sender and
receiver• Typically socket• Socket number consists of
• IP address of host and• 16-bit local number (port)
• Transport Service Access Point• Port
• TCP connection is clearly definedby a quintuple consisting of
• IP address of sender and receiver• Port address of sender and receiver• TCP protocol identifier
• Applications can use the same localports for several connections
2
3
1
4
1.1.1.1 2.2.2.2
3.3.3.3
Server offers (IP addr/port/TCPid)(1.1.1.1/1/6)
(IP addr sender/port sender/ IP addr recv/port recv/TCPid)(3.3.3.3/4/1.1.1.1/1/6)
(2.2.2.2/3/1.1.1.1/1/6)
(2.2.2.2/2/1.1.1.1/1/6)
05. Apr. 2006 34 INF-3190: Transport Layer
Connection Establishment One passive & one active side
Server: wait for incoming connection using LISTEN and ACCEPT
Client: CONNECT (specifying IP addr. and port, max. TCP segment size)
Three-Way-Handshake Connecting through 3 packets
Comments When establishing a
connection initial sequence numbers of
both partners are also exchanged and acknowledged
initial seq.# is not 0 SYN segment consumes one
byte of sequence space in order to be acknowledged
unambiguously
send SYN(SEQ=x)
receiveSYN+ACK
sendSYN(SEQ=x+1)
ACK(SEQ=y+1)
Host 1Client
Host 2Server
receive SYN
send SYN(SEQ=y)ACK(SEQ=x+1)
receiveSYN+ACK
time time
05. Apr. 2006 35 INF-3190: Transport Layer
Connection Establishment Comments
If on server side no process is waiting on port (no process did LISTEN)
Reply packet with RST bit set is sent to reject connection attempt
Process listening on port may accept or reject
send SYN(SEQ=x)
Host 1Client
Host 2No server
receive SYN
send RST
time time
05. Apr. 2006 36 INF-3190: Transport Layer
Connection Establishment Call collision
Still only one single connection will be established even when
both partners actively try to establish a connection simultaneously
send SYN(SEQ=x)
receiveSYN+ACK
sendSYN(SEQ=x)
ACK(SEQ=y+1)
Host 1Client &Server
Host 2Client &Server
receive SYN
send SYN(SEQ=y)ACK(SEQ=x+1)
receiveSYN+ACK
time time
send SYN(SEQ=y)
receive SYN
05. Apr. 2006 37 INF-3190: Transport Layer
Connection Release Connection release for pairs of simplex connections
each direction is released independently of the other
Connection release by either side sending a segment with FIN bit set
no more data to be transmitted when FIN is acknowledged, this direction is shut down for new
data
Directions are released independently other direction may still be open full release of connection if both directions have been shut
down
05. Apr. 2006 38 INF-3190: Transport Layer
Connection Release Systematic disconnect by 4
packets between 2nd and 3rd
host 2 can still send data to host 1
3 packets possible first ACK and second FIN may
be contained in same segment
Connection interrupt: Opposite side cannot transmit data anymore
immediate acknowledgement, release of all resources
data in transit may be lost
send FIN(seq=x)
receiveACK+FIN
sendACK(SEQ=y+1)
Host 1Peer
Host 2Peer
receive FIN
send FIN(SEQ=y)ACK(SEQ=x+1)
receive ACK
time time
sendACK(SEQ=x+1)receive ACK
05. Apr. 2006 39 INF-3190: Transport Layer
Connection Management Modelling
States
State Description
CLOSED No connection is active or pending
LISTEN Server is waiting for an incoming call
SYN RCVD Connection request has arrived, wait for ACK
SYN SENT Application has started to open a connection
ESTABLISHED Normal data transfer state
FIN WAIT 1 Application has said it is finished
FIN WAIT 2 The other side has agreed to release
TIMED WAIT Wait for all packets to die off
CLOSING Both sides have tried to close simultaneously
CLOSE WAIT The other side has initiated a release
LAST ACK Wait for all packets to die off
05. Apr. 2006 40 INF-3190: Transport Layer
States
CLOSED
LISTEN
SYN RCVD SYN SENT
ESTABLISHED
FIN WAIT 1
FIN WAIT 2
CLOSING
TIME WAIT
CLOSE WAIT LAST ACK
Send SYN
Recv SYN ACK
Send ACK
Send FIN
Recv ACK
Recv FIN Send ACK
Timeout
Send FIN
Recv SYN
Send SYN ACK
Recv FINSend ACK
Recv ACK
Timeout
Recv RST
Recv SYN Send SYN ACK
Send FIN
Recv FIN Send ACKRecv FIN ACKSend ACK
Recv ACK
Send SYN
05. Apr. 2006 41 INF-3190: Transport Layer
Typical State Sequenceof a TCP Client
CLOSED
LISTEN
SYN RCVD SYN SENT
ESTABLISHED
FIN WAIT 1
FIN WAIT 2
CLOSING
TIME WAIT
CLOSE WAIT LAST ACK
Send SYN
Recv SYN,ACKSend ACK
Send FIN
Recv ACK
Recv FIN, Send ACK Timeout
data
05. Apr. 2006 42 INF-3190: Transport Layer
Typical State Sequenceof a TCP Server
CLOSED
LISTEN
SYN RCVD SYN SENT
ESTABLISHED
FIN WAIT 1
FIN WAIT 2
CLOSING
TIME WAIT
CLOSE WAIT LAST ACK
Timeout
Recv SYNSend SYN,ACK
Recv ACK
Recv FIN,Send ACK Send FIN
data
05. Apr. 2006 43 INF-3190: Transport Layer
Transport layer
Reliability and Ordering: Generic approaches
05. Apr. 2006 44 INF-3190: Transport Layer
Reliability and Ordering
Transport layer must handle Packet loss Packet duplication Multiplexing and demultiplexing of connections
Packet loss Retransmission
Used with various ACK and NACK schemes Forward error correction
Not typically used by the transport layer
05. Apr. 2006 45 INF-3190: Transport Layer
Duplicates Initial Situation: Problem
Network has Varying transit times for packets Certain loss rate Storage capabilities
Packets can be Manipulated Duplicated Resent by the original system
after timeout
In the following, uniform term: “Duplicate”
A duplicate originates due to one of the above mentioned reasonsand
Is at a later (undesired) point in time passed to the receiver
Customer Bank
time time
CR
CC
DATA
ACK
REL
CC
ACK
CR
DATA
REL
DUP
DUP
DUP
Moneytransfer
Moneytransferis repeated
05. Apr. 2006 46 INF-3190: Transport Layer
Duplicates Possible error causes and
consequences Cause
Network capabilities Flood-and-prune approach to routing in
wireless networks All acknowledgements lost
Consequence Duplication of sender’s packets Duplicates arrive in the same order as
originals
Cause Man-in-the-middle attack
Packets are captured and replayed
Consequence Controlled duplication of sender’s
packets Duplicates arrive in an order expected
by the application
Result Without additional
means Receiver cannot
differentiate between correct data and duplicated data
Would re-execute the transaction
CC
ACK
CR
DATA
REL
DUP
DUP
DUP
05. Apr. 2006 47 INF-3190: Transport Layer
Duplicates: Problematic Issues 3 somehow disjoint problems
How to handle duplicates within a connection?
What characteristics have to be taken into account regarding …
Consecutive connectionsor
Connections which are being re-established after a crash?
What can be done to ensure that a connection has been established?
Has actually been initiated byand
With the knowledge of both communicating parties?
05. Apr. 2006 48 INF-3190: Transport Layer
Duplicates: Methods of Resolution
• Using temporarily valid ports
• Method• Port valid for one connection only• Generate always new port
• Evaluation• In general not applicable:
process server addressing method not possible, because
• Server is reached via a designated port• Some ports always exist as "well known“
05. Apr. 2006 49 INF-3190: Transport Layer
Duplicates: Methods of Resolution
• Identify connections individually
• Method• Each individual connection is assigned a new sequence
number and• End-systems remember already assigned sequence
numbers
• Evaluation• End-systems must be capable of storing this information• Prerequisite
• Connection oriented system• End-systems, however, will be switched off and
it is necessary that the information is reliably available whenever needed
05. Apr. 2006 50 INF-3190: Transport Layer
Duplicates: Methods of Resolution
• Identify packets individually• Individual sequential numbers for each packet
• Method• Sequence number basically never gets reset• e.g. 48 bit at 1000 msg/sec: reiteration after ~8930 years
• Evaluation• Higher usage of bandwidth and memory• Sensible choice of the sequential number range depends
on• The packet rate• A packet’s probable "lifetime" within the network
• Discussed in more detail in the following
05. Apr. 2006 51 INF-3190: Transport Layer
Duplicates: Limiting Packet Lifetime• Enabling the above Method 3
Identify packets individually: individual sequential numbers for each packet
• Sequence number only reissued if• All packets with this sequence
numberor references to this sequence number are extinct
• i.e., ACK (N-ACK) have to be included• Otherwise new packet may be
wrongfully confirmed or non-confirmed by delayed ACK (N-ACK)
• Mandatory prerequisite for this solution
• Limited packet lifetime• I.e. introduction of a respective
parameter T
Example 1 (in principle)
T=2t+
MSG(s) t
MSG(s)t
Example 2: Request/responseTaking processing time into account
T=2t+Emax
REQ(s) t
RES(s)t
Emax
05. Apr. 2006 52 INF-3190: Transport Layer
Duplicates: Limiting Packet Lifetime• Limitation by appropriate network design
• Inhibit loops• Limitation of delays in subsystems & adjacent systems
• Hop-counter / time-to-live in each packet• Counts traversed systems• If counter exceeds maximum value
packet is discarded• Requirement: known maximal time for one hop
• Time marker in each packet• Packet exceeds maximum configurable lifetime
packet is discarded• Requirement: “consistent” network time
05. Apr. 2006 53 INF-3190: Transport Layer
Duplicates: Limiting Packet Lifetime• Determining maximum time T, which a packet
may remain in the network• T is a small multiple of the (real maximal) packet
lifetime t• T time units after sending a packet
• The packet itself is no longer valid• All of its (N)ACKs are no longer valid
05. Apr. 2006 54 INF-3190: Transport Layer
Handling of Consecutive Connections• Method
• End-systems timer continues to run at switch-off / system crash• Allocation of initial sequence number (ISN) depends on
• time markers (linear or stepwise curve because of discrete time)• Sequence numbers can be allocated consecutively within a
connection (steadily growing curve)
time
SeqN
o
ISNInitial Sequence Number
“Forbidden Region”Width T (max. theoretic packet lifetime)
e.g. 232-1 Wraparound
ISN ISN
T
05. Apr. 2006 55 INF-3190: Transport Layer
Handling of Consecutive Connections
• No problem, if• “Normal lived” session (shorter than wrap-around time) with data rate
smaller than ISN rate (ascending curve less steep)• Then, after crash
• Reliable continuation of work always ensured• System clock may be used to continue with correct ISN
time
SeqN
o
TT T
05. Apr. 2006 56 INF-3190: Transport Layer
Handling of Consecutive Connections
• Problems• “Long-lived”, “slow” session (longer than wrap-around
time)• Sequence number is used within time period T before it is
used as initial sequence number ”Forbidden Region" - begins T before ISN is generated
• High data rate• Curve of the consecutively allocated sequence numbers
steeper than ISN curve (enters from underneath)
time
SeqN
oTT T
Same SeqNoWithin T
Packet rateToo high
Same SeqNoWithin T
05. Apr. 2006 58 INF-3190: Transport Layer
TCP’s approach to reliability and ordering
TCP over IP situation Provide connection-oriented, reliable, ordered transport
layer serviceoverconnectionless, unreliable, unordered network layer service
TCP’s approach Limiting packet lifetime using TTL at IP level Choosing maximum packet lifetime T Unique connection identifier
(client addr, client port, server address, server port) Reuse limited by packet lifetime
Individual sequential numbers per connection
05. Apr. 2006 59 INF-3190: Transport Layer
TCP’s approach to reliability and ordering
TCP/IP term: Maximum Segment Lifetime (MSL) 2MSL: two MSLs to wait in TIME_WAIT Solaris 2MSL: 4 minutes default Linux 2MSL: 1 minute Windows 2MSL: 30 seconds default
Problem with 2MSL for uniquely identifying connections
Packets from connections that can not be distinguished Especially: fast restart after crash
Solution: none
05. Apr. 2006 60 INF-3190: Transport Layer
TCP’s approach to reliability and ordering
Problem with sequence numbers per connection 32 bit sequence numbers with technology considered
as sufficient when designing TCP/IP Sequence number range exploitation
today at 1 Gbps in 17 sec
Solution: verified sequence number PAWS – RFC1323
Use TCP 32-bit timestamp option in each packet "Protect Against Wrapped Sequence Numbers“ "TCP extensions for highspeed paths“ Reject packet
If timestamp is lower than last recordedand sequence number is higher
05. Apr. 2006 62 INF-3190: Transport Layer
Flow Control on Transport Layer Compared to flow control on data link layer
Joint characteristics Fast sender shall not flood slow receiver Sender shall not have to store all not acknowledged packets
Differences Link layer: router serves few “bitpipes” Transport layer: end-system contains a multitude of
Connections Data transfer sequences
Transport layer: receiver may store packets
Strategies Sliding window / static buffer allocation Sliding window / no buffer allocation Credit mechanism / dynamic buffer allocation
05. Apr. 2006 63 INF-3190: Transport Layer
Sliding Window / Static Buffer Allocation
Flow control Sliding window - mechanism with window size w
Buffer reservation Receiver reserves 2*w buffers per duplex connection
Characteristics Receiver can accept all packets Buffer requirement proportional to number of connections Poor buffer utilization for low throughput connections
I.e. Good for traffic with high throughput
(e.g. data transfer) Poor for bursty traffic with low throughput
(e.g. interactive applications)
05. Apr. 2006 64 INF-3190: Transport Layer
Sliding Window / No Buffer Allocation
Flow control Sliding window (or no flow control)
Buffer reservation Receivers do not reserve buffers Buffers allocated out of shared buffer space upon arrival of
packets Packets are discarded if there are no buffers available Sender maintains packet buffer until ACK arrives from receiver
Characteristics Optimized storage utilization Possibly high rate of ignored packets during high traffic load
I.e. => Good for bursty traffic with low throughput => Poor for traffic with high throughput
05. Apr. 2006 65 INF-3190: Transport Layer
Credit Mechanism Flow control
Credit mechanism
Buffer reservation Receiver allocates buffers dynamically for the connections Allocation depends on the actual situation
Principle Sender requests required buffer amount Receiver reserves as many buffers as the actual situation
permits Receiver returns ACKs and buffer-credits separately
ACK: confirmation only (does not imply buffer release) CREDIT: buffer allocation
Sender is blocked when all credits are used up
05. Apr. 2006 66 INF-3190: Transport Layer
Credit Mechanism Dynamic adjustment to
Buffer situation Number of open connections Type of connections
high throughput: many buffers low throughput: few buffers
05. Apr. 2006 67 INF-3190: Transport Layer
Credit MechanismSender Receiver
time time
<req 8 buffers>
<cred=4>
<seq=0, data=m0>
<seq=1, data=m1>
<seq=2, data=m2>
<ack=1, cred=3>
<seq=3, data=m3>
<seq=4, data=m4>
<seq=2, data=m2>
<ack=4, cred=0>
<ack=4, cred=1>
<ack=4, cred=2>
<seq=5, data=m5>
<seq=6, data=m6>
<ack=6, cred=0>
<ack=6, cred=4>
A wants 8 buffers
A has 3 buffers left
A has 2 buffers left
Message lost but A thinks it has 1 left
A has 1 buffer left
A has 0 buffer left, must stop
A times out and retransmits
A still blocked
A may now send next msg.
A has 1 buffer left
A is now blocked again
A still blocked
B grants messages 0-3 only
Message lost
B acknowledges 0 and 1permits 2-4
Everything acknowledgedbut no free buffers
B found a new buffersomewhere
A has 1 buffer left
A is now blocked again
05. Apr. 2006 69 INF-3190: Transport Layer
TCP Flow Control Variable window sizes (credit mechanism)
Implementation The actual window size is also transmitted with each
acknowledgement Permits dynamic adjustment to existing buffer
Destination AddressSource address
Time to live Protocol Header checksumIdentification DM Fragment offset
Version IHL Type of service Total lengthPRE ToS
Data
OptionsSource port Destination port
Sequence numberPiggyback acknowledgement
THL
THL – TCP header lengthU: URG – urgentA: ACK – acknowledgementP: PSH – pushR: RST – resetS: SYN – syncF: FIN – finalize
F WindowSRPAUunusedChecksum Urgent pointer
Options (0 or more 32 bit words)
Sequence number andacknowledgements
Credit
05. Apr. 2006 71 INF-3190: Transport Layer
TCP Flow Control “Sliding window” mechanism
Sequence number space is limited No buffers longer than 232 possible No credit larger than 216 possible
Acknowledgement and sequence number Acknowledgments refer to byte positions
Not to whole segment Sequence numbers refer to the byte position of a TCP connection
Positive acknowledgement Cumulative acknowledgements
Byte position in the data stream up to which all data has been received correctly
Reduction of overhead More tolerable to lost acknowledgements
Window Scale Option - RFC 1323
05. Apr. 2006 72 INF-3190: Transport Layer
TCP Flow ControlSender Receiver
time time
2K / SEQ=0
ACK=2048 WIN=2048
2K / SEQ=2048
ACK=4096 WIN=0
ACK=4096 WIN=2048
1K / SEQ=4096
Application doesa 2K write
Application doesa 2K write
Sender could send 2Kbut application does
only a 1K write
Sender is blocked
Sender is blocked
4K buffer
2K
Application reads 2K
2K1K
2K
05. Apr. 2006 73 INF-3190: Transport Layer
TCP Flow Control: Special Cases Optimization for low throughput rate
Problem Telnet (and ssh) connection to interactive editor reacting on every
keystroke 1 character typed requires up to 162 byte
Data: 20 bytes TCP header, 20 bytes IP header, 1 byte payload ACK: 20 bytes TCP header, 20 bytes IP header Echo: 20 bytes TCP header, 20 bytes IP header, 1 byte payload ACK: 20 bytes TCP header, 20 bytes IP header
Approach often used Delay acknowledgment and window update by 500 ms (hoping for more
data)
Nagle’s algorithm, 1984 Algorithm
Send first byte immediately Keep on buffering bytes until first byte has been acknowledged Then send all buffered characters in one TCP segment and start buffering
again Comment
Effect at e.g. X-windows: jerky pointer movements
05. Apr. 2006 74 INF-3190: Transport Layer
TCP Flow Control: Special Cases
Silly window syndrome (Clark, 1982) Problem
Data on sending side arrives in large blocks But receiving side reads data at one byte at a time only
Clark’s solution Prevent receiver from sending window update for 1 byte Certain amount of space must be available in order to send window update
min(X,Y)X = maximum segment size announced during connection establishmentY = buffer / 2
Senderwindow size=0
Header
Header
1 byte
Receive buffer full
Room for one byte
Receive buffer full
New byte arrives
Window update segment sent
Application reads one byte