The Transport Layer: Tutorial and Survey - ACM Digital Library

The Transport Layer: Tutorial and SurveySAMI IREN and PAUL D. AMER

University of Delaware

AND

PHILLIP T. CONRAD

Temple University

Transport layer protocols provide for end-to-end communication between two ormore hosts. This paper presents a tutorial on transport layer concepts andterminology, and a survey of transport layer services and protocols. The transportlayer protocol TCP is used as a reference point, and compared and contrasted withnineteen other protocols designed over the past two decades. The service andprotocol features of twelve of the most important protocols are summarized in bothtext and tables.

Categories and Subject Descriptors: C.2.0 [Computer-CommunicationNetworks]: General—Data communications; Open System InterconnectionReference Model (OSI); C.2.1 [Computer-Communication Networks]: NetworkArchitecture and Design—Network communications; Packet-switching networks;Store and forward networks; C.2.2 [Computer-Communication Networks]:Network Protocols; Protocol architecture (OSI model); C.2.5 [Computer-Communication Networks]: Local and Wide-Area Networks

General Terms: Networks

Additional Key Words and Phrases: Congestion control, flow control, transportprotocol, transport service, TCP/IP

1. INTRODUCTION

In the OSI 7-layer Reference Model, thetransport layer is the lowest layer thatoperates on an end-to-end basis be-tween two or more communicatinghosts. This layer lies at the boundarybetween these hosts and an internet-

work of routers, bridges, and communi-cation links that moves information be-tween hosts. A good transport layerservice (or simply, transport service) al-lows applications to use a standard setof primitives and run on a variety ofnetworks without worrying about differ-ent network interfaces and reliabilities.

This work was supported, in part, by the National Science Foundation (NCR 9314056), the U.S. ArmyResearch Office (DAAL04-94-G-0093), and the Adv Telecomm/Info Dist’n Research Program (ATIRP)Consortium sponsored by ARL under Fed Lab Program, Cooperative Agreement DAAL01-96-2-0002.Authors’ addresses: S. Iren and P. D. Amer, Department of Computer and Information Sciences,University of Delaware, Newark, DE 19716; P. T. Conrad, Department of Computer and InformationSciences, Temple University, Philadelphia, PA 19122.Permission to make digital / hard copy of part or all of this work for personal or classroom use is grantedwithout fee provided that the copies are not made or distributed for profit or commercial advantage, thecopyright notice, the title of the publication, and its date appear, and notice is given that copying is bypermission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute tolists, requires prior specific permission and / or a fee.© 2000 ACM 0360-0300/99/1200–0360 $5.00

ACM Computing Surveys, Vol. 31, No. 4, December 1999

Essentially, the transport layer isolatesapplications from the technology, de-sign, and idiosyncracies of the network.

Dozens of transport protocols havebeen developed or proposed over the lasttwo decades. To put this research inperspective, we focus first on the fea-tures of probably the most well-knowntransport protocol — namely the Inter-net’s Transmission Control Protocol

(TCP) — and then contrast TCP withmany alternative designs.

Section 2 introduces the basic con-cepts and terminology of the transportlayer through a simple example illus-trating a TCP connection. Section 3 sur-veys the range of different services thatcan be provided by a transport layer.Similarly, Section 4 surveys the rangeof protocol designs that provide theseservices. (The important distinction be-tween service and protocol is a majortheme throughout this paper.) Section 5briefly surveys nine widely imple-mented transport protocols other thanTCP (UDP, TP0, TP4, SNA-APPN,DECnet-NSP, ATM, XTP, T/TCP andRTP) and two others that, although notwidely implemented, have been particu-larly influential (VMTP and NETBLT).This section also includes briefer de-scriptions of eight experimental proto-cols that appear in the research litera-ture (Delta-t, MSP, SNR, DTP, k-XP,TRUMP, POC, and TP11). Section 6concludes the paper with an overview ofpast, current, and future trends thathave influenced transport layer designincluding the impact of wireless net-works. This section also presents a fewof the debates concerning transport pro-tocol design. As an appendix, tables areprovided summarizing TCP and elevenof the transport protocols discussed inSection 5. Similar tables for the experi-mental protocols are omitted for reasonsof space, but are available on the au-thors’ Web site: www.eecis.udel.edu/˜amer/PEL/survey/ .

This survey concentrates on unicastservice and protocols — that is, commu-nication between exactly two hosts (ortwo host processes). Multicast protocols[Armstrong et al. 1992; Bormann et al.1994; Braudes and Zabele 1993; Deer-ing 1989; Floyd et al. 1995; McCanne etal. 1996; Smith and Koifman 1996] pro-vide communication among n $ 2hosts. Multicast represents an impor-tant research area currently undergoingsignificant change and development,and is worthy of a separate survey.

CONTENTS

1. Introduction2. Transport Layer Concepts and Terminology

2.1 Introduction to TCP2.2 General Role of the Transport Layer2.3 Terminology: SDUs, PDUs, and the like2.4 Example TCP Connection, Step-by-Step2.5 What this example shows. . . and does not show

3. Transport Service3.1 CO-message vs. CO-byte vs. CL3.2 Reliability3.3 Blocking vs. Non-Blocking3.4 Multicast vs. Unicast3.5 Priority vs. No-priority3.6 Security vs. No-security3.7 Status-reporting vs. No-status-reporting3.8 Quality-of-service vs. No-quality-of-service

4. Transport Protocol Features4.1 Connection-oriented vs. Connectionless4.2 Transaction-oriented4.3 CO Protocol Features4.4 Error Control: Sequence Numbers, Acks, and

Retransmissions4.5 Flow/Congestion Control4.6 Multiplexing/Demultiplexing4.7 Splitting/Recombining4.8 Concatenation/Separation4.9 Blocking/Unblocking4.10 Segmentation/Reassembly

5. Transport Protocol Examples5.1 UDP5.2 TP45.3 TP05.4 NETBLT5.5 VMTP5.6 T/TCP5.7 RTP5.8 APPN (SNA)5.9 NSP (DECnet)5.10 XTP5.11 SSCOP/AAL5 (ATM)5.12 Miscellaneous Transport Protocols

6. Future Directions and Conclusion6.1 Impacts of Trends and New Technologies6.2 Wireless Networks6.3 Debates6.4 Final Observations

APPENDIX

Transport Layer • 361


A previous study surveying eighttransport protocols can be found in Do-eringer et al. [1990].

2. TRANSPORT LAYER CONCEPTS ANDTERMINOLOGY

From an application programmer’s per-spective, the transport layer providesinterprocess communication betweentwo processes that most often are run-ning on different hosts. This section in-troduces some basic transport layer con-cepts and terminology through anexample: a simple document retrievalover the World Wide Web (herein Web)utilizing the TCP transport protocol.

2.1 Introduction to TCP

Although we provide a broad survey ofthe transport layer, the service and pro-tocol features of TCP are used through-out this paper as a point of reference.

Over the last two decades the Inter-net protocol suite (also called theTCP/IP protocol suite) has come to bethe most ubiquitous form of computernetworking. Hence, the most widelyused transport protocols today are TCPand its companion transport protocol,the User Datagram Protocol (UDP). Afew other protocols are widely used,mainly because of their connection tothe proprietary protocol suites of partic-ular vendors. Examples include thetransport protocols from IBM’s SNA,and Digital’s DECnet. However, thesuccess of the Internet has led nearly allvendors in the direction of TCP/IP asthe future of networking.

The Internet’s marked success wouldnot alone be sufficient justification fororganizing a survey around a single pro-tocol. Also important is that TCP pro-vides examples of many significant is-sues that arise in transport protocoldesign. The design choices made in TCPhave been the subject of extensive re-view, experimentation, and large-scaleexperience, involving the best research-ers and practitioners in the field. TCPrepresents the culmination of many

years of thought about transport proto-col design.

A final reason that TCP provides agood starting point for study, is that thehistory of research and development onTCP can be traced in publicly availabledocuments. Ongoing research and de-velopment of transport protocols, partic-ularly TCP, is the focus of two workinggroups of the Internet Society. Theend2end working group of the InternetResearch Task Force (IRTF, www.irtf.org ) discusses ongoing long-term re-search on transport protocols in general(including TCP), while the tcp-implgroup of the Internet Engineering TaskForce (IETF, www.ietf.org ) focuses onshort-term TCP implementation issues.Both groups maintain active mailinglists where ideas are discussed and de-bated openly. The work of these groupscan be found in journal articles, confer-ence proceedings, and documentsknown as Internet Drafts and Requestsfor Comments (RFCs). RFCs contain notonly all the Internet Standards, but alsoother information of historical and tech-nical interest. It is much more difficultfor researchers to obtain similar infor-mation concerning proprietary proto-cols.

2.2 General Role of the Transport Layer

To illustrate the role that the transportlayer plays in a familiar application, theremainder of Section 2 examines therole of TCP in a simple interaction overthe Web.

The Web is an example of a client/server application. A human interactswith a Web browser (client) running ona “local” machine. The Web browsercommunicates with a server on some“remote” machine. The Web uses an ap-plication layer protocol called the Hy-pertext Transfer Protocol (HTTP) [Bern-ers-Lee et al. 1996]. HTTP is a simplerequest/response protocol. Suppose, forexample, that you have a personal com-puter with Internet access, and youwish to retrieve the page “http://www.eecis.udel.edu/research.html ”

362 • S. Iren et al.


from the University of Delaware Website. In the simplest case,1the clientsends a request containing the filenameof the desired Web page (“GET /re-search.html ”) to a server (“www.eecis.udel.edu ”), and the serversends back a response consisting of thecontents of that file.

This communication takes place overa complex internetwork of computersthat is constantly changing in terms ofboth technology and topology. A connec-tion between two particular hosts mayinvolve such diverse technologies asEthernet, Token Ring, X.25, ATM, PPP,SONET, just to name a few. However, aprogrammer writing a Web client orserver does not want to be concernedwith the details of how communicationtakes place between client and server.The programmer simply wants to sendand receive messages in a way that doesnot change as the underlying networkchanges. This is the function of thetransport layer: to provide an abstrac-tion of interprocess communication thatis independent of the underlying net-work.

HTTP uses TCP as the transportlayer. The programmer writing code foran HTTP client or server would accessTCP’s service through function callsthat comprise that transport layer’s Ap-plication Program Interface (API). At aminimum, a transport layer API pro-vides functions to send and receive mes-sages; for example, the Berkeley SocketsAPI provides functions called write()and read() (for more details, seeStevens [1998]).

Because TCP is connection-oriented,the Berkeley Sockets API also providesa connect() function for setting up aconnection between the local and remoteprocesses. It also provides a close()

function for closing a connection. Notethat while TCP is connection-oriented,not all transport services establish aconnection before data is sent. Connec-tion-oriented and connectionless ser-vices and protocols are discussed in Sec-tions 3.1, 3.2.5 and 4.1.

2.3 Terminology: SDUs, PDUs, and the like

One difficulty in summarizing any topicis the wide range of terms used forsimilar concepts. Throughout this pa-per, we use a simplified communicationmodel (Figure 1) that employs some OSIterminology. At the top layer, a usersender (e.g., a Web client) has somemessages to communicate to the userreceiver (e.g., a Web server). These so-called application entities use the ser-vice of the transport layer. Communica-tion between peer entities consists of anexchange of Protocol Data Units(PDUs). Application peers communicateusing Application PDUs (APDUs), whiletransport peers communicate usingTransport PDUs (TPDUs), etc. In ourWeb example, the first APDU is therequest “GET /research.html ” sentfrom the client (application entity) tothe server (its peer application entity).The Web server will respond with anAPDU containing the entire text of thefile “research.html ”.

Many transport and application pro-tocols are bidirectional; that is, bothsides can send and receive data simulta-neously. However, it is frequently use-ful to focus on one direction while re-maining aware that the other directionis also operational. As Figure 1 shows,each application entity can assume boththe role of sender and receiver; for theAPDU “GET /research.html ”, the cli-ent is the user sender and the server isthe user receiver (as shown by moreprominent labels). When the APDU con-taining the contents of the file “re-search.html ” is sent, user sender anduser receiver reverse roles (as indicatedby the dotted line boxes, and the lighteritalicized labels).

The term transport entity refers to the

1To simplify the discussion, we will assume HTTPversion 0.9 and a document containing only hyper-text: no inline images, applets, etc. This avoidsdiscussion of HTTP 1.0 headers, persistent con-nections as in HTTP 1.1, (which complicate theissue of how and when the connection is closed,)and the necessity for multiple connections whereinline images are involved.



hardware and/or software within agiven host that implements a particulartransport service and protocol. Again,even when the protocol is bidirectional,we focus on one direction for purposes ofclarity. In this model, the user sendersubmits a chunk of user data (i.e., aTransport Service Data Unit (TSDU), orinformally, a message) to the transportsender. The transport sender transmitsor sends this data to the transport re-ceiver over a network which may providedifferent levels of reliability. The trans-port receiver receives the data that ar-rives from the network and delivers it tothe user receiver. Note that even whenone transport entity assumes the role ofsender and the other assumes the roleof receiver, we use solid lines to showTPDUs flowing in both directions. Thisillustrates that TPDUs may flow in bothdirections even when user data flowsonly from sender to receiver. TPDUsfrom receiver to sender are examples ofcontrol TPDUs, which are exchangedbetween transport entities for connec-tion management. When the flow ofuser data is bidirectional, control anddata information can be piggybacked, asdiscussed in Section 4. Control TPDUsmay flow in both directions betweensender and receiver, even in the absenceof user data.

Figure 2 shows the terminology weuse to describe what happens to therequest APDU “GET /research.html ”as it passes through the various layers

on its way from the Web client to theWeb server. When the user sender sub-mits the request APDU to the transportsender, that APDU becomes a TSDU.The transport sender adds its ownheader information to the TSDU, to con-struct a TPDU that it can send to thetransport receiver. TPDUs exchangedby the transport entities are encapsu-lated (i.e., contained) in NPDUs whichare exchanged between the network en-tities, as illustrated in Figure 2. Thenetwork layer routes NPDUs betweenthe local and remote network entitiesover intermediate links. When anNPDU arrives, the network layer entityprocesses the NPDU header and passesthe payload of the NPDU to a transportlayer entity. The transport entity eitherpasses the payload of the TPDU to thetransport user if it is user data, or pro-cesses the payload itself if it is a controlTPDU.

In the previous paragraph we de-scribe a single APDU becoming a singleTSDU, being encapsulated in a singleTPDU, which in turn becomes a singleNSDU encapsulated in a single NPDU.This is the simplest case, and one thatis likely to occur for a small APDU suchas the HTTP request in our example.However, there are many other possibil-ities for the relationships between AP-DUs, TSDUs, TPDUs, NSDUs, and NP-DUs, as described in Step (5) of Section2, and also in Sections 3 and 4.

Figure 2 also shows some of the ter-

Transport Sender

User Sender

Network

User Receiver

TSAP

NSAP

Host A Host B

User/TransportInterface

Transport/NetworkInterface

TPDUs

APDUs

Application (e.g. web client) Application (e.g. web server)

Transport Receiver

Arrive/Receive

Submit

Transmit/Send

Deliver

(Receiver) (Sender)

(Receiver) (Sender)

ApplicationEntities

TransportEntities

Figure 1. Transport service.



minology commonly used in the Internetprotocol suite to identify PDUs at vari-ous layers. TCP’s PDUs are called seg-ments, while IP’s NPDUs are calleddatagrams. However, the Internet com-munity’s terminology can be confusing.For example, the term packet is oftenused informally to refer to both IP Data-grams (NPDUs) and TCP segments (TP-DUs). The term datagram can refer ei-ther to IP’s NPDUs, or to the TPDUs ofthe User Datagram Protocol. To avoidsuch confusion, we make use of someOSI terms throughout the paper. Al-though some may criticize OSI termi-nology as being cumbersome, it oftenhas the advantage of being more pre-cise.

2.4 Example TCP Connection, Step-by-Step

Now let us consider the Web documentretrieval example, one step at a time.This discussion omits many details; thepurpose here is to provide a generalunderstanding of the transport layer inthe context of a well-known application.

(1) You select “http://www.eecis.udel.edu/research.html ” withyour Web client.—http indicates the application

layer protocol to be used; moreimportantly from a transportlayer perspective, it also implic-itly indicates the TCP port num-

ber that will be used for the con-nection (80, in this case). Portnumbers differentiate betweenmultiple connection points on thesame machine; for example, be-tween servers for Web, File Trans-fer Protocol (FTP) and telnet (re-mote terminal) requests.

—“www.eecis.udel.edu ” indicatesthe host name to which a TCPconnection is established. This isconverted to the IP address128.175.2.17 by the DomainName System (which is outsidethe scope of this discussion). Thecombination of IP address and TCPport number (128.175.2.17, 80)represents what OSI calls a TSAPaddress. A TSAP (Transport Ser-vice Access Point) address identi-fies one endpoint of a communica-tion channel between a process ona local machine, and a process ona remote machine.

—“/research.html ” is the file youare requesting; the Web clientuses this to form the HTTP request(APDU) “GET /research.html ”that must be sent to the Webserver via TCP.

(2) The Web client starts by making aconnection request to the transportentity at (128.175.2.17, 80) bycalling the connect() function.This causes the local-client TCP en-

TSDU

NSDU

TPDU (e.g., TCP segment)

NPDU (e.g., IP Datagram)

TPDUheader

NPDUheader

TransportLayer

NetworkLayer

Application(or Session)Layer

APDU (e.g., "GET /research.html")

Figure 2. Relationship between Service Data Units (SDUs) and Protocol Data Units (PDUs).



tity (or simply, the “local TCP”) toinitiate a 3-way-handshake with theremote-server TCP entity (or sim-ply, the “remote TCP”). TPDUs areexchanged between the TCP entitiesto ensure reliable connection estab-lishment, and to establish initial se-quence numbers to be used by thetransport layer (see Figure 3, andSection 4). If the 3-way-handshakefails, TCP notifies the application byreturning an error code as the resultof the connect() function; other-wise a success code is returned. Inthis way, TCP provides what OSIcalls confirmation of the connect re-quest.

(3) Once the connect request is con-firmed, the Web client submits arequest to send data (in this casethe APDU “GET /research.html ”).The local TCP then sends this data— most likely, in a single TPDU.This TPDU (i.e., TCP segment) con-tains the service user’s data (theTSDU), plus a transport layerheader, containing (among otherthings) the negotiated initial se-quence number for the data. Thepurpose of this sequence number isdiscussed in Step (2).

(4) When the remote TCP receives theTPDU, the data “GET /re-search.html ” is buffered. It is de-livered when the Web server does aread() . In OSI terminology, thisdelivery is known as a data indica-tion. The remote TCP also sendsback an acknowledgment (ACK) con-trol TPDU to the local TCP. Ac-knowledgments inform the trans-port sender about TPDU arrivals(see Section 4).

(5) The Web server then responds withthe contents of “research.html ”.This file may be too large to beefficiently submitted to TCP in onewrite() call, i.e., one TSDU. If so,the Web server divides this APDUinto multiple write() calls, i.e.,multiple TSDUs. The remote TCP

then sends these TSDUs to the localTCP in multiple TPDUs, utilizingthe service of the underlying net-work layer.One interesting aspect of TCP isthat the sizes of the TSDUs submit-ted by the transport entities maybear little or no relationship to thesizes of the TPDUs actually ex-changed by TCP. TCP treats theflow of data from sender to receiveras a byte-stream and segments it inwhatever manner seems to makebest use of the underlying networkservice, without regard to TSDUboundaries. It delivers data in thesame fashion; thus for TCP, theboundaries between APDUs, sub-mitted TSDUs, TPDUs, and deliv-ered TSDUs may all be different.Other transport services/protocolshave a message orientation, mean-ing that they preserve the bound-aries of TSDUs (see Section 3).

(6) Because the network layer (IP inthis case) can lose or reorder NS-DUs, TCP must detect and recoverfrom network errors. As the remoteTCP sends the TPDUs containingthe successive portions of the file“research.html ”, it includes a se-quence number in each TPDU corre-sponding to the sequence number ofthe first byte in that TPDU relativeto the entire flow of data from re-mote TCP to local TCP. The remoteTCP also places a copy of the data ineach TPDU sent into a buffer, andsets a timer. If this timer expiresbefore the remote TCP receives amatching ACK TPDU from the localTCP, the data in the TPDU will beretransmitted in a new TPDU. InTCP, the data are tracked by indi-vidual byte-stream sequence num-bers, and TPDUs retransmitted mayor may not correspond exactly to theoriginal TPDUs. The remote TCPalso places a checksum in the TPDUheader to detect bit errors intro-duced by noise or software errors inthe network, data link and physical



layers. Such error control is a majorfunction of the transport layer asdiscussed in Section 4.

(7) As TPDUs are received by the localTCP, any TPDUs with checksum er-rors are discarded. The sequencenumbers of the incoming TPDUs arechecked to ensure that no pieces ofthe byte-stream are missing or havearrived out-of-order. Any out-of-or-der pieces will be buffered and re-ordered. As TPDUs arrive, the localTCP responds to the remote TCPwith ACK TPDUs. If ACKs are lost,the remote TCP may send a TPDUwhich duplicates data sent in anearlier TPDU. The local TCP alsodiscards any duplicate data that ar-rives. By detecting and recoveringfrom loss, reordering, duplicationand noise, the local and remote TCPentities cooperate to ensure reliabledelivery of data. As pieces of thebyte-stream arrive correctly, thesewill be buffered until the Web clientrequests them by doing read()calls. Each read() results in deliv-ery of a TSDU to the Web client.

(8) In general, a TCP connection is bidi-rectional, and either side may ini-tiate the closing of the connection.However, in first-generation Websystems, the server initiates theclose by calling the close() func-tion after it sends the last piece ofthe file requested. In OSI terminol-ogy, this is called a disconnect re-quest. TCP handles disconnect re-quests with a 4-way-handshakeprocedure (shown in Figure 4) toensure graceful termination withoutloss of data. The remote TCP re-sponds to the server’s disconnect re-quest by sending a disconnect TPDUand waiting for an ACK TPDU. Thelocal TCP signals the server’s dis-connect request to the client by re-turning “end-of-file” as the result ofa client read() request. The clientthen responds with its own close()request; this causes another ex-

change of a disconnect TPDU andACK. This procedure and othermethods of connection terminationare discussed in Section 4.3.4.

2.5 What this example shows. . . and doesnot show

From this example, we see that TCPprovides a connection-oriented (CO)byte-stream, no-loss, no-duplicates, or-dered transport service, and that itsprotocol is designed to operate on top ofa lossy network service, such as IPdatagram service [Postel 1981].

This example also shows a few as-pects of communication between theuser layer and the transport layer, andbetween peer entities. It illustrates oneexample of connection establishment,reliable data exchange, and connectiontermination. However, many aspects ofthis example were not discussed, suchas flow control and congestion avoid-ance, piggybacking of ACKs, and round-trip-time estimation, to name just a few.

Also, consider that in the case of aWeb browser using TCP, the transportservice is reliable, which means (amongother things) that the transport servicemay not lose any TSDUs. This guaran-tee comes at the expense of potentiallyincreased delay, since reliable deliverymay require more TPDUs to be ex-changed. Other transport services (forexample, those used for audio or videotransmission) do not provide reliability,since for some applications lower delayis more important than receiving everyTSDU.

Many variations exist both in the typeof service provided by the transportlayer, and in the design of transportprotocols. These variations are whatmake the transport layer so interesting!Sections 3 and 4 provide an overview ofthe most important of these variations.

3. TRANSPORT SERVICE

This section classifies the typical serviceprovided by the transport layer (see Ta-ble I). A transport service abstracts a



set of functions that is provided to ahigher layer. A protocol, on the otherhand, refers to the details of how atransport sender and a transport re-ceiver cooperate to provide that service.In defining a service, we treat the trans-port layer as a black box. In Section 4,we look at internal details of that blackbox.

One of the valuable contributions ofthe OSI Reference Model is its emphasisin distinguishing between service andprotocol. This distinction is not well dif-ferentiated in the TCP/IP protocol suitewhere RFCs intermix service and proto-col concepts, frequently in a confusingmanner. As will be seen in Section 4,there are often several different protocolmechanisms that can provide a givenservice.

3.1 CO-message vs. CO-byte vs. CL

Transport services can be divided intotwo types: connection-oriented and con-nectionless. A connection-oriented (CO)service provides for the establishment,maintenance, and termination of a logi-cal connection between transport users.The transport user generally performsthree distinct phases of operation: con-nection establishment, data transfer,and connection termination. When auser sender wants to transmit somedata to a remote user receiver by usinga connection-oriented service, the usersender first explicitly asks the transportsender to open a connection to the re-mote transport receiver (T-Connect).

Once a connection is established, theuser sender provides the transportsender with the data to be transmitted(T-Data). At the end of the data trans-fer, the user sender explicitly asks thetransport sender to terminate the con-nection (T-Disconnect).

CO service itself has two variations:message-oriented and byte-stream. Inthe former, the user sender’s messages(or what OSI calls service data units(SDUs)) have a specified maximum size,and message boundaries are preserved.When two 1K messages are submittedby a user sender, they are delivered tothe user receiver as the same two dis-tinct 1K messages, never for example asone 2K message or four .5K messages.TP4 is an example that provides mes-sage-oriented service. In byte-streamservice as provided by TCP, the flow ofdata from user sender to user receiver isviewed as an unstructured sequence ofbytes that flow in a FIFO manner.There is no concept of message (or mes-sage boundary or maximum messagesize) for the user sender. Data submit-ted by a user sender is appended to theend of the byte-stream, and the userreceiver reads from the head of thebyte-stream.

An advantage of byte-stream serviceis that it gives the transport senderflexibility to divide TSDUs in ways thatmake better use of the underlying net-work service. However, byte-stream ser-vice has been criticized because it doesnot deliver data to an application inmeaningful units. A message-orientedservice better preserves the semanticsof each APDU down through the lowerlayers for so-called Application LevelFraming [Clark and Tennenhouse 1990](see Section 6). One research team notesthat adaptive applications are mucheasier to develop if the unit of process-ing (i.e., APDU) is strongly tied to theunit of control (i.e., TPDU) [Dabbousand Diot 1995].

A connectionless (CL) service providesonly one phase of operation: data trans-fer. There are no T-Connect and T-Dis-

Table I. Transport Service Features

CO-message vs. CO-byte vs. CLNo-loss vs. Uncontrolled-loss vs. Controlled-lossNo-duplicates vs. Maybe-duplicatesOrdered vs. Unordered vs. Partially-orderedData-integrity vs. No-data-integrity vs. Partial-

data-integrityBlocking vs. Non-blockingMulticast vs. UnicastPriority vs. No-prioritySecurity vs. No-securityStatus-reporting vs. No-status-reportingQuality-of-service vs. No-quality-of-service



connect primitives exchanged (either ex-plicitly or implicitly) between a usersender and the transport sender. Whenthe user sender has some data, it simplysubmits the data to the transportsender. As with CO-message service, ina CL service the user sender submitsmessages, and message boundaries aremaintained. Messages submitted to aCL service as in UDP often are referredto as datagrams.

3.2 Reliability

Unfortunately, the terms reliable andunreliable mean different things to dif-ferent people. To avoid ambiguity in ourdefinition, unless otherwise stated, aservice is reliable if and only if it is allof the following (as defined below): no-loss, no-duplicates, ordered, and data-integrity. If even one feature is lacking,the service is unreliable.

3.2.1 No-loss vs. Uncontrolled-loss vs.Controlled-loss. There are three levelsof delivery guarantee. A no-loss (or at-least-once delivery) service guaranteesthat for all data submitted to the trans-port sender by the user sender, one oftwo results will occur. Either (1) thedata is delivered2 to the user receiver,or (2) the user sender is notified thatsome data may not have been delivered.It never occurs that a user sender be-lieves some data was delivered to theuser receiver when in fact it was not.For example, TCP provides a no-lossservice using disconnection as a meansof notifying a user sender that data maynot have been delivered.3

An uncontrolled-loss (or best-effort)service does not provide the above as-surance for any of the data. For exam-

ple, UDP provides an uncontrolled-lossservice. Any data submitted to UDP bya user sender may fail to be delivered tothe user receiver without the usersender being notified. Analogously, thedefault service of the United StatesPostal Service is an uncontrolled-lossservice. A user sender may not be noti-fied if a letter is not delivered to theuser receiver.

A controlled-loss service lies in be-tween no-loss and uncontrolled-loss.Loss may occur, but there is controlover the degree of loss. Several varia-tions of a controlled-loss service are pos-sible. For example, a user sender mightrequest different loss levels for differentmessages. The k-XP protocol [Amer etal. 1997; Marasli et al. 1996] supportsthree controlled-loss service classes:

● reliable messages will be retransmit-ted (if needed) until successfully de-livered to the user receiver;

● partially reliable messages will be re-transmitted (if needed) at most ktimes and then dropped if unsuccess-ful, where k is a user sender parame-ter; and

● unreliable messages will be transmit-ted only once.

In the TRUMP protocol [Golden 1997], auser sender assigns a time limit to eachmessage. The transport layer does itsbest to deliver the message before thetime limit; if unsuccessful, the messageis discarded.

3.2.2 No-duplicates vs. Maybe-dupli-cates. A no-duplicates (or at-most-oncedelivery) service (e.g., TCP) guaranteesthat all data submitted to the transportsender will be delivered to the user re-ceiver at most once.

A maybe-duplicates service (e.g.,UDP) does not provide the above guar-antee. Efforts by the protocol may ormay not be made to avoid deliveringduplicates, but delivery of duplicatesmay occur nevertheless.

2Depending on the data-integrity service definedin Section 3, delivered data may or may not con-tain bit errors.3Strictly speaking, for the most common TCP API,namely the Berkeley Sockets API, this is only trueif the application specifies the SOLINGERsocket option; by default, it is possible for a TCPconnection to lose data without notifying the appli-cation during connection teardown [Stevens 1998].



3.2.3 Ordered vs. Unordered vs. Par-tially-ordered. An ordered service (e.g.,TCP) preserves the user sender’s sub-mission order of data when delivering itto the user receiver. It never occurs thata user sender submits two pieces ofdata, first A, then B, and A is deliveredafter B is delivered.

An unordered service (e.g., UDP) doesnot provide the guarantee above. Whilethe service may try to preserve the or-der of data when delivering it to theuser receiver, no guarantee of preserva-tion-of-order is made.

A partially-ordered service guaran-tees to deliver pieces of data in one of aset of permitted orders as predefined bya partial order relation agreed upon bythe user sender and user receiver. Par-tially-ordered service is useful for appli-cations such as multimedia communica-tion or distributed databases wheredelivery of some objects is constrainedby others having been delivered, yetindependent of certain other objects ar-riving [Connolly et al. 1994]. POC is anexample protocol providing partially-or-dered service [Conrad et al. 1996].

3.2.4 Data-integrity vs. No-data-in-tegrity vs. Partial-data-integrity. A da-ta-integrity service ensures with highprobability that all data bits deliveredto a user receiver are identical to thoseoriginally submitted. The actual proba-bility achieved in practice depends onthe strength of the error detectionmethod. Good discussions of error detec-tion methods can be found in Lin andCostello [1982], and McNamara [1998].TCP uses a 16-bit checksum for data-integrity; its goodness is evaluated inPartridge et al. [1995].

A no-data-integrity service does notprovide any guarantees regarding biterrors. Any number of bit errors mayoccur in the delivered data. UDP may ormay not be configured to use a check-sum; in the latter case, UDP provides ano-data-integrity service.

A partial-data-integrity service allowsa controlled amount of bit errors in de-livered data. This service may be ac-

ceptable as a means of achieving higherthroughput. A real-time multimedia ap-plication might request the transportreceiver to tolerate certain levels of biterrors when errored data could be use-ful to the user receiver [Han and Mess-erschmitt 1996]. For example, a Webbrowser retrieving an image might pre-fer to progressively display a partiallydamaged image while waiting for theretransmission and delivery of the cor-rect data (assuming the decompressionalgorithm can process damaged imagedata.)

3.2.5 Remarks on Reliability and COvs. CL. Note that all four aspects ofreliability — loss, duplicates, order, anddata-integrity — are orthogonal; theyare independent functions. For example,ordered service does not imply no-lossservice; some portion of the data mightget lost while the data that is deliveredis guaranteed to be in order.

A wide range of understanding (i.e.,confusion) exists in the relationship be-tween a service being CO or CL andwhether or not it is reliable. These twoservices are orthogonal, yet so manypeople presume a CO service is reliable.Why? In the past when the number ofavailable services and protocols wassmaller, and applications were fewerand less diverse in their service needs,reasoning about connections and reli-ability was simple:

Whereas: TCP service is CO and TCPservice is reliable,

Whereas: TP4 service is CO and TP4service is reliable,

Whereas: X.25 service is CO and X.25service is reliable,

Therefore: CO service [ reliable ser-vice.

Whereas: UDP service is CL andUDP service is unreliable,

Therefore: CL service [ unreliableservice.



We emphasize that these equiva-lences do not hold! It is quite reason-able to design a transport layer provid-ing CO service that is not reliable (e.g.,provides controlled-loss rather than no-loss service, or unordered rather thanordered service), say to support best-effort compressed-image transmission[Amer et al. 1999; Iren and Amer 2000;Iren 1999]. Similarly, a transport layercan provide a CL service that is reliableif an underlying network layer is reli-able.

3.3 Blocking vs. Non-Blocking

A blocking service ensures that thetransport layer is not overwhelmed withincoming data. In this case, a usersender submits data to the transportsender and then waits for the transportsender to signal that the user sendercan resume processing. A blocking ser-vice provides flow control between usersender and transport sender.

A non-blocking service allows the usersender to submit data and continue pro-cessing without awaiting the transportsender’s ok. A non-blocking service doesnot take into account the transport lay-er’s buffering capabilities, or the rate atwhich the user receiver consumes data.The user sender submits data at anyrate it chooses, possibly resulting in ei-ther data loss or notification to the usersender that data cannot be submittedright now.

Of the many transport protocols sur-veyed, all could potentially provideblocking or non-blocking service de-pending on their API; one can thereforeargue that this service feature is animplementation decision, not an inher-ent feature of a given transport layer.(Note that the term “blocking” as usedhere has no relationship to its use inSection 4.9.)

3.4 Multicast vs. Unicast

A multicast service enables a usersender to submit data, a copy of whichwill be delivered to one or more user

receiver(s). Applications such as real-time news and information broadcast-ing, teleconferencing, and remote col-laboration can take advantage of amulticast service to reach multiple des-tinations. Depending on the reliabilityof the service, delivery to each user re-ceiver may be no-loss vs. uncontrolled-loss vs. controlled-loss, no-duplicates vs.maybe-duplicates, etc. The use of a mul-ticast service influences many of theprotocol features to be discussed in Sec-tion 4.

A unicast service limits the data sub-mitted by a user sender to be deliveredto exactly one user receiver.

(As stated earlier, although the tablesin the appendix indicate which protocolssurveyed offer multicast service, thispaper does not survey or discuss multi-cast protocols.)

3.5 Priority vs. No-priority

A priority service enables a user senderto indicate the relative importance ofvarious messages. A user of priority ser-vice may expect higher priority data tobe delivered sooner, if possible, thanlower priority data, even at the expenseof slowing down data submitted earlier.When combined with uncontrolled-lossor controlled-loss service, a priority ser-vice may drop lower priority data whennecessary, thereby allowing the deliveryof higher priority data with smaller de-lay and/or higher probability. Priorityalso may be passed down to a networkservice if it has a notion of priority, suchas IP’s type of service field.

In OSI, priority service is provided inthe form of expedited data and the pri-ority QoS parameter (see Section 3).Expedited data is an entirely separatedata flow that is not subject to normaldata flow control.

A no-priority service does not allow auser sender to differentiate the impor-tance of classes of data.

Despite common misconceptions tothe contrary, TCP’s urgent data serviceis neither a priority service nor an expe-dited data service. The Berkeley Sock-



ets API provides access to this servicevia what it calls out-of-band data. How-ever, this is misleading, since data sentin urgent mode is not expedited, sentout-of-band, or sent with higher prior-ity. Rather, TCP urgent mode is a ser-vice by which the user sender (i.e., anapplication) marks some portion of thebyte-stream as needing special treat-ment by the user receiver. Thus, signal-ing the presence of urgent data andmarking its position in the data streamare the only aspects that distinguish thedelivery of urgent data from the deliv-ery of all other TCP data; for all otherpurposes, urgent data is treated identi-cally to the rest of the TCP byte-stream.The user receiver must read every byteof data exactly in the order it was sub-mitted regardless of whether or not ur-gent mode is used.

3.6 Security vs. No-security

A security service (e.g., TP4) providesone or more security functions such as:authentication, access control, confiden-tiality, and integrity [ISO 1989]. Au-thentication is the verification of usersender’s and user receiver’s identities.Access control checks a user’s permis-sion status before allowing the use ofdifferent resources. Confidentialityguarantees that only the intended userreceiver(s) can decode and understandthe user sender’s data; no other userreceiver can understand the data’s con-tent. Finally, integrity4 detects (andsometimes recovers from) any modifica-tion, insertion, deletion, or replay oftransport sender’s data. Secure routingalso has been noted as a security service[Stallings 1997]. This service guaran-tees that while going from user senderto user receiver, data will only passthrough secure links and secure inter-mediate routers.

A no-security service does not provideany of the above security functions. TCPcurrently provides no-security-service,

although some discussion is underwayto include security features in a futuregeneration.

3.7 Status-reporting vs. No-status-reporting

A status-reporting service allows a usersender to obtain specific informationabout the transport entity or its connec-tions. This information may allow theuser sender to make better use of theavailable service. Some informationthat a status reporting service mightprovide are: performance characteristicsof a connection (e.g., throughput, meandelay), addresses (network, transport),current timer values, or the state of theprotocol machine supporting a connec-tion.

A no-status-reporting service does notprovide any information about thetransport entity and its connections.

A distinction can be made betweenstatus-reporting and QoS monitoring as,for example, in applications based onthe Real-time Transport Protocol (RTP).Although QoS monitoring provides someof the information listed in the abovedefinition (delay, throughput), it pro-vides no information on the protocol’sinternal structure such as state infor-mation, timer values, etc. Status-report-ing service provides explicit interactionsbetween a user and its transport entity.In QoS monitoring, there are no suchexplicit interactions.

Of the protocols surveyed, only TCPprovides anything resembling status-re-porting service. When TCP’s Socket De-bug Option is enabled, the kernel main-tains for future analysis a trace recordof what happens on a connection[Stevens 1994]. Status-reporting is avaluable service that designers shouldconsider incorporating into their proto-col implementations.

3.8 Quality-of-service vs. No-quality-of-service

A transport layer that provides Qualityof Service (QoS) allows a user sender to

4See Section 3.2.4 for a slightly different usage ofintegrity.



specify the quality of transmission ser-vice desired. The protocol then presum-ably optimizes network resources toprovide the requested QoS. Since theunderlying network places limits on theservice that can be provided by thetransport protocol, a user sender shouldrecognize two facts: (1) depending onthe underlying network’s capabilities, atransport protocol will have varying de-grees of success in providing the re-quested QoS, and (2) there is a trade-offamong QoS parameters such as reliabil-ity, delay, throughput, and cost of ser-vice [Marasli et al. 1996].

Transport QoS is a broad and compli-cated issue. No universally accepted listof parameters exists, and how the trans-port layer should behave for a desiredQoS under different circumstances isunclear. The ISO Transport Service[ITU-T 1995c] defines a number of pos-sible performance QoS parameters thatare negotiated during connection estab-lishment. User senders can specify (sus-tained) target, acceptable, and mini-mum values for various serviceparameters. The transport protocol ex-amines these parameters, and deter-mines whether it can provide the re-quired service; this depends in part onthe available network service. ISO spec-ifies eleven QoS parameters:

● Connection Establishment Delay isthe maximum acceptable time be-tween a transport connection beingrequested and its confirmation beingreceived by the user sender.

● Connection Establishment FailureProbability is the probability a con-nection cannot be established withinthe maximum connection establish-ment delay time due to network orinternal problems.

● Throughput is the number of bytes ofuser sender data transferred per unittime over some time interval.

● Transit Delay is the elapsed time be-tween a message being submitted by

a user sender and being delivered tothe user receiver.

● Residual Error Rate is the ratio ofincorrect, lost, and duplicate TSDUsto the total number of TSDUs thatwere sent.

● Transfer Failure Probability is the ra-tio of total transfer failures to totaltransfer samples observed during aperformance measurement.

● Connection Release Delay is the max-imum acceptable time between atransport user initiating release of aconnection and the actual release atthe peer transport service user.

● Connection Release Failure Probabil-ity is the fraction of connection re-lease attempts that did not completewithin the agreed upon connection re-lease delay interval.

● Protection is used by the user senderto specify interest in having thetransport protocol provide protectionagainst unauthorized third partiesreading or modifying the transmitteddata.

● Priority allows a user sender to spec-ify the relative importance of trans-port connections. In case of conges-tion or the need to recover resources,lower-priority connections are de-graded or terminated before the high-er-priority ones.

● Resilience is the probability that thetransport protocol itself will sponta-neously terminate a connection due tointernal or network problems.

A transport layer that provides No-Quality-of-Service (No-QoS) does not al-low a user sender to specify desiredquality of transmission service.

The original definitions of TP0 andTP4 support QoS service, however, onlyrecently have the lower layers of net-works matured sufficiently for the com-munity to consider transport layer QoShandling. The ATM environment sup-ports only two QoS parameters: (sus-



tained) target, acceptable, and mini-mum throughput and transit delay[ITU-T 1995a]. Within the InternetIETF community, much work is ongoingto create an integrated services (INT-SERV) QoS control framework so thatthe TCP/IP suite can provide more thanits current no-QoS service. While notitself a transport protocol, RSVP [Bra-den et al. 1997] sits on top of IP. RSVPallows a host to request specific quali-ties of service from the network. A re-quest may refer to a single path in thecase of unicast or multiple paths in thecase of multicast. General QoS parame-ters supported by the INTSERV frame-work are defined in Shenker and Wro-clawski [1997]. Requests result inreserved resources by individual net-work elements (i.e., subnets, IP routers)along the path. Thus far use of RSVPhas been defined to provide two kinds ofInternet QoS: controlled load [Wroclaw-ski 1997] and guaranteed [Shenker etal. 1997].

4. TRANSPORT PROTOCOL FEATURES

The previous section describes generalfeatures of transport layer service. Thissection focuses on internal transportlayer mechanisms that provide this ser-vice. It is important to understand thatalmost every service can be accom-plished by several different protocols.

4.1 Connection-oriented vs.Connectionless

Not only can a service be classified COor CL, so too can a protocol that pro-vides either service. The distinction de-pends on the establishment and mainte-nance of state information, a record ofcharacteristics and events related to thecommunication between the transportsender and receiver. Perhaps the mostimportant piece of state information isthe sequence number that identifies aTPDU. Consistent viewpoints of the se-quence numbers used by transportsender and transport receiver are re-quired for reliable data transfer. Choos-

ing an initial value for the sequencenumber can be hazardous, particularlywhen two transport entities close andimmediately reopen a connection (seeSection 4.4.7).

A transport protocol is CO (e.g., TCP)if state information is maintained be-tween transport entities. Typically, aCO protocol has three phases: connec-tion establishment, data transfer, andconnection termination. In the connec-tion establishment phase, state infor-mation is initialized either explicitly byexchanging control TPDUs, or implicitlywith the arrival of first data TPDU.During a connection’s lifetime (i.e., datatransfer phase), state information is up-dated to reflect the reception of dataand control PDUs (e.g., ACKs). Whenboth transport entities agree to termi-nate the connection, they discard thestate information. Note the distinctionbetween a CO service and a CO proto-col. A CO service entails a three phaseoperation to establish a logical connec-tion between transport users. A CO pro-tocol entails a three phase operation tomaintain state information betweentransport entities.

If no state information is maintainedat the transport sender and receiver,the protocol is CL. A CL protocol isbased on individually self-containedunits of communication often calleddatagrams that are exchanged indepen-dently without reference to each otheror to any shared state information.Each datagram contains all of the infor-mation that the receiving transport en-tity needs to interpret it.

4.2 Transaction-oriented

Transaction-oriented protocols attemptto optimize the case where a usersender wishes to communicate a singleAPDU (called a request) to a user re-ceiver, who then normally respondswith a single APDU (called a response).Such a request/response pair is called atransaction.

A transaction-oriented transport pro-tocol attempts to provide reliable ser-



vice for transactions with as few TPDUsas possible, ideally only one for the re-quest and one for the response. This isdone by trying to minimize overhead forconnection establishment. Transactionsshare the following characteristics [Bra-den 1992a]: an asymmetrical model (i.e.,client and server), simplex data trans-fer, short duration, low delay, few dataTPDUs, message orientation, and theneed for no-duplicates service. Exam-ples of transaction-oriented protocols in-clude VMTP and T/TCP, a backwardscompatible TCP extension.

4.3 CO Protocol Features

The following subsections refer only toCO protocols.

4.3.1 Signaling. Signaling is the ex-change of control (i.e., state) informa-tion between transport entities for man-aging a connection.5 Signaling is used toestablish and terminate a connectionand to exchange communication systemparameters. Signaling can be accom-plished in-band, where control informa-tion and user data are multiplexed onthe same connection, or out-of-band,where separate connections are used.In-band signaling (e.g., TCP) is gener-ally more suitable for applications thatrequire short-lived connections (e.g.,transaction-oriented communications).The extra overhead incurred for manag-ing two bands is avoided, and smallamounts of user data can be transmit-ted during the signaling operation.

Out-of-band signaling (e.g., TP11) isdesirable for high-speed communicationsystems. It avoids the overhead of sepa-rating signaling information from userdata by doing so below the transportlayer. Out-of-band signaling can sup-port transporting more than one kind ofdata over a connection, thereby facili-tating operations involving third par-ties, for example, a server validatingsecurity information, or billing.

4.3.2 Unidirectional vs. Bidirectional.In a unidirectional connection (e.g.,NETBLT, VMTP) data flows only in onedirection at a time (half duplex). Thatis, while one transport entity is sendingdata, the other may not send data in thereverse direction. Each transport entitycan assume the role of transport senderor transport receiver, but not both si-multaneously.

In a bidirectional connection (e.g.,TCP), both transport entities simulta-neously assume the roles of transportsender and transport receiver, therebyallowing full duplex data flow.

4.3.3 Connection Establishment.Three modes of connection establish-ment are shown in Figure 3. The costassociated with connection establish-ment can be amortized over a connec-tion’s lifetime. For short-lived connec-tions which result from transaction-oriented applications, this cost can besignificant. Therefore, protocols forshort-lived connections have been de-signed using a timer-based connectionestablishment and termination mecha-nism. For longer connections and whenavoiding false connections is important,2-way or 3-way handshake mechanismsare needed. A handshake involves theexplicit exchange of control TPDUs.

In implicit connect, the connection isopen as soon as the first TPDU is sentor received. The transport sender startstransmitting data without any explicitconnection verification by the transportreceiver. Further data TPDUs can besent without receiving an ACK from thetransport receiver. Implicit connectionscan provide reliable service only if theprotocol definition guarantees that de-layed TPDUs from any previouslyclosed connection cannot cause a falseopen. T/TCP [Braden 1994] achievesthis via connection-unique identifiersand timer-based connection termina-tion.

In 2-way-handshake connect (e.g.,APPN, SSCOP/AAL5), a CR-TPDU(Connection Request) and a CC-TPDU(Connection Confirm) are exchanged to

5While in some cases data may be exchanged,signaling generally refers to exchanging controlinfo.



establish the connection. The transportsender may transmit data in the CR-TPDU, but sending additional data isprohibited until the CC-TPDU is re-ceived. During this handshake, QoS pa-rameters such as buffer size, burst size,burst rate, etc., can be negotiated.When the underlying network serviceprovides a small degree of loss, a 2-way-handshake mechanism may be goodenough to establish new connectionswithout significant risk of false connec-tions.

In 3-way-handshake connect (e.g.,TCP), the transport sender sends a CR-TPDU to the transport receiver, whichresponds with a CC-TPDU. The proce-dure is completed with an ACK-CC-TPDU (ACK for Connection Confirm).Normally, no user data is carried onthese connection establishment TP-DUs.6 If the underlying network serviceprovides an unacceptable degree of loss,a 3-way-handshake is needed to preventfalse connections that might result fromdelayed TPDUs. Because TCP was de-signed specifically for use over an unre-liable network service, TCP uses a3-way handshake.

4.3.4 Connection Termination. Fourmodes of connection termination areshown in Figure 4.

With implicit disconnect (e.g., VMTP),when a transport entity does not hearfrom its peer for a certain time period,the entity terminates the connection.Typically, implicit disconnection is usedwith implicit connection establishment.

In abortive disconnect, when a trans-port entity must close the connectionabnormally due to an error condition, itsimply sends an abort-TPDU and termi-nates the connection. The entity doesnot wait for a response. Thus TPDUs intransit in either direction may be lost.

In 2-way-handshake disconnect (e.g.,NETBLT), a transport entity sends aDR-TPDU (Disconnect Request) to itspeer and receives a DC-TPDU (Discon-nect Confirm) in return. If a connectionis unidirectional (see Section 4), thetransport sender usually initiates con-nection termination, and before discard-ing its state information verifies thereception of all TPDUs by the transportreceiver. In a bidirectional connection, a2-way-handshake can only verify recep-tion of TPDUs in one direction. TPDUsin transit in the reverse direction maybe lost.

Finally, in 4(3)-way-handshake dis-connect (e.g., TCP), two 2-way-hand-shakes are used, one for each directionof data flow. The transport entity thatcloses its sending flow sends a DR-

6TCP allows user data to be sent on a CR-TPDU.However, that data cannot be delivered to theuser until the 3-way-handshake is completed.Furthermore, outside of T/TCP compliant imple-mentations, TCP implementations typically do nottake advantage of this feature [Stevens 1996].

Host A Host B

TPDU

Implicit

Host A Host B

CR-TPDU

2-way-handshake

CC-TPDU

Host A Host B

CR-TPDU

3-way-handshake

CC-TPDU

ACK-CC-TPDU

Point in time at which connection is said to be established for that host.

Figure 3. Three modes of connection establishment.



TPDU to its peer entity (in TCP the FINflag is set in its last TPDU). This dis-connect request is then acknowledgedby the transport receiver as soon as allpreceding TPDUs have been received.The connection is terminated when dataflows in both directions are closed. Thenumber of control TPDUs exchangedcan be reduced to three if the first DC-TPDU also functions as a DR-TPDU forthe reverse direction.

Connections may be terminated ei-ther gracefully or ungracefully. In anungraceful termination as in TP0 orTP4,7some data in transit may be lost.In a graceful close, no data in transit islost since a connection is closed onlyafter all data have arrived at their des-tination. For bidirectional connections,only a 4(3)-way-handshake can guaran-tee graceful close.

4.4 Error Control: Sequence Numbers,Acks, and Retransmissions

Error control is a combination of tech-niques to guard against loss or damageof user data and control information. CLnetwork protocols such as IP perform noerror control on user data. Even COnetworks are being designed with light-weight virtual circuits that provide noerror control [McAuley 1990; Parulkarand Turner 1989]. Recent studies haveshown that for realistic high-speed net-works with low error rates, transportlayer error control is more efficient than

link layer error control [Bae et al. 1991;Bhargava et al. 1988]. There are twophases in transport layer error control:error detection, and error reporting andrecovery.

4.4.1 Error Detection. Error detec-tion identifies lost, misordered, dupli-cated, and corrupted TPDUs. Sequencenumbers help uncover the first threeproblems. Corrupted data is discoveredby means of (1) length fields that allowa transport receiver to detect missing orextra data, and (2) redundant informa-tion in the form of error detecting codes(EDC). Cyclic redundancy checks andother forms of checksums are the mostcommon examples of EDCs. An EDCcan verify the header/trailer, the data,or both. TCP and AAL5 contain a singlechecksum on both. XTP performs onechecksum on the header and allows foran optional second checksum on thedata.

Separate EDCs for header and dataare recommended for multimedia appli-cations [La Porta and Schwartz 1992].In voice applications, where corrupteddata is tolerable but retransmissionsare less advantageous due to delay con-straints, only a header EDC needs to beused. In image transfer using bothEDCs, corrupted data may be used tem-porarily until a corrected version is ob-tained via retransmission.

4.4.2 Error Reporting and Recovery.Error reporting is a mechanism wherethe transport receiver explicitly informsthe transport sender about errors that

7The OSI Session Layer provides the gracefulclose.

Host A Host B

Abort-TPDU

Abortive

Host A Host B

DR-TPDU

2-way-handshake

DC-TPDU

Host A Host B

DR-TPDU

4(3)-way-handshake

DR-TPDU

DC-TPDU

Point in time at which connection is said to be terminated for that host.

Host A Host B

Implicit

No dataflow

DC-TPDU

Figure 4. Four modes of connection termination.



have been detected. Error recovery is amechanism used by both transportsender and receiver to recover from er-rors whether or not they are explicitlyreported.

Error reporting and recovery are tra-ditionally accomplished using timers,sequence numbers, and acknowledg-ments. Sequence numbers are assignedto TPDUs (or associated with the byte-stream). The transport receiver informsthe transport sender via ACKs aboutTPDU arrivals so that the sender canretransmit those that are missing.

A positive ACK, or PACK, containssequence number information aboutthose TPDUs that have been received. Anegative ACK, or NACK, often known asa selective reject, explicitly identifiesTPDUs that have not been received.8

Protocols that use ACKs are known asPositive Ack with Retransmission (PAR)or Automatic Repeat reQuest (ARQ)schemes. Upon receipt of an ACK, thetransport sender updates its state infor-mation, discards buffered TPDUs thatare acknowledged, and retransmits anyTPDUs that are not acknowledged. Aprotocol that only use PACKs has noerror reporting mechanism.

If a transport sender does not receivean ACK within a reasonable timeoutperiod, it may assume something hasgone wrong and retransmits unacknowl-edged TPDU(s). With delays withinmost underlying networks being vari-able and dynamic, accurate calculationof a reasonable timeout value within thetransport layer is a most difficult prob-lem. However, accurate round-trip time(RTT) estimation is essential [Zhang1986]. The basic idea is to keep an RTTestimate by measuring the time be-tween sending a TPDU and receiving itsACK. Research has shown the impor-tance of not updating this estimate forretransmitted TPDUs, since it is impos-sible to determine whether the receivedACK is for the original TPDU transmis-

sion or a subsequent retransmission[Karn and Partridge 1987]. Jacobson[1988]refined TCP’s retransmissiontimeout calculation by using a low-passfilter to estimate the mean, and meandeviation of the RTT. RTT estimationhas been a key area of research in TCP;an informative summary appears inStevens [1994].

4.4.3 Piggybacking. When a TPDUarrives at a transport receiver, insteadof immediately returning an ACK as aseparate control TPDU, some protocols(including TCP) artificially delay re-turning an ACK hoping the user re-ceiver will soon submit its next messageto be sent as part of the reverse direc-tion data flow. When this occurs, theACK is piggyback-ed as header informa-tion on the reverse direction dataTPDU. This is an example of transportlayer concatenation (see Section 4).

4.4.4 Cumulative vs. Selective Ac-knowledgment. A common type ofPACK is the cumulative PACK. Eachcumulative PACK carries a sequencenumber indicating that all TPDUs withlower9 sequence numbers have been re-ceived. With cumulative PACKs, even ifPACKi is lost, the arrival of a laterPACKj such that j . i, positively ac-knowledges all TPDUs up to TPDUj.The later cumulative PACK incorpo-rates the information of the previouslylost one.10

A disadvantage of cumulative PACKsoccurs when TPDUi is lost andTPDUi11, TPDUi12, . . . , arrive intactand are buffered. Each PACK followingthe loss can acknowledge only TPDUsup to and including TPDUi21. Thus, thetransport sender may not receive timelyfeedback on the success of TPDUs sent

8For clarity in this paper, PACK refers only to apositive ACK, and ACK refers to either a PACK orNACK.

9This is TCP’s definition; some protocols use “low-er or equal”.10Most implementations allow sequence numberwraparound. For simplicity, we assume that j . iimplies that TPDUj is later in the data flow thanTPDUi.



after TPDUi, and retransmit them un-necessarily.

The second PACK type in its mostbasic form is the selective PACK, whichacknowledges exactly one TPDU. Withselective PACKs, a transport receiverinforms the transport sender about eachTPDU that arrives successfully, so thetransport sender can more likely re-transmit only those TPDUs that haveactually been lost. Selective PACKs areused by NETBLT and VMTP.

A block PACK is a variation of selec-tive PACK, where blocks of individualTPDUs are selectively acknowledged[Brown et al. 1989]. For example, anXTP ACK is actually a series of blockPACKs. If TPDUs {1, 2, 4, 5, 6, 9, 10}arrive and {3, 7, 8} are missing, then theblock PACK would look like 1-2;4-6;9-10.

In part due to the influence of innova-tive acknowledgment schemes in proto-cols such as XTP, NETBLT, and VMTP,TCP with selective acknowledgment(called SACK) recently has been pro-posed. SACK combines selective (block)and cumulative PACKs within TCP[Braden and Jacobson 1988; Floyd 1996;Mathis et al. 1996]. One simulationstudy shows the strength of TCP imple-mentations with vs. without selectivePACKs [Fall and Floyd 1996].

4.4.5 Retransmission Strategies.When a transport sender transmitsTPDUi and determines later that itmay not have arrived at the transportreceiver, two retransmission strategiesare possible. (This determination canresult when the transport sender doesnot receive a PACK within a predeter-mined timeout period, or when it re-ceives back-to-back cumulative PACKsthat are identical.) A conservative ap-proach has the transport sender re-transmit selectively only TPDUi andwait for a PACK with sequence numberlarger than previous PACKs. This selec-tive repeat approach is used in SNR[Doshi et al. 1993], and avoids retrans-mitting correctly received TPDUs [Feld-meier and Biersack 1990].

A more aggressive transport senderretransmits TPDUi and all TPDUs al-ready sent after TPDUi. This Go-Back-N approach is fairly simple butdecreases channel utilization by poten-tially retransmitting correctly-receivedTPDUs. These unnecessary retransmis-sions then add to network congestion.The protocol SMART combines good fea-tures of selective repeat and Go-Back-Nby having the transport receiver returnboth a cumulative PACK, and a selec-tive PACK for the TPDU that most re-cently arrived [Keshav and Morgan1997]. This combined ACK informationhelps the transport sender avoid unnec-essary retransmissions. Further workon innovative retransmission strategiesfor use over wireless networks can befound in Balakrishnan et al. [1996] (seeSection 6). Both selective repeat andGo-Back-N also apply to data link layercommunication, and are thoroughly an-alyzed in Stallings [1997], Tanenbaum[1996], and Walrand [1991].

4.4.6 Sender-dependent vs. Sender-in-dependent Acknowledgment. In someprotocols, transport receiver ACKs aregenerated in response to explicit or im-plicit events initiated by the transportsender. These ACKs are called sender-dependent [Doeringer et al. 1990; Thaiet al. 1994]. In XTP, no ACKs are gen-erated until the transport sender explic-itly instructs the transport receiver togenerate one. Some transaction-ori-ented protocols such as VMTP try tominimize the number of control TPDUsexchanged by using the response as animplicit PACK for a transmitted re-quest, and a new request as an implicitPACK for the previous response.

Sender-independent ACKs are thosegenerated by a transport receiver inde-pendent of actions by the transportsender. SNR/ESNR [Doshi et al. 1993;Netravali et al. 1990] uses an error re-porting and recovery scheme called Pe-riodic State Exchange, where completestate information is exchanged periodi-cally between a transport sender and atransport receiver based on timers, not



the arrival or loss of TPDUs. If a controlTPDU containing state information islost, no immediate steps are taken.Eventually, after another period thecomplete state information containinginformation in the lost control TPDUwill be exchanged. This approach sim-plifies the error recovery mechanismand reduces the number of error recov-ery timers. With parallel processors,one processor at the receiver can bededicated to processing state informa-tion and inserting control TPDUs whilea second focuses on processing TPDUs.

4.4.7 Duplicate Detection. Sequencenumbers are also the basic mechanismfor duplicate detection. If an ACK is lostand as a result one or more TPDUs areretransmitted, duplicate TPDUs can ar-rive at the transport receiver. In mostCO protocols, the transport receivermaintains state information about pre-viously received TPDUs. (A less com-mon approach to duplicate detectionuses synchronized clocks instead ofstate information [Liskov et al. 1990]).When a duplicate TPDU is received, thestate information allows it to be de-tected. In CL protocols, where no stateinformation is maintained, duplicate de-tection is not possible and the typicaloffered service is maybe-loss, maybe-du-plicates. In CL protocols that use ACKsand retransmissions, the offered serviceis limited to no-loss, maybe-duplicatesservice.

When a duplicate is received after aconnection close and subsequent recon-nection between the same two transportentities, duplicate detection becomes adelicate problem. In this case, the dupli-cate may accidentally be consideredoriginal data for the second connection.Several solutions exist for this delayedduplicate problem. First, each transportentity can remember the last sequencenumber used for each terminated con-nection (i.e., maintain state even afterdisconnection). If a connection is re-es-tablished, the next sequence number isused. A design problem involves decid-ing how long each transport entity must

maintain the state information. A sec-ond approach uses a unique connectionidentifier for each connection. Both ap-proaches work fine, unless the systemloses the sequence number or connec-tion identifier information, say due to asystem crash. A third approach requiresa minimum amount of time before aconnection between the same two trans-port entities can be reestablished, andenforces a maximum lifetime of anyTPDU to be less than that minimum.This so-called TIME-WAIT approach isused in TCP; it has the disadvantage ofintroducing artificial delays betweenconsecutive connections.

4.4.8 Forward Error Correction. Be-cause each retransmission requires atleast one round-trip time, PAR andARQ schemes may operate poorly forapplications that have tight latency con-straints. As an alternative, Forward Er-ror Correction (FEC) can be used for lowlatency error control, with or withoutPAR/ARQ. In TP11, a transport sendercan transmit each TSDU as k data TP-DUs and an additional h redundant par-ity TPDUs. Unless the network losesmore than h of the h1k TPDUs sent,the transport receiver can reconstructthe original k data TPDUs.

4.5 Flow/Congestion Control

The terms flow control and congestioncontrol create confusion, since differentauthors approach these subjects fromdifferent perspectives. In this sectionwe try to make a useful distinction be-tween the terms while recognizing thatoverlap exists.

First, we define transport layer flowcontrol as any scheme by which thetransport sender limits the rate atwhich data is sent over the network.The goals of flow control may includeone or both of the following: (1) prevent-ing a transport sender from sendingdata for which there is no availablebuffer space at the transport receiver,or (2) preventing too much traffic in theunderlying network. Flow control for (2)



also is called congestion control, or con-gestion avoidance. Congestion is essen-tially a network layer problem, and doz-ens of schemes are discussed andclassified in Yang and Reddy [1995].However, congestion often is addressedby transport layer flow control (alsoknown as end-point flow control).

Techniques used to implement bothgoals are often tightly integrated, as isthe case in TCP. Regardless of whichgoal is being pursued, the flow of data isbeing controlled. Therefore, some au-thors use “flow control” to refer to proto-col features that address both goals,thus blurring the distinction betweenflow and congestion control [Stevens1994, p. 310]. Bertsekas and Gallager[1992] state that flow and congestioncontrol overlap so much that it is betterto treat them as a single problem; how-ever, he discusses flow control as a ma-jor network layer function. Other au-thors [Stallings 1997; Tanenbaum 1996;Walrand 1991] clearly separate conges-tion control from flow control. Feld-meier states that congestion control canbe viewed as a generalized n-dimen-sional version of flow control, and sug-gests that flow control for all layersshould be performed at the lowest layerthat requires flow control [Feldmeier1993a]. In this paper, we use flow con-trol as the overall term for such tech-niques, but emphasize that (1) flow con-trol is only one example of manypossible congestion control techniques,and (2) congestion control is only one ofthe two motivations for flow control.

We now present a discussion of gen-eral flow control techniques, and a dis-cussion of how these techniques combatnetwork congestion.

4.5.1 General Flow Control Tech-niques. Transport layer flow control ismore complex than network or data-linklayer flow control. This is because stor-age and forwarding at intermediaterouters causes highly variable delaysthat are generally longer than actualtransmission time [Stallings 1997].Transport layer flow control usually is

provided in tandem with error control,by using sequence numbers and win-dowing techniques. Some authors, how-ever, claim that error control and flowcontrol have different goals and shouldbe separated in high-speed network pro-tocol design [Feldmeier 1990; La Portaand Schwartz 1992; Lundy and Tipici1994].

Two techniques may be used, eitheralone or together, to avoid network con-gestion and overflowing receiver buff-ers. These are: (1) window flow controland (2) rate control.

In window (or sliding-window) flowcontrol, the transport sender continuessending new data as long as there re-mains space in the sending window. Ingeneral, this window may be fixed orvariable in size. In fixed size windowcontrol, ACKs from the transport re-ceiver are used to advance the transportsender’s window.

In variable size window control, alsocalled a credit scheme, ACKs are de-coupled from flow control. A TPDU maybe acknowledged without granting thetransport sender additional credit tosend new TPDUs, and additional creditcan be given without acknowledging aTPDU. The transport receiver adopts apolicy concerning the amount of data itpermits the transport sender to trans-mit, and advertises the size of this win-dow in TPDUs that flow from receiver tosender.

Early experience with TCP showed aproblem, known as the Silly WindowSyndrome, that can afflict protocols us-ing the credit scheme. The silly windowsyndrome can start either because thetransport receiver advertises a verysmall window, or the transport sendersends a very small TPDU. Clark [1982]describes how this can degenerate into avicious cycle where the average TPDUsize ends up being much smaller thanthe optimal case, and throughput suf-fers as a result. TCP includes mecha-nisms in both the transport sender andtransport receiver to avoid the condi-tions that lead to this problem.



Credit schemes may be described asconservative or aggressive (optimistic).The conservative approach, as used inTCP, only allows new TPDUs up to thetransport receiver’s available bufferspace. This approach has the disadvan-tage that it may limit a transport con-nection’s throughput in long delay orlarge bandwidth-product situations.The aggressive approach attempts to in-crease throughput by allowing a trans-port receiver to optimistically grantcredit for space it does not have [Stall-ings 1997]. A disadvantage of the ag-gressive approach is its potential forbuffer overflow at the transport receiver(hence, wasted bandwidth) unless thecredit-granting mechanism is carefullytimed [Clark 1982].

Rate control uses timers at the trans-port sender to limit data transmission[Cheriton 1988; Clark et al. 1987;Weaver 1994]. Either a transportsender can be assigned (1) a burst sizeand interval (or burst rate), or (2) aninterpacket delay time. In (1), a trans-port sender transmits a burst of TPDUsat its maximum rate, then waits for thespecified burst interval before transmit-ting the next burst (e.g., NETBLT andXTP). This is often modeled as a token-bucket scheme. In (2), a transportsender transmits data as long as it hascredit available, but artificially pausesbetween each TPDU according to theinterpacket delay (e.g., VMTP [Cheritonand Williamson 1989]). This is oftenmodeled as a leaky-bucket scheme. Case(2) improves performance becausespreading TPDU transmissions helpsavoid network congestion and receiveroverflow. However, interpacket delay al-gorithms are more difficult to imple-ment because of the timers needed.

Rate control schemes involve low net-work overhead since they do not ex-change control TPDUs except to occa-sionally adjust the rate in response to asignificant change in the network or thetransport receiver. They also do not af-fect the way data ACKs are managed[Doeringer et al. 1990]. Some authorsargue that implementing window con-

trol and rate control together as in XTPand NETBLT will achieve better perfor-mance than implementing only one ofthem [Clark et al. 1987; Doeringer et al.1990; La Porta and Schwartz 1992;Sanders and Weaver 1990].

One innovative approach imple-mented in TP11 involves backpressure,where the application, transport, andnetwork layers all interact. If a userreceiver reads from the transport re-ceiver at a rate slower than the trans-port sender is sending, the transportreceiver’s buffers eventually will fill up.The transport receiver explicitly trig-gers congestion control procedureswithin the network. The network senderin turn refuses additional TPDUs fromthe transport sender. The transportsender may then exercise backpressureon the sending application by refusingto accept any more new APDUs, untilnotice arrives that the transport receiv-er’s buffers have started to empty[Stevens 1994]. This method will notwork if multiple transport connectionsare multiplexed on a single networkconnection [Stallings 1997].

4.5.2 Flow Control Techniques for Ad-dressing Congestion. Congestion con-trol schemes have two primary objec-tives: fairness, and optimality.

Fairness involves sharing resourcesamong users (individual data flows)fairly. Since transport entities lackknowledge about general network re-sources, the bottleneck(s) itself is at abetter position to enforce fairness. Thetransport layer’s role is limited to en-suring that all transport senders coop-erate. For example, TCP’s congestioncontrol technique assumes that all TCPentities will “back-off” in a similar man-ner in the presence of congestion. Un-fortunately, it only works in a coopera-tive environment. Today’s Internet isexperiencing congestion difficulties be-cause of many non-TCP-conformanttransport entities acting selfishly. Mod-ifications to Random Early Dropping(RED) algorithms have been proposed to



counter these selfish TCP implementa-tions [Floyd and Jacobson 1993].

Optimality involves ensuring that allrouters operate at optimal levels. Trans-port entities have a role to play by lim-iting the traffic sent over any routerthat becomes a bottleneck. A router canmonitor itself and send explicit accesscontrol feedback to the traffic sources[Prue and Postel 1987; Ramakrishnanand Jain 1988]. Alternatively, transportentities can use implicit access controland detect potential bottlenecks fromtimeouts and delays derived from theACKs received [Ahn et al. 1995; Jain1989]. A third method is to use packet-pair flow control which also has theadvantage of ensuring that well-be-haved users are protected from ill-be-haved users because of the use of round-robin schedulers [Keshav 1991; 2000].

Explicit access control has been im-plemented in XTP and TP11. XTP im-plements both end-to-end flow controland explicit access congestion control.Initially, default parameters control thetransport sender’s transmission rate.However, during the exchange of controlinformation, the network can modifythe flow control parameters to controlcongestion. Furthermore, network rout-ers can explicitly send control PDUs totransport entities to control the conges-tion. TP11 uses the previously men-tioned backpressure both to avoid over-whelming a slow receiver and to controlcongestion.

When the network layer provides nosupport for explicit access control, thetransport protocol may operate on theassumption that any TPDU loss indi-cates network congestion — a reason-able assumption when the underlyingnetwork links are highly reliable. TCP’sapproach to congestion avoidance incor-porates two important design princi-ples. The first principle is slow-start,where new connections do not initiallysend at the full available bandwidth,but rather first send one TPDU perRTT, then two, then four, etc., (expo-nential increase per RTT), until the fullbandwidth is reached, or a loss is de-

tected. The second principle is thatwhen loss is encountered, indicating po-tential congestion, the transport senderreduces its sending rate significantly,and then uses an additive increase perRTT in an attempt to find an optimalsending rate. A goal of TCP is thatnetwork bandwidth is shared fairlywhen all TCP connections abide by bothprinciples. Details of these principleshave been refined over various imple-mentations of TCP, which go by thenames Tahoe, Reno, and Vegas [Ahn etal. 1995; Fall and Floyd 1996; Jacobson1988].

Using a control-theoretic approach,congestion control schemes can beviewed as a control policy to achieveprescribed goals (e.g., minimize round-trip delay). These schemes are dividedinto open loop (e.g., rate control andimplicit access control schemes), wherecontrol decisions by the transportsender do not depend on feedback fromthe congested sites, and closed loop(e.g., explicit access control schemes),where control decisions do depend onsuch feedback. With networks havinghigher speeds and higher bandwidth-delay products, open loop schemes havebeen proposed as being more effective.Many current admission controlschemes, which can be called regulationschemes, are open loop.

4.6 Multiplexing/Demultiplexing

Multiplexing, as shown in Figure 5(a),supports several transport layer connec-tions using a single network layer asso-ciation. Multiplexing maps several user/transport interface points (what ISOcalls TSAPs) onto a single transport/network interface point (NSAP). An as-sociation is a virtual circuit in the caseof a CO network. In the case of a CLnetwork, an association is a pair of net-work addresses (source and destina-tion). When the underlying network isCL (as is the case for TCP), multiplex-ing/demultiplexing as provided by TCP’sport numbers is necessary to serve mul-tiple transport users.



Multiplexing uses network layer re-sources more efficiently by reducing thenetwork layer’s context-state informa-tion. Additionally, it can provide primi-tive stream synchronization [Feldmeier1990]. For example, video and audiostreams of a movie can be multiplexedto maintain “lip sync,” in which case themultiplexed streams are handled as asingle stream in a single, uniform man-ner. This method works only if allstreams require the same network layerQoS. Multiplexing’s advantages come atthe expense of having to demultiplex TP-DUs at the transport receiver and to en-sure fair sharing of resources among themultiplexed connections. Demultiplexingrequires a look-up operation to identifythe intended receiver and possibly extraprocess switching and data copying.

4.7 Splitting/Recombining

Splitting (or downward-multiplexing[Walrand 1991]) as shown in Figure 5(b)is the opposite of multiplexing. In thiscase, several network layer associationssupport a single transport layer connec-tion. Splitting provides additional resil-ience against a network failure and po-tentially increases throughput byletting a fast processor output data overseveral slower network connections[Strayer and Weaver 1988].

4.8 Concatenation/Separation

Concatenation, as shown in Figure 6,combines several TPDUs into one net-work service data unit (NSDU), thusreducing the number of NSDUs submit-

ted to the network layer. The transportreceiver has to separate the concate-nated TPDUs. This mechanism savesnetwork bandwidth at the expense ofextra processing time at the transportsender and receiver. Considering thatthe transport receiver is often the bot-tleneck in protocol processing and thatavailable bandwidth is increasing, someauthors argue that concatenation/sepa-ration and splitting/recombining mecha-nisms are becoming less important[Thai et al. 1994].

4.9 Blocking/Unblocking

Blocking combines several TSDUs intoa single TPDU (see Figure 7(a)), therebyreducing the number of transport layerencapsulations and the number of NS-DUs submitted to the network layer. Thereceiving transport entity has to unblockor deblock the blocked TSDUs before de-livering them to transport service user.

Blocking/unblocking is intended tosave network bandwidth. The effect onprocessing time varies with the particu-lar implementation. A protocol that hasto do both blocking and unblocking, sayto preserve TSDU boundaries, mightadd processing time. However, TCP’sform of blocking, described by the NagleAlgorithm [Nagle 1984], actually mayreduce processing time at the transportreceiver, since it reduces the number ofTCP headers which must be processed.TCP’s transport receiver unblocks, butdoes not preserve TSDU boundaries.This is consistent with TCP’s CO-byteservice.

NSAPTSAPs

Flow of NPDUs

Application(Session)

Transport

Network

Host A Host B

... ...

NSAPsTSAP

Flows of NPDUs

Host A Host B

... ...

(a) (b)

Figure 5. (a) Multiplexing/demultiplexing(b) Splitting/recombining.



Although blocking and concatenationare similar (both permit grouping ofPDUs), they may serve different pur-poses. As one subtle difference, concate-nation permits the transport layer togroup control (e.g., ACK) TPDUs withdata TPDUs that were derived from TS-DUs. Piggybacking ACKs and data is anexample of concatenation. Blocking onlycombines TSDUs (i.e., transport layerdata). Segmentation and concatenationmay occur, but blocking, as definedwithin the ISO Reference Model, in CLprotocols is not permitted [ITU-T 1994b].

4.10 Segmentation/Reassembly

Often the network service imposes amaximum permitted NSDU size. Insuch cases, a larger TSDU is segmentedinto smaller TPDUs by the transportsender as shown in Figure 7(b). Thetransport receiver reassembles the TSDUbefore delivering it to the user receiver.

TCP segmentation is based on negoti-ating a maximum segment size duringconnection establishment, or in morerecent implementations, on a Path Max-imum Transmission Unit (MTU) discov-ery mechanism. To avoid the overheadassociated with network layer fragmen-tation, many TCP implementations de-termine the largest size IP datagramthat can be transmitted end-to-endwithout IP fragmentation [Kent and

Mogul 1987]. These implementationsthen use transport layer segmentation(and/or blocking) with this MTU inmind, thereby reducing11 the risk fornetwork layer segmentation [Stevens1994, p. 340].

5. TRANSPORT PROTOCOL EXAMPLES

This section now summarizes elevenwidely implemented or particularly in-fluential transport protocols other thanTCP. Each one’s features are describedin terms of the service and protocol fea-tures discussed in general in Sections 3and 4, respectively. The protocol de-scriptions are summarized from the in-dicated references and from direct cor-respondence with those people mostresponsible for each protocol’s designand development. Together these proto-cols are summarized in the Appendix inTables II–IV. Each of these tables aredivided into three parts: (a) general fea-tures, (b) service features, and (c) proto-col features.

The last part of the section provides abrief statement about eight experimen-tal transport protocols that appear inthe literature.

11Due to dynamic route changes, the MTU canchange with TCP; thus IP fragmentation can beavoided but not prevented.

NSDU

TPDU TPDU

Sender Receiver

TransportLayer

NSDU

TPDU TPDU

NetworkLayer

... ...

Figure 6. Concatenation and separation.

Sender Receiver

TransportLayer TPDU

TSDU TSDU TSDU

TPDU TPDU

Sender Receiver

...

TPDU

TSDU TSDU...

...

TSDU

TPDU TPDU...

(a) (b)

Figure 7. (a) Blocking/unblocking (b) Segmentation/reassembly.



5.1 UDP

User Datagram Protocol (UDP) (see Ta-ble II) provides CL, uncontrolled-loss,maybe-duplicates, unordered service[Postel 1980]. UDP is basically an inter-face to IP, adding little more than TSAPdemultiplexing (port numbers) and op-tional data integrity service. By not pro-viding reliability service, UDP’s over-head is significantly less than that ofTCP. Although UDP includes an optionalchecksum, there is no provision for errorreporting; incoming TPDUs with check-sum errors are discarded, and valid onesare passed to the user receiver.

5.2 TP4

The ISO Transport Protocol Class 4 (orTP4) (see Table II) was designed for thesame reasons as TCP. It provides a sim-ilar CO, reliable service over a CL unre-liable network. TP4 relies on the samemechanisms as TCP with the followingdifferences. First, TP4 provides a CO-message service rather than CO-byte.Therefore, sequence numbers enumer-ate TPDUs rather than bytes. Next, se-quence numbers are not initiated from aclock counter as in TCP [Stevens 1994],but rather start from 0. A destinationreference number is used to distinguishbetween connections. This number issimilar to the destination port numberin TCP, but here the reference numbermaps onto the port number and can bechosen randomly or sequentially [Bert-sekas and Gallager 1992]. Another im-portant difference is that (at least intheory) a set of QoS parameters (seeSection 3) can be negotiated for a TP4connection. Other differences betweenTCP and TP4 are discussed in Piscitelloand Chapin [1993].

5.3 TP0

The ISO Transport Protocol Class 0 (orTP0) (see Table II) was designed as aminimum transport protocol providingonly those functions necessary to estab-lish a connection, transfer data, andreport protocol errors. TP0 was de-

signed to operate on top of a CO reliablenetwork service that also provides end-to-end flow control. TP0 does not evenprovide its own disconnection proce-dures; when the underlying networkconnection closes, TP0 closes with it.One interesting use of TP0 is that it canbe employed to create an OSI transportservice on TCP’s reliable byte-streamservice, enabling OSI applications torun over TCP/IP networks [Rose andCass 1987].

5.4 NETBLT

Network Block Transfer (NETBLT) (seeTable III) was developed at MIT forhigh throughput bulk data transfer[Clark et al. 1987]. It is optimized tooperate efficiently over long-delay links.NETBLT was designed originally to op-erate on top of IP, but can operate ontop of any network protocol that pro-vides a similar CL unreliable networkservice. Data exchange is realized viaunidirectional connections. The unit oftransmission is a buffer, several ofwhich can be concurrently active tokeep data flowing at a constant rate.Connection is established via a 2-way-handshake during which buffer, TPDUand burst sizes are negotiated. Flowcontrol is accomplished using buffers(transport-user-level control) and ratecontrol (transport-protocol-level con-trol). Either transport user of a connec-tion can limit the flow of data by notproviding a buffer. Additionally, NET-BLT uses burst size and burst rate pa-rameters to accomplish rate control.NETBLT uses selective retransmissionfor error recovery. After a transportsender has transmitted a whole buffer,it waits for a control TPDU from thetransport receiver. This TPDU can be aRESEND, indicating lost TPDUs, or anOK, acknowledging the whole buffer. AGO allows the transmission of anotherbuffer. Instead of waiting after eachbuffer, a multiple buffering mechanismcan be used [Dupuy et al. 1992].



5.5 VMTP

The Versatile Message Transaction Pro-tocol (VMTP) (see Table III) was de-signed at Stanford University to providehigh performance communication ser-vice for distributed operating systems,including file access, remote procedurecalls (RPC), real-time datagrams andmulticast [Cheriton and Williamson1989]. VMTP is a request-response pro-tocol that uses timer-based connectionmanagement to provide communicationbetween network-visible entities. Eachentity has a 64-bit identifier that isunique, stable, and independent of host-address. The latter property allows enti-ties to be migrated and handled inde-pendent of network layer addressing,facilitating process migration and mo-bile and multihomed hosts [Cheritonand Williamson 1989]. Each request(and response) is identified by a trans-action identifier. In the common case, aclient increments its transaction identi-fier and sends a request to a singleserver; the server sends back a responsewith the same (Client, Transaction)identifier. A response implicitly ac-knowledges the request, and each newrequest implicitly acknowledges the lastresponse sent to this client by theserver. Multicast is realized by sendingto a group of servers. Datagram supportis provided by indicating in the requestthat no response is expected. Addition-ally, VMTP provides a streaming modein which an entity can issue a stream ofrequests, receiving the responses backasynchronously [Williamson and Cheri-ton 1989]. Flow control is achieved by arate control scheme borrowed fromNETBLT with negotiated interpacketdelay time.

5.6 T/TCP

Transaction TCP (T/TCP) (see Table III)is a backwards-compatible extension ofTCP that provides efficient transaction-oriented service in addition to CO ser-vice [Braden 1992a; 1994]. The goal ofT/TCP is to allow each transaction to be

efficiently performed as a single incar-nation of a TCP connection. It intro-duces two major improvements overTCP. First, after an initial transactionis handled using a 3-way-handshakeconnection, subsequent transactionsstreamline connection establishmentthrough the use of a 32-bit incarnationnumber, called a “connection count”(CC) carried in each TPDU. T/TCP usesthe monotonically increasing CC valuesin initial CR-TPDUs to bypass the3-way-handshake, using a mechanismcalled TCP Accelerated Open. With thismechanism, a transport entity needs tocache a small amount of state for eachremote peer entity. The second improve-ment is that T/TCP shortens the delayin the TIME-WAIT12 state. T/TCP de-fines three new TCP options, each ofwhich carries one 32-bit CC value.These options accelerate connection setupfor transactions. T/TCP includes all nor-mal TCP semantics, and operates exactlyas TCP for all features other than connec-tion establishment and termination.

5.7 RTP

Real-time Transport Protocol (RTP) (seeTable III) was designed for real-timemultiparticipant multimedia applica-tions [Schulzrinne 1996; Schulzrinne etal. 1996]. Even though RTP is called atransport protocol by its designers, thissometimes creates confusion, becauseRTP by itself does not provide a com-plete transport service. RTP TPDUsmust be encapsulated within the TP-DUs of another transport protocol thatprovides framing, checksums, and end-to-end delivery, such as UDP. The maintransport layer functions performed by

12When a TCP entity performs an active close andsends the final ACK, that entity must remain inthe TIME-WAIT state for twice the MaximumSegment Lifetime (MSL), the maximum time anyTPDU can exist in the network before being dis-carded. This allows time for the other TCP entityto send and resend its final ACK in case the firstcopy is lost. This closing scheme prevents TPDUsof a closed connection from appearing in a subse-quent connection between the same pair of TCPentities [Braden 1992b].



RTP are to provide timestamps and se-quence numbers for TSDUs. Thesetimestamps and sequence numbers maybe used by an application written on topof RTP to provide error detection, rese-quencing of out-of-order data, and/or er-ror recovery. However, it should be em-phasized that RTP itself does notprovide any form of error detection orerror control. Furthermore, in additionto transport layer functions, RTP alsoincorporates presentation layer func-tions; through the use of so-called RTPprofiles, RTP provides a means for theapplication to identify the format of data(i.e., whether the data is audio or video,what compression method is used, etc.).

RTP has no notion of a connection; itmay operate over either CO or CL ser-vice. Framing and segmentation mustbe done by the underlying transportlayer. RTP is typically implemented aspart of the application and not in theoperating system kernel; it is a goodexample of Application Level Framingand Integrated Layer Processing [Clarkand Tennenhouse 1990]. It consists oftwo parts: data and control. Continuousmedia data such as audio and video iscarried in RTP data TPDUs. Controlinformation is carried in RTCP (RTPControl Protocol) TPDUs. Control TP-DUs are multicast periodically to thesame multicast group as data TPDUs.The functions of RTCP can be summa-rized as QoS monitoring, intermediasynchronization, identification, and ses-sion size estimation/scaling. The controltraffic load is scaled with the data traf-fic load so that it makes up a certainpercentage of the data rate (5%).

RTP and RTCP were designed underthe auspices of the Internet Engineer-ing Task Force (IETF).

5.8 APPN (SNA)

The first version of IBM’s proprietarySystems Network Architecture (SNA)(see Table IV) was implemented in 1974in a hierarchical, host-centric manner:the intelligence and control resided atthe host, which was connected to dumb

terminals. The growing popularity ofLAN-based systems led to SNA version2, a dynamic peer-to-peer architecturecalled Advanced Peer-to-Peer Network-ing (APPN) [Chiong 1996; Dorling et al.1997]. Version 3, called High Perfor-mance Routing (HPR), is a small butpowerful extension to APPN [Dorling1997; Freeman 1995]. HPR includesAPPN-based high-performance routing,which addresses emerging high-speeddigital transmissions. Although SNA isa proprietary architecture, its structurehas moved closer to TCP/IP with eachrevision. Today, SNA exists mostly forlegacy reasons. SNA and TCP/IP are themost widely used protocols in the enter-prise networking arena [Chiong 1996].

APPN does not map nicely onto ISO’smodular layering since it does not de-fine an explicit transport protocol perse. However, end-to-end transport layerfunctions such as end-to-end connectionmanagement are performed as part ofAPPN. Its CO service is built on top of areliable network service. Therefore,many functions related to error controlthat are commonly handled at thetransport layer are not required in AP-PN’s transport layer. Unlike applica-tions using TCP, SNA applications donot directly interface to the transportlayer. Instead, applications are based ona “Logical Unit” concept which containsthe services of all three layers abovenetwork layer. An interesting feature ofAPPN is its Class of Service(COS)-basedroute selection where the route can bechosen based on the associated cost fac-tor. The overall cost for each route de-pends on various factors such as thespeed of the link, cost per connection,cost per transaction, propagation delay,and other user-configurable items suchas security. For service and protocol de-tails of APPN, see Table IV.

HPR, an extension to APPN, en-hances data routing performance by de-creasing intermediate router process-ing. The main components of HPR areAutomatic Network Routing (ANR) andRapid Transport Protocol (RTP) [Chiong1996; Dorling et al. 1997]. Unlike



APPN, RTP is an explicit transport pro-tocol which provides a CO-message, re-liable service. Some of the key featuresof RTP include bidirectional data trans-fer, end-to-end connections without in-termediate session switching, end-to-end flow and congestion control basedon adaptive rate-based (ARB) control,and automatic switching of sessionswithout service disruption.

5.9 NSP (DECnet)

DECnet (see Table IV) is a proprietaryprotocol suite from Digital EquipmentCorporation. Early versions (1974-1982)were limited in functions, providingonly point-to-point links, or networks of# 255 processors. The first version ca-pable of building networks of thousandsof nodes was DECnet Phase IV in 1982(two years earlier than the OSI Refer-ence Model was standardized). Its layercorresponding most closely to the OSITransport layer was called the “EndCommunication Layer”; it consisted of aproprietary protocol called the NetworkServices Protocol (NSP).

NSP provides CO-message and reli-able service. It supports segmentationand reassembly, and has provisions forflow control and congestion control[Robertson 1996]. Two data channelsare provided: a “Normal-Data” channeland an “Other-Data” channel which car-ries expedited data and messages re-lated to flow control. DEC engineersclaim that OSI’s TP4 “is essentially anenhancement and refinement of Digi-tal’s proprietary NSP protocol” [Martinand Leben 1992]. Indeed, there aremany similarities between TP4 andNSP, such as their mechanisms for han-dling expedited data.

In the early 1990s, DECnet Phase Vwas introduced to integrate OSI proto-cols with the earlier proprietary proto-cols. Phase V supports a variant of TP4,plus TP0 and TP2 and NSP for back-wards compatibility. A session level ser-vice known as “tower matching” re-solves which protocol(s) are availablefor a particular connection, and selects

a specific protocol at connection estab-lishment.

DECnet was widely used throughoutthe 1980’s, and remains an importantlegacy protocol for many organizations.DECnet was one of the two target proto-col suites for the original implementa-tion of the popular X11 windowing sys-tem (the other being TCP/IP).

5.10 XTP

The Xpress Transport Protocol’s design(XTP Version 4.013) (see Table IV) wascoordinated within the XTP Forum tosupport a variety of applications such asmultimedia distribution and distributedapplications over WANs as well asLANs [Strayer et al. 1992]. OriginallyXTP was designed to be implemented inVLSI; hence it has a 64-bit alignment, afixed-size header, and fields likely tocontrol a TPDU’s initial processing lo-cated early in the header. However, nohardware implementations were everbuilt. XTP combines classic functions ofTCP, UDP, and TP4, and adds new ser-vices such as transport multicast, multi-cast group management, priorities, rateand burst control, and selectable errorand flow control mechanisms. XTP canoperate on top of network protocols suchas IP or ISO CLNP, data link protocolssuch as 802.2, or directly on top of theAAL of ATM. XTP simply requires fram-ing and end-to-end delivery from theunderlying service. One of XTP’s mostimportant features is the orthogonalityit provides between communication par-adigm, error control and flow control.An XTP user can choose any communi-cation paradigm (CO, CL, or transac-tion-oriented) and whether or not to en-able error control and/or flow control.XTP uses both window-based and rate-based flow control.

13XTP version 3.6 (the Xpress Transfer Protocol)was a transport and network layer protocol com-bined. XTP 4.0 performs only transport layer func-tions.



5.11 SSCOP/AAL5 (ATM)

In the ATM environment, several trans-port protocols have been and are beingdeveloped; they are referred to as ATMAdaptation Layers (AALs).14 TheseAALs have been specified over time tohandle different traffic classes [Stall-ings 1998] - CO constant bit rate (CBR)requiring synchronization, CO variablebit rate (VBR) requiring synchroniza-tion, CO-VBR not needing synchroniza-tion, CL-VBR not requiring synchroni-zation, and most recently available bitrate (ABR) where bandwidth and timingrequirements are defined by the user inan environment where an ATM back-bone provides the same quality of ser-vice as found in a LAN [Black 1995].

Although 5 transport protocols(AAL1-5) have been proposed for thesetraffic classes, AAL5 which supportsvirtually all data applications hasgreatest potential for marketplace suc-cess. None of the AALs including AAL5provide reliable service (although withminor changes AAL5 could).15 Anotherprotocol that does provide end-to-endreliability is the Service Specific Connec-tion-Oriented Protocol (SSCOP) [ITU-T1994a] which can run on top of AAL5.

SSCOP was initially standardized toprovide for reliable communication ofcontrol signals [ITU-T 1995b], not datatransfer. It has since been approved forreliable data transfer with I.365.3[ITU-T 1995a] specifying it as the basisfor providing ISO’s connection-orientedtransport service [ITU-T 1995c]. Some

authors cite SSCOP as potentially appli-cable as a general purpose end-to-endreliable transport protocol in the ATMenvironment [Henderson 1995]. Forthese reasons, we select SSCOP/AAL5as the ATM transport protocol to surveyin Table IV.

SSCOP incorporates a number of im-portant design principles of other high-speed transport protocols (e.g., SNR[Lundy and Tipici 1994]) including theuse of complete state exchange betweentransport sender and receiver to reducereliance on timers. SSCOP’s main draw-back for general usage is that it is de-signed to run over an underlying ATMservice that provides in-order PDU de-livery. The SSCOP receiver assumesthat every out-of-order PDU indicates aloss (as opposed to possible misordering)and immediately returns a NACK(called a USTAT) which acts as a selec-tive reject (see Section 4). Over an unor-dered service, SSCOP would still cor-rectly provide its advertised reliableservice; it is simply unclear whetherSSCOP would be efficient in an environ-ment where the underlying service mi-sorders PDUs. One author indicatesthat SSCOP could be modified to be amore general robust transport protocolcapable of running above an unreliable,connectionless service [Henderson 1995].

5.12 Miscellaneous Transport Protocols

The following are brief descriptions ofeight experimental transport protocolsthat influenced transport protocol de-velopment. While many of the ideas con-tained in these protocols are innovativeand useful, for one reason or another,they have not themselves been success-ful in the marketplace. This may be theresult of: (1) the marketing problem ofintroducing new transport protocols intoan existing infrastructure, (2) the pri-marily university nature of the protocolwithout added industrial support, or (3)the failure of an infrastructure forwhich the protocol has been specificallydesigned.

Although these protocols were surveyed

14Some authors note that since none of the AALsprovide reliable service, then “It is not really clearwhether or not ATM has a transport layer.”[Tanenbaum 1996]. We argue that reliability orthe lack thereof is not the issue; since ATM vir-tual circuits/paths are a concatenation of linksbetween store-and-forward NPDU (cell) switches,then by definition, the AAL protocols above ATMprovide end-to-end transport service.15Because of AAL’s lack of reliable service and thepopularity of TCP/IP, there exists a protocol stackwith TCP over IP over AAL over ATM. This stackessentially puts a transport layer over a networklayer over a transport layer over a network layerto provide reliable data transfer in a mixedTCP/IP - ATM environment [Cole et al. 1996].



as were TCP and the previous eleven,space limitations prevent presentingthem in detail. The tables that summa-rize these protocols are available at: www.eecis.udel.edu/˜amer/PEL/survey/ .

Delta-t was designed in the late 1970sfor reliable stream and transaction-ori-ented communications on top of a besteffort network such as IP or ISO CLNP[Watson 1989]. Its main contribution is inconnection management: it achieves haz-ard-free connection management withoutexchanging explicit control TPDUs.

SNR was designed in the late 1980s fora high speed data communications net-work [Doshi et al. 1993; Lundy and Tipici1994]. Its key ideas are periodic exchangeof state information between the trans-port sender and the transport receiver,reduction of protocol overhead by makingdecisions about blocks of TPDUs insteadof individual ones (packet blocking), andparallel implementation of the protocol ona dedicated front-end communicationsprocessor. SNR uses three modes of oper-ation: one for virtual networks, one forreal-time applications over reliable net-works, and one for large file transfers.

The MultiStream Protocol (MSP), afeature-rich, highly flexible CO trans-port protocol designed in the early1990s, supports applications that re-quire different types of service for dif-ferent portions of their traffic (e.g., mul-timedia). Transport users candynamically change modes of operationduring the life of a connection withoutloss of data [La Porta and Schwartz1993a; 1993b]. MSP defines seven traf-fic streams, each of which is defined bya set of common protocol functions.These functions specify the order inwhich TPDUs will be accepted andpassed to the user receiver.

TP11 was developed in the early1990s. It is intended for a heterogeneousinternetwork with a large bandwidth-de-lay product [Biersack et al. 1992; Feld-meier 1993b]. It uses backpressure forflow and congestion control, and is de-signed to carry three major applicationclasses: constrained latency service,transactions, and bulk data transfer.

DTP, a slightly modified version ofTCP, was developed in the early 1990s[Sanghi and Agrawala 1993]. Some keyfeatures that distinguish DTP from TCPare its use of a send-time controlscheme for flow control, selective as wellas cumulative PACKs, and TPDU-basedsequence numbers.

Partial Order Connection (POC) wasrecently designed to offer a middleground between the ordered/no-loss ser-vice provided by TCP, and the unor-dered/uncontrolled-loss service providedby UDP [Amer et al. 1994]. The maininnovation of POC is the introduction ofpartially-ordered/controlled-loss service[Conrad Ph.D dissertation; 1996; Ma-rasli 1997; Marasli et al. 1997].

Two recent transport protocols, thek-Transmit Protocol (k-XP) and theTimed-Reliable Unordered Message Pro-tocol (TRUMP), provide CO-message,unordered, controlled-loss service. k-XPwas implemented as an application soft-ware library that provides several majorenhancements to the basic service pro-vided by UDP, such as connection man-agement and controlled-loss delivery[Amer et al. 1997]. TRUMP is a varia-tion of k-XP that allows applications tospecify an expiration time for eachTSDU [Golden 1997].

6. FUTURE DIRECTIONS AND CONCLUSION

In this section, we first summarize howseveral recent trends and technologicaldevelopments have impacted transportlayer design. We then cover a particulartrend in more detail; namely, the im-pact of wireless networking on thetransport layer. Next, we enumeratesome of the debates concerning trans-port layer design. We conclude with afew final observations.

6.1 Impacts of Trends and NewTechnologies

Several trends have influenced the de-sign of transport protocols over the lasttwo decades.

Faster satellite links and gigabit net-



works have resulted in networks withlarger end-to-end bandwidth-delayproducts. These so-called “Long FatNetworks” allow increased amounts ofdata in transit at any given moment.This required extensions to TCP, forexample, to prevent wrapping of se-quence numbers [Borman et al. 1992].

Fiber optics has improved the qualityof communication links, thus shiftingthe major source of network errors —atleast for wired networks— from line biterrors to NPDU losses due to conges-tion. This fact is exploited by TCP’sCongestion Avoidance and Fast Re-transmit and Recovery algorithmswhich were made necessary because ofthe exponential growth in the use of theInternet — even before the Web madethe Internet a household word. Conges-tion avoidance continues to be an activeresearch area for the TCP community[Brakmo et al. 1994; Jacobson 1988;Stevens 1997].

Higher speed links and fiber opticshave caused some designers of so-calledlight-weight transport protocols to shifttheir design goals from minimizingtransmission costs (at the expense ofprocessing power) to minimizing pro-cessing requirements (at the expense oftransmission bandwidth) [Doeringer etal. 1990]. On the other hand, the prolif-eration of relatively low speed point-to-point (PPP) and wireless links has re-sulted in a varying and complexinterconnection of low, medium, andhigh speed links with varying degrees ofloss. This has introduced a new compli-cation into the design of flow controland error control algorithms.

New applications such as transactionprocessing, audio/video transmission, andthe Web have resulted in new and widelyvarying network service demands.

Over the years, TCP has been opti-mized for a particular mix of applica-tions: that is, bulk transfer (e.g., FileTransfer Protocol (FTP) and SimpleMail Transfer Protocol (SMTP)) and re-mote terminal traffic (e.g., telnet andrlogin). The introduction of the Web hascreated new performance challenges, as

the request/response nature of Web in-teractions is a poor match for TCP’sbyte-stream [Heidemann 1997]. Evenbefore the introduction of the Web, in-terest in request/response (transaction)protocols was reflected in the develop-ment of VMTP and T/TCP.

The delay and jitter sensitivity of au-dio and video based applications re-quired transport protocols providingsomething other than reliable service.This has led to increased usage of UDP,which lacks congestion control features,resulting in increased Internet conges-tion. This has created a need for proto-cols that incorporate TCP-compatiblecongestion control, yet provide servicesappropriate for audio/video transmis-sion (e.g., maybe-loss or controlled-loss).Audio/video transmission also requiredsynchronization as enabled by RTPtimestamps.

Finally, distributed applications in-volving one-to-many, many-to-one, andmany-to-many communication have re-quired a new multicast paradigm ofcommunication. Unicast protocols suchas XTP, VMTP, TP11, and UDP havebeen extended to support multicast.

6.2 Wireless Networks

Regarding the future of the transportlayer, we note the influence of the rapidgrowth of wireless networks. Althoughtransport protocols are supposed to beindependent of the underlying network-ing technology, in practice, TCP is tai-lored to perform in networks wherelinks are tethered. As more wirelesslinks are deployed, it becomes morelikely that the path between transportsender and transport receiver will notbe fully wired. Designing a transportprotocol that performs well over a heter-ogeneous network is difficult becausewireless networks have two inherentdifferences that require, if not their ownspecific transport protocols, at least spe-cific variations of existing ones.

The first difference is the major causefor TPDU gaps at the transport receiver.In wired networks missing, TPDUs are



primarily due to network congestion asrouters discard NPDUs. In wireless net-works, gaps are most likely due to biterrors and hand-off problems. On de-tecting a gap, TCP’s transport receiverresponds by invoking congestion controland avoidance algorithms to slow thetransport sender and thereby reduce thenetwork load. For wireless networks, abetter response is to retransmit lost TP-DUs as soon as possible [Balakrishnanet al. 1996].

Recent studies have concentrated onalleviating the effects of non-congestion-related losses on TCP performance overwireless and other high-loss networks[Bakre and Badrinath 1997; Balakrish-nan et al. 1995; Yavatkar and Bhagwat1994]. These studies follow two funda-mentally different approaches. One ap-proach hides any noncongestion-relatedloss from the transport sender by mak-ing the lossy link appear as a higherquality link with reduced effectivebandwidth. As a result, the losses thatare seen by the transport sender aremostly due to congestion. Some exam-ples include: (1) reliable link layer pro-tocols (AIRMAIL [Ayanoglu et al.1995]), (2) splitting a transport connec-tion into two connections (I-TCP [Bakreand Badrinath 1997]), and (3) TCP-aware link layer schemes (snoop [Bal-akrishnan et al. 1995]).

The second approach makes thetransport sender aware that wirelesshops exist and does not invoke conges-tion control for TPDU losses. SelectiveACK proposals such as TCP SACK [Ma-this et al. 1996] and SMART [Keshavand Morgan 1997] can be consideredexamples of this approach. One studyshows that a reliable link layer protocolwith some knowledge of TCP provides10-30% higher throughput than a linklayer protocol without that knowledge[Balakrishnan et al. 1996]. Further-more, selective ACKs and explicit lossnotifications result in significant perfor-mance improvements.

The second inherent difference is thatwireless devices are constrained in boththeir computing and communication

power due to limited power supply.Therefore, the transport protocols usedon these devices should be less complexto allow efficient battery usage. Thisapproach is taken in Mobile-TCP [Haasand Agrawal 1997]. It uses the samesplitting approach used in I-TCP, how-ever, instead of identical transport enti-ties on both ends of the wireless link,Mobile-TCP employs an asymmetri-cally-based protocol design which re-duces the computation and communica-tion overhead on the mobile host’stransport entity. Because the wirelesssegment is known to be a single-hopconnection, several transport functionscan be either simplified or eliminated.

6.3 Debates

In reviewing the last two decades oftransport protocol research, we firstnote there seems to have been twoschools of thought on the direction oftransport protocol design: (1) the “hard-ware-oriented” school, and (2) the “ap-plication-oriented” school.

The hardware-oriented school feelsthat transport protocols should be sepa-rated from the operating system asmuch as possible, and implementationsshould be moved into VLSI, parallel, orspecial purpose processors [Feldmeier1993c; Haas 1990; La Porta andSchwartz 1992; Strayer and Weaver1988]. This school claims that the heavyusage of timers, interrupts, and memoryread/writes degrades the performance ofa transport protocol, and thus specialpurpose architectures are necessary.They would also point out that as net-work rates reach the Gigabit/sec range,it will be more difficult to process TP-DUs in real time. Therefore TPDU for-mats should be carefully chosen to allowparallel processing and to avoid TPDUfield limitations for sequence number-ing, window size, etc. This school ofthought is reflected in the design ofprotocols such as MSP and XTP.

By contrast, the application-orientedschool prefers moving some transportlayer functions out of the operating sys-



tem into the user application, therebyintegrating the upper layer end-to-endprocessing, and achieving a faster appli-cation processing pipeline. This schoolwould claim that the bottleneck is actu-ally in operations such as checksumsand presentation layer conversion, all ofwhich can be done more efficiently ifthey are integrated with the copying ofdata into application user space. Thisschool of thought is reflected in the con-cepts of Application Level Framing(ALF) and Integrated Layer Processing(ILP) [Clark and Tennenhouse 1990].ALF states that “the application shouldbreak the data into suitable aggregates,and the lower layers should preservethese frame boundaries as they processthe data”. ILP is an implementationconcept which “permits the implemen-tor the option of performing all (data)manipulation steps in one or two inte-grated processing loops”. Ongoing re-search shows that performance gainscan be obtained by using ALF and ILP[Ahlgren et al. 1996; Braun 1995; Braunand Diot 1995; 1996; Diot and Gagnon1999; Diot et al. 1995]. ALF conceptsare reflected in the design of protocolssuch as RTP, POC, and TRUMP.

A second debate, orthogonal to the de-bate about where transport servicesshould be implemented, is the debateabout functionality. Older transport ser-vices tend to focus on a single type ofservice. Some recent protocol designersadvocate high degrees of flexibility, or-thogonality, and user-configurability tomeet the requirements of various applica-tions. These designers feel that: (1) a sin-gle transport protocol offer different com-munication paradigms, and differentlevels of reliability and flow control, andthe transport user should be able tochoose among them, and (2) protocol de-sign should include multiple functions tosupport multicast, real-time data, syn-chronization, security, user-defined QoS,etc., that will meet the requirements oftoday’s more complex user applications.This is reflected in the designs of XTP,MSP and POC, for example.

A third debate concerns protocol opti-

mization. The lightweight protocolschool of thought argues that sincehigh-speed networks offer extremelylow error rates, protocols should be“success-oriented” and optimized formaximum throughput [Doeringer et al.1990; Haas 1990; La Porta andSchwartz 1992]. Others emphasize theincreased deployment of wireless net-works, which are more prone to bit er-rors; they argue that protocols should bedesigned to operate in both environ-ments efficiently. The SMART retrans-mission strategy [Keshav and Morgan1997] is a good example of the latterapproach.

6.4 Final Observations

Several of the experimental protocols sur-veyed in this paper present beneficialideas for the overall improvement oftransport protocols. However, most ofthese schemes are not widely used. Thismay have more to do with the difficulty ofintroducing new transport protocols thanwith the merits of the ideas themselves.

In the 1980’s, the primary battle wasbetween TCP and ISO TP0/4—or, morebroadly, between the Internet (TCP/IP)suite and OSI suite in general. Never-ending hallway discussions debatedwhich would win. Today the victor isTCP/IP, due to a combination of “tim-ing”, “technology”, “implementations,”and “politics” [Tanenbaum 1996]. Some-times the usefulness of transport proto-col research is questioned—even re-search to improve TCP—because themarket base of the current TCP versionis so large that inertia prevents newgood ideas from propagating into theuser community. However, the success-ful incorporation of SACK [Mathis1996], TCP over high-speed networks[Borman 1992], and T/TCP [Braden1994] into implementations clearlyshows that transport research is useful.What should be clear is that for anynew idea to have influence in practice inthe short term, it must be interoperablewith elements of the existing TCP/IPprotocol suite.



APPENDIX

Table II. (a) General features of TCP, UDP, TP4, and TP0

General Features TCP (Internet) UDP (Internet) TP4 (OSI) TP0 (OSI)

Well suited for Stream transportover unreliablenetworks

Applicationswhere normalcase is operationover LAN withlow loss rates(e.g., NFS, SNMP)

Message-orientedtransport overunreliablenetworks

Message-orientedtransport overreliable networks

Developed for General purposereliable datacommunicationover unreliablenetworks

General purposedatagramtransport withminimum ofprotocolmechanism

General purposereliable datacommunicationover unreliablenetworks

Minimumfunction protocolover reliable COnetworks

Features for specialapplications

Nagle algorithm1

for interactive dataNone Oriented for

teletext; OSI-TCPgateway2

Underlying service IP IP Unreliablenetwork

Reliable COnetwork (e.g., X.25)

Has beenimplemented in Software Software Software Software

1Since interactive data TPDUs often carry 1 byte of user data, overhead of 40-byte header (20 for TCP;20 for IP) adds to congestion in WANs. Therefore, Nagle Algorithm restricts a TCP connection to havingat most one outstanding unacknowledged small TPDU. No additional small TPDUs can be sent until theACK is received.2TP0 can be used to create OSI transport service on top of TCP’s reliable byte-stream service, enablingOSI apps to run over TCP/IP [Rose and Cass 1987].

Table II. (b) Service features of TCP, UDP, TP4, and TP0

Service Features TCP (Internet) UDP (Internet) TP4 (OSI) TP0 (OSI)

CO-byte vs. CO-message vs. CL CO-byte CL CO-message CO-message

ReliabilityNo-loss vs. Uncontrolled-loss vs. Controlled-loss

No-loss Uncontrolled-loss

No-loss No-loss3

No-duplicates vs. Maybe-duplicates

No-duplicates Maybe-duplicates

No-duplicates No-duplicates3

Ordered vs. Unordered vs.Partially-ordered

Ordered Unordered Ordered Ordered3

Data-integrity vs. No-data-integrity vs. Partial-data-integrity

Data-integrity Data-integrityand No-data-integrity4

Data-integrity Data-integrity3

Multicast vs Unicast Unicast Unicast5 Unicast Unicast

Priority vs. No-priority No-priority Priority Priority Priority

Security vs. No-security No-security No-security Security Security6

3Assumed to be part of the underlying network service.4UDP provides an optional checksum on the header and user data.5UDP can provide multicast service with IP support.6Protection is a QoS parameter that can be specified as protection against passive monitoring, passivemodification, replay, addition, or deletion.



Table II. (c) Protocol features of TCP, UDP, TP4, and TP0

Protocol Features TCP (Internet) UDP (Internet) TP4 (OSI) TP0 (OSI)

CO vs CL Protocol CO CL CO CO

Transaction-oriented No No1 No No

CO ProtocolIn-band vs. Out-of-bandSignaling

In-band N/A In-band In-band

Unidirectional vs.Bidirectional conn

Bidirectional N/A Bidirectional Bidirectional

Conn establishment 3-way N/A 3-way 3-way

User data in connEstablishment

Notpermitted2

N/A Permitted3 Not permitted

Conn termination 4-way N/A 3-way N/A4

AcknowledgmentsPiggybacking Yes N/A Yes N/A5

Cumulative vs. Selective Cumulative6 N/A Selective N/A5

Sender-dependent vs.Sender-independent

Sender-independent

N/A Sender-dependent andoptional Sender-independent

N/A5

Error ControlError detection Sequence no;

Checksum onheader anddata

Optionalchecksum onheader anddata

Sequence no;Length field;Optionalchecksum onheader and data

No

Error reporting No No No No

Error recovery PAR No PAR No

Flow/Congestion ControlEnd-to-end flow control Window No Window Backpressure4

Window allocation Byte-oriented N/A TPDU-oriented N/A

Rate control parameters N/A N/A N/A N/A

Congestion control Implicitaccess control

No Implicit accesscontrol

No

TPDU FormatTPDU numbering Byte-oriented No TPDU-oriented No

Min TPDU header/trailer 20-byte header 8-byte header 5-byte header 3-byte header

Multiplexing/Demultiplexing Yes Yes Yes No

Splitting/Recombining No No Yes No

Concatenation/Separation Yes7 No Yes No

Blocking/Deblocking Yes8 No Yes Yes

Segmentation/Reassembly Yes No Yes Yes1But frequently used for transaction based applications.2While data may be sent in TCP connection opening TPDU, it cannot be delivered to user receiver until3-way-handshake is complete.3Not .32 octets. This data (e.g., password) may help decide if connection should be established, therebyallowing estab to depend on user data.4Assumed to be part of the underlying network service.5There are no ACKs in TP0.6A version of TCP which combines selective ACKs and cumulative ACKs has been specified [Mathis etal. 1996].7Data and ACK TPDU is concatenated in the case of piggybacking.8However, boundaries are not preserved between transport sender and transport receiver.



Table III. (a) General features of NETBLT, VMTP, T/TCP, and RTP

General Features NETBLT (Internet) VMTP (Internet) T/TCP (Internet) RTP (Internet)

Well suited for Long-delay paths Transactions Transactions Real-time data

Developed for Bulk datatransfer

RemoteProcedure Calls,multicast, real-timecommunication

Improving TCPperformance fortransactionprocessing

Real-time, multi-participantmultimediaconferences


User controlledwindow-basedflow control

Namingmechanism forprocessmigration, mobilehosts;Conditionalmessage deliveryfor real-timeapplication;Optimized forpage-levelnetwork fileaccess; Callforwarding

3-way can bebypassed;Shorter delay inTIME-WAITstate

Integrated layerprocessing; QoSfeedback; Can betailored tospecificapplications;Mixing,translating viaSSRC fields

Underlying service IP Datagramnetwork

IP UDP,1 AALS

Has beenimplemented in Software Software Software Software

1RTP may be used with other suitable underlying network or transport protocols.

Table III. (b) Service features of NETBLT, VMTP, T/TCP, and RTP

Service Features NETBLT(Internet) VMTP (Internet) T/TCP (Internet) RTP (Internet)

CO-byte vs. CO-message vs. CL CO-message2 CL CO-byte andCL3

CL

ReliabilityNo-loss vs. Uncontrolled-loss vs. Controlled-loss

No-loss No-loss No-loss Uncontrolled-loss


No-duplicates No-duplicates No-duplicates No-duplicates

Ordered vs. Unordered vs.Partially-ordered

Ordered Ordered Ordered Unordered


Data-integrity Data-integrity Data-integrity No-data-integrity

Multicast and Unicast Unicast Multicast andUnicast

Unicast Unicast4

Priority vs. No-priority No-priority Priority No-priority No-interrupt/Priority

Security vs. No-security No-security Security5 No-security Security2In this case message is a “buffer” as explained in section 5.4.3T/TCP provides CO or CL service. For CO service, transport user has to issue the open, send, and closeprimitives as in TCP. For CL service, only the sendto primitive is required.4RTP can provide multicast service if available in lower layer protocols.5VMTP’s security feature was designed but never implemented.



Table III. (c) Protocol features of NETBLT, VMTP, T/TCP, and RTP

Protocol Features NETBLT (Internet) VMTP (Internet) T/TCP (Internet) RTP (Internet)

CO vs CL protocol CO CO CO CL1

Transaction-oriented No Yes Yes No

CO protocolIn-band vs. Out-of-bandsignaling

In-band In-band In-band N/A


Unidirectional Unidirectional Bidirectional N/A

Conn establishment 2-way Implicit 3-way andImplicit

N/A

User data in connEstablishment

Not permitted Permitted Permitted N/A

Conn termination 2-way Implicit 4-way N/A

AcknowledgmentsPiggybacking N/A No2 Yes N/A

Cumulative vs. Selective Cumulative Selective Cumulative N/A


Sender-dependent

Sender-dependent

Sender-dependent

N/A

Error ControlError detection Sequence No;

Length field;Headerchecksum;Optional datachecksum

Transaction ID;Length field;Checksum onheader anddata

Sequence No;Checksum onheader anddata

Sequence No

Error reporting Selective reject Selective reject No Specialreports3

Error recovery ARQ withselectiveretransmission

ARQ withselectiveretransmission

PAR No

Flow/Congestion ControlEnd-to-end flow control Window and

RateRate Window No

Window scheme Buffer-oriented N/A Byte-oriented N/A

Rate control parameters Burst size andrate

Interpacketdelay

N/A N/A

Congestion control No Implicit accesscontrol

Implicit accesscontrol

No4

TPDU FormatTPDU numbering TPDU-oriented TPDU-oriented Byte-oriented TPDU-oriented

Min TPDUheader/trailer

24-byte header 64-byte header 24-byte header 12-byte header

Multiplexing/Demultiplexing Yes Yes Yes Yes5

Splitting/Recombining No No No No

Concatenation/Separation No No No No

Blocking/Deblocking No No Yes No

Segmentation/Reassembly Yes Yes Yes No1RTP has no notion of a connection; however RTCP provides out-of-band signaling.2Normally a response implicitly acks request; each new request implicitly acks last response rec’d. ThusVMTP provides implicit but not explicit piggybacked ACK info.3Sender and Receiver reports are exchanged via RTCP, a control protocol that monitors data deliveryand provides minimal control and id functions.4Application may perform congestion control by using the QoS feedback provided by the RTCP controlmessages.5Visa synchronization source identifier (SSRC) field.



Table IV. (a) General features of SNA/APPN, DECnet/NSP, XTP, and ATM/SSCOP/AAL5

General Features APPN (SNA) NSP (DECnet) XTP SSCOP/AAL5 (ATM)

Well suited for Distributedsystems

Message orientedover unreliablenetworks; dist’dsystems

Reliabletransportmulticast; Real-time datagrams;Fast connectionsetup

High speedreliable data andcontrol signaling

Developed for Diverseplatforms,topologies, andapps into singlenetwork

General purposereliable datacomm overunreliable nets

Single protocolseparates policyfrom mechanism;VLSIimplementation

ATM networks


Class of Serviceselection1

Expedited data Out-of-banddata2;Orthogonality;Selectable flowand error control

blockPACKs/NACKs

Underlying service APPN networklayer (virtualcircuit)

DECnet layer 3protocol

Any networklayer3 or directlyover LLC, MAC,ATM AALs

ATM

Has beenimplemented in Software Software4

Software andHardware

1COS-based routing is provided by APPN network layer. APPN transport takes advantage by usingdifferent service classes.2Useful for passing control info about state of transport user processes, or passing semantic info aboutdata, or for event sequencing via timestamps.3Most of all XTP implementations operate over IP.4XTP was designed for but never implemented in VLSI.

Table IV. (b) Service features of SNA/APPN, DECnet/NSP, XTP, and ATM/SSCOP/AAL5

Service Features APPN (SNA) NSP (DECnet) XTP SSCOP/AAL5 (ATM)

CO-byte vs. CO-message vs.CL

CO-message CO-message CL and CO-byte

CO-message

ReliabilityNo-loss vs.Uncontrolled-loss vs. Controlled-loss

No-loss5 No-loss No-loss andUncontrolled-loss

No-loss


No-duplicates5 No-duplicates No-duplicatesandDuplicates

No-duplicates

Ordered vs. Unorderedvs. Partially-ordered

Ordered5 Ordered Ordered andUnordered

Ordered


Data-integrity5

No-data-integrity

Data-integrityand No-data-integrity

Data-integrity6

Multicast and Unicast Unicast Unicast Multicast andUnicast

Unicast

Priority vs No-priority Priority Priority Priority Priority7

Security vs No-security Security No-security No-security No-security5Provided by the underlying network.6Assumed to be provided by underlying service. SSCOP itself has no checksum.7I.365.3 allows optionally expedited data using SSCOP unassured data delivery.



Table IV. (c) Protocol features of SNA/APPN, DECnet/NSP, XTP, and ATM/SSCOP/AAL5

Protocol Features APPN (SNA) NSP (DECnet) XTP SSCOP/AAL5 (ATM)

CO vs. CL protocol CO CO CO COTransaction-oriented Yes No Yes NoCO protocol

In-band vs. Out-of-bandsignaling

Out-of-band In-band In-band N/A1


Uni andBidirectional

Bidirectional Bidirectional Bidirectional

Conn establishment 2-way Implicit 2-wayUser data in connestablishment

No Yes Permitted Yes2

Conn termination 2-way 4-way3 2-wayAcknowledgments

Piggybacking N/A4 Yes No NoCumulative vs Selective N/A4 Both Selective Both5


N/A4 Sender-independent

Sender-independent6

Error controlError detection No7 Sequence No Sequence No;

Length field;Headerchecksum;Optionaldatachecksum

Sequence No;Checksum onheader and data

Error reporting No7 Selectivereject

Selective reject8

Error recovery No7 PAR ARQ withGo-Back-N(optional selretrans)

ARQ withselective repeat

Flow/Congestion controlEnd-to-end flow control N/A9 Window Window and

RateWindow-credit

Window scheme N/A9 TPDU(segment)-oriented

Byte-oriented

TPDU-oriented

Rate control parameters N/A9 N/A Burst sizeand rate

N/A

Congestion control N/A9 Explicitaccess control

Explicitaccesscontrol

No10

TPDU formatTPDU numbering TPDU-oriented TPDU-oriented Byte-oriented TPDU-orientedMin TPDU header/trailer 9-byte header 32-byte header

Multiplexing/Demultiplexing No No Yes NoSplitting/Recombining No No No NoConcatenation/Separation No Yes11 No NoBlocking/Deblocking No No No NoSegmentation/Reassembly Yes Yes Yes Yes1See Section 5.11.2But no guarantee of delivery.3XTP also employs 2-way handshake and abortive close modes of connection termination.4APPN does not need ACKs because it assumes a fully reliable network service.5STAT is full map of missing/ACKed PDUs.6Sender periodically polls receiver.7Responsibility of the underlying network.8Transport receiver sends one USTAT (NACK) which provides selective reject info for every missing PDU.9APPN does not perform flow/congestion control at transport layer; relies on network layer’s “adaptive pacing”.10Local flow control backoff is discussed, but not specified/standardized.11Data and ACK TPDU is concatenated in the case of piggybacking.



ACKNOWLEDGMENTS

The authors wish to thank several indi-viduals: R. Marasli and E. Golden whocontributed to an early draft of thispaper; a number of protocol experts whoreviewed individual service/protocolsummaries and their respective tableentries, including A. Agrawala, A. Bhar-gava, R. Case, L. Chapin, B. Dempsey,D. Feldmeier, T. La Porta, D. Piscitello,K. Sabnani, H. Schulzrinne, D. Sanghi,T. Strayer, E. Tremblay, R. Watson, A.Weaver, C. Williamson; M. Taube andthe anonymous reviewers who madevaluable suggestions.

REFERENCES

AHLGREN, B., BJORKMAN, M., AND GUNNINGBERG,P. 1996. Integrated layer processing can behazardous to your performance. In Proceed-ings of the Conference on Protocols for High-Speed Networks, V (France, Oct.), W. Dabbousand C. Diot, Eds. IFIP, 167–181.

AHN, J. S., DANZIG, P. B., LIU, Z., AND YAN,L. 1995. Evaluation of TCP Vegas: Emula-tion and experiment. SIGCOMM Comput.Commun. Rev. 25, 4 (Oct.), 185–205.

AMER, P. D., CHASSOT, C., CONNOLLY, T. J., DIAZ,M., AND CONRAD, P. 1994. Partial-ordertransport service for multimedia and otherapplications. IEEE/ACM Trans. Netw. 2, 5(Oct. 1994), 440–456.

AMER, P., CONRAD, P., GOLDEN, E., IREN, S., ANDCARO, A. 1997. Partially ordered, partiallyreliable transport service for multimedia ap-plications. In Proceedings of the Conferenceon Advanced Telecommunications/Informa-tion Distribution Research Program (CollegePark, MD, Jan.).

AMER, P. D., IREN, S., SEZEN, G., CONRAD, P.,TAUBE, M., AND CARO, A. 1999. Network-conscious GIF image transmission. Comput.Netw. J. 31, 7 (Apr.), 693–708.

ARMSTRONG, S., FREIER, A., AND MARZULLO,K. 1992. Multicast transport protocol.RFC 1301.

AYANOGLU, E., PAUL, S., LAPORTA, T. F., SABNANI,K. K., AND GITLIN, R. D. 1995. AIRMAIL: alink-layer protocol for wireless networks.Wireless Networks 1, 1 (Feb. 1995), 47–60.

BAE, J., SUDA, T., AND WATANABE, N. 1991.Evaluation of the effects of protocol process-ing overhead in error recovery schemes for ahigh-speed packet switched network: link-by-link versus edge-to-edge schemes. IEEE J.Sel. Areas Commun. 9 (Dec.), 1496–1509.

BAKRE, A. V. AND BADRINATH, B. R. 1997.Implementation and performance evaluation

of indirect TCP. IEEE Trans. Comput. 46, 3,260–278.

BALAKRISHNAN, H., PADMANABHAN, V. N., SESHAN,S., AND KATZ, R. H. 1996. A comparison ofmechanisms for improving TCP performanceover wireless links. SIGCOMM Comput.Commun. Rev. 26, 4, 256–269.

BALAKRISHNAN, H., SESHAN, S., AND KATZ, R.H. 1995. Improving reliable transport andhandoff performance in cellular wireless net-works. Wireless Networks 1, 4, 469–481.

BERNERS-LEE, T., FIELDING, R., AND FRYSTYK,H. 1996. Hypertext transfer protocol--HTTP/1.0. RFC 1945.

BERTSEKAS, D. AND GALLAGER, R. 1992. DataNetworks. 2nd ed. Prentice-Hall, Inc., Up-per Saddle River, NJ.

BHARGAVA, A., KUROSE, J. F., TOWSLEY, D., AND

VANLEEMPUT, G. 1991. Performance com-parison of error control schemes in high-speedcomputer communication networks. InBroadband Switching: Architectures, Proto-cols, Design, and Analysis, C. Chas, V. K.Konangi, and M. Sreetharan, Eds. IEEEComputer Society Press, Los Alamitos, CA,416–426.

BIERSACK, E., COTTON, C., FELDMEIER, D., MCAU-LEY, A., AND SINCOSKIE, W. 1992. Gigabitnetworking research at Bellcore. IEEE Net-work 6, 2 (Mar.), 42–48.

BLACK, U. 1995. ATM Foundation for Broad-band Networks. Prentice-Hall series in ad-vanced communications technologies. Pren-tice-Hall, Inc., Upper Saddle River, NJ.

BORMAN, D., BRADEN, B., AND JACOBSON, V.1992. TCP extensions for high perfor-mance. RFC 1323.

BORMANN, C., OTT, J., GEHRCKE, H., AND KERSCHAT,T. 1994. MTP-2: towards achieving thes.e.r.o. properties for multicast transport. InProceedings of the International Conference onComputer Communications Networks (SanFrancisco, CA, Sept.),

BRADEN, R. 1992a. Extending TCP for transac-tions -- Concepts. RFC 1379.

BRADEN, R. 1992b. TIME-WAIT assassinationhazards in TCP. RFC 1337.

BRADEN, R. 1994. T/TCP -- TCP extensions fortransactions functional specification. RFC1644.

BRADEN, R. AND JACOBSON, V. 1988. TCP exten-sions for long-delay paths. RFC 1072.

BRADEN, R., ZHANG, L., BERSON, S., HERZOG, S.,AND JAMIN, S. 1997. Resource reservationprotocol (RSVP) -- Version 1 functional speci-fication. RFC 2205.

BRAKMO, L. S., O’MALLEY, S. W., AND PETERSON, L.L. 1994. TCP Vegas: New techniques forcongestion detection and avoidance. SIG-COMM Comput. Commun. Rev. 24, 4 (Oct.),24–35.

BRAUDES, R. AND ZABELE, S. 1993. Require-ments for multicast protocols. RFC 1458.



BRAUN, T. 1995. Limitations and implementa-tion experiences of integrated layer processing.In Proceedings of the Conference on GISI,Springer-Verlag, New York, 149–156.

BRAUN, T. AND DIOT, C. 1995. Protocol imple-mentation using integrated layer processing.SIGCOMM Comput. Commun. Rev. 25, 4(Oct.), 151–161.

BRAUN, T. AND DIOT, C. 1996. Automated codegeneration for integrated layer processing.In Proceedings of the Conference on Protocolsfor High-Speed Networks, V (France, Oct.), W.Dabbous and C. Diot, Eds. IFIP, 182–197.

BROWN, G. M., GOUDA, M. G., AND MILLER, R. E.1989. Block acknowledgement: redesigningthe window protocol. SIGCOMM Comput.Commun. Rev. 19, 4 (Sept.), 128–135.

CHERITON, D. 1988. VMTP: Versatile messagetransaction protocol: Protocol specification.RFC 1045.

CHERITON, D. AND WILLIAMSON, C 1989. VMTPas the transport layer for high performancedistributed systems. IEEE Commun. Mag.27, 6 (June), 37–44.

CHIONG, J. 1996. SNA Interconnections: Bridg-ing and Routing SNA in Hierarchical, Peer,and High-Speed Networks. McGraw-Hill se-ries on computer communications. McGraw-Hill, Inc., New York, NY.

CLARK, D. 1982. Window and acknowlegementstrategy in TCP. RFC 813.

CLARK, D., LAMBERT, M., AND ZHANG, L.1987. NETBLT: A bulk data transfer proto-col. RFC 998.

CLARK, D. D. AND TENNENHOUSE, D. L. 1990.Architectural considerations for a new gener-ation of protocols. SIGCOMM Comput. Com-mun. Rev. 20, 4 (Sept.), 200–208.

COLE, R., SHUR, D., AND VILLAMIZAR, C. 1996. IPover ATM: A framework document. RFC1932.

CONNOLLY, T., AMER, P., AND CONRAD, P. 1994.An extension to TCP: Partial order ser-vice. RFC 1693.

CONRAD, P. 2000. Partial order and partial reli-ability transport service innovations in a mul-timedia application context. (In pro-gress). Ph.D. Dissertation. University ofDelaware, Newark, DE.

CONRAD, P., GOLDEN, E., AMER, P., AND MARASLI, R.1996. A multimedia document retrieval sys-tem using partially-ordered/partially-reliabletransport service. In Proceedings of the Con-ference on Multimedia Computing and Net-working (San Jose, CA, Jan.).

DABBOUS, W. S. AND DIOT, C. 1997. High-perfor-mance protocol architecture. Comput. Netw.ISDN Syst. 29, 7, 735–744.

DEERING, S. 1989. Host extensions for IP multi-casting. RFC 1112.

DIOT, C. AND GAGNON, F. 1999. Impact of out-of-sequence processing on the performance ofdata transmission. Comput. Netw. J. 31, 5(Mar.), 475–492.

DIOT, C., HUITEMA, C., AND TURLETTI, T.1995. Multimedia applications should beadaptive. In Proceedings of the Workshop onHPCS (Mystic, CT, Aug.), IFIP.

DOERINGER, W., DYKEMAN, D., KAISERWERTH, M.,MEISTER, B., RUDIN, H., AND WILLIAMSON, R.1990. A survey of lightweight transport pro-tocols for high speed networks. IEEE Trans.Commun. 38, 11 (Nov.), 2025–2039.

DORLING, B., LENHARD, P., LENNON, P., AND USKOK-OVIC, V. 1997. Inside APPN and HPR: TheEssential Guide to the New SNA. Prentice-Hall, New York, NY.

DOSHI, B., JOHRI, P., NETRAVALI, A., AND SABNANI,K. 1993. Error and flow control perfor-mance of a high speed protocol. IEEE Trans.Commun. 41, 5 (May).

DUPUY, S., TAWBI, W., AND HORLAIT, E.1992. Protocols for high-speed multimediacommunications networks. Comput. Com-mun. 15, 6 (July/Aug. 1992), 349–358.

FALL, K. AND FLOYD, S. 1996. Simulation-basedcomparisons of Tahoe, Reno and SACK TCP.SIGCOMM Comput. Commun. Rev. 26, 3,5–21.

FELDMEIER, C. C. 1990. Multiplexing issues incommunication system design. SIGCOMMComput. Commun. Rev. 20, 4 (Sept.), 209–219.

FELDMEIER, D. 1993a. A framework of architec-tural concepts for high-speed communicationssystems. IEEE J. Sel. Areas Commun. 11, 4(May), 480–488.

FELDMEIER, D. 1993b. An overview of theTP11 transport protocol project. In HighPerformance Networks--Frontiers and Experi-ence, A. Tantawy, Ed. Kluwer AcademicPublishers, Hingham, MA, 157–176.

FELDMEIER, D. 1993c. A survey of high perfor-mance protocol implementation techniques.In High Performance Networks---Technologyand Protocols, A. Tantawy, Ed. Kluwer Ac-ademic Publishers, Hingham, MA, 29–50.

FELDMEIER, D. AND BIERSACK, E. 1990.Comparison of error control protocols for highbandwidth-delay product networks. In Pro-ceedings of the Conference on Protocols forHigh-Speed Networks, II (Palo Alto, CA,Nov.), M. Johnson, Ed. North-Holland Pub-lishing Co., Amsterdam, The Netherlands,271–295.

FLOYD, S. 1996. Issues of TCP with SACK.Tech. Rep.. Lawrence Livermore NationalLaboratory, Livermore, CA.

FLOYD, S. AND JACOBSON, V. 1993. Randomearly detection gateways for congestion avoid-ance. IEEE/ACM Trans. Netw. 1, 4 (Aug.1993), 397–413.

FLOYD, S., JACOBSON, V., MCCANNE, S., LIU, C.-G.,AND ZHANG, L. 1995. A reliable multicastframework for light-weight sessions and ap-plication level framing. SIGCOMM Comput.Commun. Rev. 25, 4 (Oct.), 342–356.



FREEMAN, R. L. 1995. Practical Data Communi-cations. Wiley series in telecommunicationsand signal processing. Wiley-Interscience,New York, NY.

GOLDEN, E. 1997. TRUMP: Timed-reliabilityunordered message protocol. Master’s The-sis. University of Delaware, Newark, DE.

HAAS, Z. 1999. A communication architecturefor high-speed networking. In Proceedings ofthe Conference on IEEE INFOCOM 2000 (SanFrancisco, CA), IEEE Computer SocietyPress, Los Alamitos, CA, 433–441.

HAAS, Z. AND AGRAWAL, P. 1997. Mobile-TCP:An asymmetric transport protocol design formobile systems. In Proceedings of the IEEEConference on Computer Communications,IEEE Press, Piscataway, NJ.

HAN, R. AND MESSERSCHMITT, D. 1996.Asymptotically reliable transport of multime-dia/graphics over wireless channels. In Pro-ceedings of the Conference on MultimediaComputing and Networks (San Jose,CA), IEEE Computer Society Press, LosAlamitos, CA, 99–110.

HEIDEMANN, J. 1997. Performance interactionsbetween P-HTTP and TCP implementations.SIGCOMM Comput. Commun. Rev. 27, 2, 65–73.

HENDERSON, T. R. 1995. Design principles andperformance analysis of SSCOP: a new ATMadaptation layer protocol. SIGCOMM Com-put. Commun. Rev. 25, 2 (Apr. 1995), 47–59.

IREN, S. 1999. Network-conscious image com-pression. Ph.D. Dissertation. University ofDelaware, Newark, DE.

IREN, S. AND AMER, D. 2000. SPIHT-NC: Net-work-conscious zerotree encoding. In Pro-ceedings of the on 2000 IEEE Data Compres-sion Conference (Snowbird, Ut., Mar.), IEEEComputer Society Press, Los Alamitos, CA.

ISO. 1989. International standard 7498-2. In-formation processing systems, open systemsinterconnection, basic reference model, part 2:Security architecture. International Stan-dards Organization.

ITU-T. 1994a. Recommendation Q.2110. B-ISDNATM adaptation layer-service specific connec-tion oriented protocol (SSCOP).

ITU-T. 1994b. Recommendation X.200. Infor-mation technology, open systems interconnec-tion: Basic reference model.

ITU-T. 1995a. Recommendation I.365.3. B-ISDNATM adaptation layersublayers: Service-spe-cific coordination function to provide the con-nection-oriented transport service (SSCF-COTS).

ITU-T. 1995b. Recommendation Q.2144. B-ISDNsignalling ATM adaptationlayer (SAAL)-layermanagement for the SAAL at the networknode interface (NNI).

ITU-T. 1995c. Recommendation X.214. Infor-mation technology, open systems interconnec-tion, transport service definition.

JACOBSON, V. 1988. Congestion avoidance andcontrol. SIGCOMM Comput. Commun. Rev.18, 4 (Aug. 1988), 314–329.

JAIN, R. 1989. A delay-based approach for con-gestion avoidance in interconnected heteroge-neous computer networks. SIGCOMM Com-put. Commun. Rev. 19, 5 (Oct. 1989), 56–71.delay.ps|.

KARN, P. AND PARTRIDGE, C. 1987. Improvinground-trip time estimates in reliable trans-port protocols. SIGCOMM Comput. Com-mun. Rev. 17, 5 (Aug. 1987), 2–7.

KENT, C. A. AND MOGUL, J. C. 1987.Fragmentation considered harmful. SIG-COMM Comput. Commun. Rev. 17, 5 (Aug.1987), 390–401.

KESHAV, S. 1991. A control-theoretic approachto flow control. SIGCOMM Comput. Com-mun. Rev. 21, 4 (Sept. 1991), 3–15.

KESHAV, S. 2000. Packet-pair flow control. (Toappear). IEEE/ACM Trans. Netw. 8.

KESHAV, S. AND MORGAN, S. 1997. SMART re-transmission: Performance with overload andrandom losses. In Proceedings of on IEEEINFOCOM 1997 (Kobe City, Japan,Apr.), IEEE Computer Society Press, LosAlamitos, CA.

LA PORTA, T. F. AND SCHWARTZ, M. 1992.Architectures, features, and implementationof high-speed transport protocols. In Com-puter Communications: Architectures, Proto-cols, and Standards, W. Stallings, Ed. IEEEPress, Piscataway, NJ, 217–225.

LA PORTA, T. F. AND SCHWARTZ, M. 1993a. Themultistream protocol: A highly flexible high-speed transport protocol. IEEE J. Sel. AreasCommun. 11, 4 (May), 519–530.

LA PORTA, T. F. AND SCHWARTZ, M. 1993b.Performance analysis of MSP: feature-richhigh-speed transport protocol. IEEE/ACMTrans. Netw. 1, 6 (Dec. 1993), 740–753.

LIN, S. AND COSTELLO, D. 1982. Error ControlCoding. Prentice-Hall, New York, NY.

LISKOV, B., SHRIRA, L., AND WROCLAWSKI, J.1990. Efficient at-most-once messages basedon synchronized clocks. SIGCOMM Comput.Commun. Rev. 20, 4 (Sept.), 41–49.

LUNDY, G. M. AND TIPICI, H. A. 1994.Specification and analysis of the SNR high-speed transport protocol. IEEE/ACM Trans.Netw. 2, 5 (Oct. 1994), 483–496.

MARASLI, R. 1997. Performance analysis of par-tially ordered and partially reliable transportservices. Ph.D. Dissertation. University ofDelaware, Newark, DE.

MARASLI, R., AMER, P., AND CONRAD, P. 1996.Retransmission-based partially reliable ser-vices: An analytic model. In Proceedings ofon IEEE INFOCOM 1996 (San Francisco, Ca-lif., Mar.), IEEE Computer Society Press,Los Alamitos, CA.

MARASLI, R., AMER, P. D., AND CONRAD, P. T.1997. An analytic study of partially ordered



transport services. Comput. Netw. ISDNSyst. 29, 6, 675–699.

MARTIN, J. AND LEBEN, J. 1992. DECnet PhaseV: An OSI Implementation. Prentice-Hall,Inc., Upper Saddle River, NJ.

MATHIS, M., MAHDAVI, J., FLOYD, S., AND ROMANOW,A. 1996. TCP selective acknowledgmentoptions. RFC 2018.

MCAULEY, D. 1990. Protocol design for highspeed networks. Ph.D. Dissertation. Uni-versity of Cambridge, Cambridge, UK.

MCCANNE, S., JACOBSON, V., AND VETTERLI, M.1996. Receiver-driven layered multi-cast. SIGCOMM Comput. Commun. Rev. 26,4, 117–130.

MCNAMARA, J. E. 1988. Technical Aspects ofData Communication. 3rd ed. DigitalPress, Newton, MA.

NAGLE, J. 1995. Congestion control in IP/TCPinternetworks. SIGCOMM Comput. Com-mun. Rev. 25, 1 (Jan. 1995), 61–65.

NETRAVALI, A., ROOME, W., AND SABNANI, K.1990. Design and implementation of a highspeed transport protocol. IEEE Trans. Com-mun. 38, 11, 2010–2024.

PARTRIDGE, C., HUGHES, J., AND STONE, J.1995. Performance of checksums and CRCsover real data. SIGCOMM Comput. Com-mun. Rev. 25, 4 (Oct.), 68–76.

PARULKAR, G. AND TURNER, J. 1989. Towards aframework for high speed communication in aheterogeneous networking environment. InProceedings of on IEEE INFOCOM 1989 (Ot-tawa, Ont., Canada, Apr.), IEEE ComputerSociety Press, Los Alamitos, CA, 655–667.

PISCITELLO, D. AND CHAPIN, A. 1993. Open Sys-tems Networking, TCP/IP and OSI. 1sted. Addison-Wesley, Reading, MA.

POSTEL, J. 1980. User datagram protocol.RFC 768.

POSTEL, J. 1981. Transmission control protocol.RFC 793.

PRUE, W. AND POSTEL, J. 1987. Something ahost could do with source quench: The sourcequench introduced delay (SQuID). RFC1016.

RAMAKRISHNAN, K. K. AND JAIN, R. 1988. A bi-nary feedback scheme for congestion avoid-ance in computer networks with a connection-less network layer. SIGCOMM Comput.Commun. Rev. 18, 4 (Aug. 1988), 303–313.

ROBERTSON, D. 1996. Accessing Transport Net-works: MPTN and AnyNet Solutions.McGraw-Hill series on computer communica-tions. McGraw-Hill, Inc., Hightstown, NJ.

ROSE, M. AND CASS, D. 1987. ISO transport ser-vice on top of the TCP, version: 3. RFC 1006.

SANDERS, R. AND WEAVER, A. 1990. The Xpresstransfer protocol (XTP): A tutorial. SIG-COMM Comput. Commun. Rev. 20, 5 (Oct.),67–80.

SANGHI, D. AND AGRAWALA, A. K. 1993. DTP: Anefficient transport protocol. In Proceedingsof the IFIP TC6 Working Conference on Com-

puter Networks, Architecture and Applications(NETWORKS ’92, Trivandrum, India, Oct.28–29), S. V. Raghavan, G. V. Bochmann, andG. Pujolle, Eds. Elsevier Sci. Pub. B. V.,Amsterdam, The Netherlands, 171–180.

SCHULZRINNE, H. 1996. RTP profile for audioand video conferences with minimal control.RFC 1890.

SCHULZRINNE, H., CASNER, S., FREDERICK, R., AND

JACOBSON, V. 1996. RTP: A transport pro-tocol for real-time applications. RFC 1889.

SHENKER, S., PARTRIDGE, C., AND GUERIN, R.1997. Specification of guaranteed quality ofservice. RFC 2212.

SHENKER, S. AND WROCLAWSKI, J. 1997. Generalcharacterization parameters for integratedservice network elements. RFC 2215.

SMITH, W. AND KOIFMAN, A. 1996. A distributedinteractive simulation intranet using RAMP,a reliable adaptive multicast protocol. InProceedings of the 14th Workshop on Stan-dards for the Interoperability of DistributedSimulations (Orlando, FL, Mar.).

STALLINGS, W. 1997. Data and ComputerCommunications. 5th ed. Prentice-Hall,Inc., Upper Saddle River, NJ.

STALLINGS, W. 1998. High-Speed Networks:TCP/IP and ATM Design Principles. Pren-tice-Hall, Inc., Upper Saddle River, NJ.

STEVENS, W. R. 1994. TCP/IP Illustrated: TheProtocols (Vol. 1). Addison-Wesley LongmanPubl. Co., Inc., Reading, MA.

STEVENS, W. R. 1996. TCP/IP Illustrated: TCPfor Transactions, HTTP, NNTP, and the UnixDomain Protocols (Vol. 3). Addison-WesleyProfessional Computing Series. Addison-Wesley Publishing Co., Inc., Redwood City,CA.

STEVENS, W. 1997. TCP slow start, congestionavoidance, fast retransmit, and fast recoveryalgorithms. RFC 2001.

STEVENS, W. R. 1990. UNIX Network Program-ming. Prentice-Hall, Inc., Upper SaddleRiver, NJ.

STRAYER, W. T., DEMPSEY, B. J., AND WEAVER, A. C.1992. XTP: The Xpress Transfer Protocol.Addison-Wesley Publishing Co., Inc., Red-wood City, CA.

STRAYER, T. AND WEAVER, A. 1988. Evaluationof transport protocols for real-time communi-cations. Tech. Rep. TR-88-18. Departmentof Computer Science, University of Virginia,Charlottesville, VA.

TANENBAUM, A. S. 1996. Computer Networks.3rd. ed. Prentice-Hall, Inc., Upper SaddleRiver, NJ.

THAI, K., CHASSOT, C., FDIDA, S., DIAZ, M., AND

DIAZ, M. 1994. Transport layer for coopera-tive multimedia applications. Tech. Rep.94196. Editions du CNRS, Paris, France.

WALRAND, J. 1991. Communication Networks: AFirst Course. Richard D. Irwin, Inc., BurrRidge, IL.



WATSON, R. 1989. The Delta-T transport proto-col: Features and experience. In Proceedingsof the 14th Conference on Local ComputerNetworks (Minneapolis, MN, Oct. 1989),

WEAVER, A. 1994. The Xpress transfer protocol.Comput. Commun. 17, 1 (Jan.), 46–52.

WILLIAMSON, C. AND CHERITON, D. 1989. Anoverview of the VMTP transport protocol. InProceedings of the 14th Conference on LocalComputer Networks (Minneapolis, MN, Oct.1989),

WROCLAWSKI, J. 1997. Specification of the con-trolled-load network element service. RFC2211.

YANG, C. AND REDDY, A. 1995. A taxonomy forcongestion control algorithms in packetswitching networks. IEEE Network, 42–48.

YAVATKAR, R. AND BHAGWAT, N. 1994.Improving end-to-end performance of TCPover mobile internetworks. In Proceedings ofthe Workshop on Mobile Computing Systemsand Applications (Mobile ’94, Dec.).

ZHANG, L 1986. Why TCP timers don’t work well.In Proceedings of the ACM Conference onCommunications Architecture and Protocols(SIGCOMM ’86, Stowe, VT, Aug. 5–7), W.Kosinsky, J. J. Garcia-Luna, and F. F. Kuo,Eds. ACM Press, New York, NY, 397–405.

Received: May 1997; revised: October 1998; accepted: November 1998



Documents

The Transport Layer: Tutorial and Survey - ACM Digital Library