
Message fragmentation for a chain of disrupted links


Computer Communications xxx (2014) xxx–xxx

Journal homepage: www.elsevier.com/locate/comcom

http://dx.doi.org/10.1016/j.comcom.2014.03.015
0140-3664/© 2014 Elsevier B.V. All rights reserved.

Expanded version of a talk given at the AOC 2012 workshop in San Francisco, California, USA, on June 25, 2012.

* Corresponding author at: Huawei Technologies OY, Itämerenkatu 9, FIN-00180 Helsinki, Finland. Tel.: +358 504836224.

E-mail addresses: philip.ginzboorg@iki.fi (P. Ginzboorg), valtteri.niemi@utu.fi (V. Niemi), [email protected] (J. Ott).

¹ URL: http://www.dtnrg.org.

Please cite this article in press as: P. Ginzboorg et al., Message fragmentation for a chain of disrupted links, Comput. Commun. (2014), http://dx.doi.org/10.1016/j.comcom.2014.03.015

Philip Ginzboorg a,b,*, Valtteri Niemi c, Jörg Ott b

a Huawei Technologies OY, Itämerenkatu 9, FIN-00180 Helsinki, Finland
b Aalto University, Otakaari 5A, FIN-02150 Espoo, Finland
c University of Turku, Department of Mathematics and Statistics, FIN-20014 Turku, Finland

Article info

Article history: Available online xxxx

Keywords: Fragmentation; Channel with failures; DTN

Abstract

We investigate the problem of estimating the transmission time of fragmented messages over multiple disrupted links. We build a system model for the case where a single message is sent over a chain of links and the disruptions in these links are identically and independently distributed. For this case, we derive approximation formulas for the mean transmission time, based on the number of links, the length of fragments and the distributions of disruptions. The formulas are verified against simulation experiments in the cases of uniform and exponential distributions for disruptions.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

A challenged network is a network subject to difficult operational constraints, such as disrupted links and high delays. Potential applications of challenged networks are in areas where the infrastructure needed for good end-to-end connectivity is difficult or inconvenient to deploy. Those areas include communications in industrial environments (mines, factories, shipyards), space, military, emerging markets, and local opportunistic communication between mobile devices.

Ways of communicating in challenged environments are being developed by the research community. One of the most encompassing and well-documented efforts is the work done in the IETF Delay Tolerant Networking Research Group (DTNRG).¹ A delay-tolerant network (DTN) can be defined as a network that does not require for its operation (i) a small Round-Trip Time (RTT), or (ii) simultaneous end-to-end paths, or (iii) continuous connectivity between nodes [1]. Since communication in challenged environments can be implemented from DTNRG specifications, we will use DTN notation in this paper. Please note, however, that our results are not restricted to those challenged networks that follow the DTNRG specifications.

In a challenged network with unstable transmission links, the connection between the sender and the receiver may be cut before the entire message has been transmitted. For that reason the contact times (i.e. the times when the link between two nodes is available, or in the ON state) can be a very scarce resource. Allowing messages to be fragmented on their way to the destination may help to use these contact times better.

Delay-tolerant networks may use potentially large messages (rather than small packets) as the basic transmission unit offered to applications. Here again, sending large messages implies that they will be broken down, for the actual transmission across a physical link, into individual packets that comply with the link's Maximum Transfer Unit (MTU) size. Such a mechanism is defined, e.g., for the convergence layers of the DTN bundle protocol [2].

Since messages may be large, their transmission as a series of packets may not complete during a contact period. When a link comes up again after a down ("OFF") period (the inter-contact time), the message transmission should resume (roughly) where it stopped, rather than restart from the beginning. For this purpose, a message must be fragmented into smaller pieces ("units") whose transmission is more likely to fit into a contact period than that of the complete message.

Two types of fragmentation for DTN are defined in [2]: proactive and reactive. In the former, the source node for the link divides application data into blocks and sends each block in a separate fragment. In the latter, the data is split only when the transmission between two nodes on any link of the message path is interrupted, resulting in one fragment with the data that made it to the receiver and one containing the remainder at the sender. The fragmented data is reassembled at its destination, but an intermediate node can also reassemble fragments into a new, bigger piece.

We shall assume that the sender does message quantization, i.e. it prepares the message for fragmentation by dividing it into blocks of size $f$, called the "fragmentation unit", and this is done before transmission. The message may be fragmented on its way to the destination only along the borders defined by $f$. This is motivated by security and efficiency considerations. Firstly, since a contact between two nodes may abruptly end, the sender (be it the originating or an intermediate node) must decide on the fragmentation borders and add the appropriate message authentication codes (MACs) before transmitting the message. Secondly, marshaling and assembly of the message pieces at the destination is easier with fixed $f$ [3].

² In DTNs, paths of successfully delivered messages are often short because the network diameter is naturally constrained (as, e.g., in deep space networks), or messages do not travel very far in terms of distance and hops (as in mobile opportunistic networks). Therefore, we consider small values of $n$, say, less than 10, to be more interesting than large ones. But we do not exclude larger numbers of links in what follows.

³ Please note that a sequence of ON/OFF epochs having different average link speeds during ON epochs can be transformed into a sequence having constant link speed that still retains the same durations of OFF–ON epoch pairs as the original sequence. The details of this transformation are described in [8].

Message quantization leads to the question "How does the transmission time of a message depend on $f$?" The simplest way to answer this question in an actual network is by trial and error. For example, in IP networks the maximum size of an IP packet that can be transmitted without fragmentation is typically determined by the path probing technique of RFC 1981 [4]. But it is hard to apply this technique in a challenged network where a simultaneous end-to-end path from source to destination is unlikely. Another, complementary way to answer this question is to estimate the dependency from the known network conditions. This is the kind of answer that we investigate in this paper.

Recent work [5–8] has investigated the transmission of fragmented messages over a single disrupted link, modeling packet or file transmission over a wireless link as well as single-hop forwarding of DTN messages. Scenarios where a message is delivered over multiple links have not received attention so far.

In this paper, we address the case of message fragmentation over a chain of $n$ disrupted links. This case occurs, e.g., in a static multi-hop wireless network, where link disruptions can be due to interference. We want to estimate the transmission time of a single fragmented message over an empty chain of disrupted links. The message may be rather long. The reason we chose to study this scenario is that it seems to capture one essential aspect of what happens in DTN.

We define a basic model for message transmission over $n$ links in Section 2. The disruptions of communication links in the chain are characterized by i.i.d. ON/OFF periods; the chain is homogeneous in space and time and its links work independently of each other. While the homogeneous chain with identical distributions of disruptions is an interesting and mathematically tractable model, no actual practical setup follows all of our assumptions exactly. Nevertheless, the model is useful for understanding these practical setups.

In Section 3 we first identify the natural lower and upper bounds on the mean transmission time over $n$ links. Then we derive a generic approximation formula for the mean transmission time. Estimates of the queue sizes in intermediate nodes are needed to compute this formula. In Section 4 we show how to compute these estimates in the cases of uniform and exponential distributions for disruptions. Using these results we can estimate the mean transmission times of fragmented messages in those cases. We stress that while we are using these kinds of disruptions to test our formulas, our methods are not restricted to disruptions having exponential or uniform distributions: when the distribution of disruptions has finite mean and variance, the mean transmission times of fragmented messages can be estimated using our methods.

We use relatively simple tools (e.g., one-dimensional random walk and one-step analysis), and try to get formulas that are as simple as possible. It is possible that even better approximations could be achieved with more refined tools from queueing theory where queues are connected in tandem. Note, however, that queueing theory results are typically about the steady state (long-term behavior) of the system; e.g., the usage of Palm calculus is based on this assumption. Results of this kind are not applicable in our case because we are investigating transient behavior. In some sense, the only steady state of our system is the trivial case of the initially empty chain.

To confirm our analysis we have computed the relative error between mean transmission times estimated with our formulas and the actual transmission times in a simulated environment, where messages are transmitted according to our model over ten disrupted links. From those experiments we conclude that our approximation is suitable for large message sizes that are at least a few times bigger than what can typically be transmitted within a single contact time; the (relative) accuracy of our estimates increases with the message size, and decreases as we move farther from the source node along the chain.

In Section 5 we derive an alternative approximation method for transmission time that works well for small messages containing only a few fragments. The (relative) accuracy of that approximation increases as we move farther from the source node along the chain.

Still in the same Section 5 we derive a recursive lower bound formula for transmission time that works best for small messages divided into many tiny fragments. The tightness of the lower bound decreases when we increase the number of links or the fragmentation unit size.

It can be argued that very tiny messages need not be fragmented at all; if the whole message typically fits into a single contact time there is no point in dividing it into pieces. We have a simple approximation formula in Section 3 for the transmission time in this case as well.

We discuss our results in Section 6 and conclude in Section 7. Appendix A contains the justification for the inequality (7) of Section 3.

For ease of reference, we summarize methods for estimating the mean transmission time over a chain of disrupted links in Table 3.

2. System model

The model used to obtain the analytical results is as follows. Network node A sends messages over a chain of $n$ communication links to node B. Nodes are numbered $0, 1, \ldots, n$; node 0 is the sender A and node $n$ is the receiver B.² The links change their state between ON and OFF independently of each other in a random manner. This arrangement is illustrated in Fig. 1.

The link speed during the ON state is constant (and the same) for all links.³ We divide all message sizes by the (constant) link speed, measuring message sizes in seconds. The message size is denoted by $x$. For example, when a link is continuously in the ON state, a message of length $x$ would be transferred over the link in $x$ seconds. The electromagnetic signal's propagation times and the time it takes to acknowledge transmission over one link are neglected (taken to be zero) in our model. We also assume that nobody else (except node A) is sending messages over the chain of communication links. Furthermore, we assume that the link state durations have finite mean and variance.

The sending node A can choose to transmit the message in a single unit, thus requiring sufficiently long contact durations for the whole message to fit. Alternatively, A may split the message into blocks of size $f$, thus allowing transmission of message fragments consisting of one or more such (equal-sized) blocks during shorter contacts. $f$ may vary between messages, and we call it the "message fragmentation unit". Typical values of $f$ could be $10^{-3}$, $10^{-2}$, $1$, all measured in seconds.

Fig. 1. Schematic illustration of a chain of three disrupted links between the sender A (node 0) and the receiver B (node 3). $T_1(x)$, $T_2(x)$, and $T_3(x)$ are the mean transmission times of a message having size $x$ over one, two, and three links, respectively. $Q_1(x, t)$ and $Q_2(x, t)$ model the average amount of data queued in the intermediate nodes 1 and 2 at time $t$.

We assume, for simplicity, that the message size $x$ is an integral multiple of the fragmentation unit $f$.

If the $k$th link in the chain is disrupted (fails) during transmission, node $k-1$ will attempt to retransmit the remaining message fragmentation units during the next ON epoch.

While the remainder of the message fragmentation units is still being transmitted over the first link, its head part, consisting of fragmentation units already received in the first intermediate node, may be transmitted in parallel over the subsequent links. The same holds for all other links, but note that data bits inside the same fragmentation unit can be transmitted over only one link at a time; a node starts the transmission of a block only if it has received all of that block.

We assume that the message transmission's starting time $t = 0$ is a random point in the sequence of ON/OFF periods. We denote time by $t$, and by $t_k(x)$ the moment at which the transmission of a message with size $x$ over the $k$th link completes. We also agree that $t_0(x) = 0$.

The mean of $t_k(x)$ is denoted by $T_k(x)$. When talking about the mean transmission time over a single link, we often omit the subscript and write $T(x)$ rather than $T_1(x)$.

The distributions of the link state durations (ON and OFF) are (i) the same in all links, and (ii) independent of time. Relaxing these assumptions to generalize our model is left for future study.

We denote by $q_k(x, t)$ the amount of data queued in node $k$ at time $t$ during transmission of a message of size $x$. The mean of that random process at time $t$ is $Q_k(x, t)$. The limit of $q_k(x, t)$ when $x \to \infty$ and $k \ge 1$ describes the saturated input case, and is denoted by $q_k(t)$. It can be thought of as the $k$th queue size at time $t$ when the message size $x$ is very large, much larger than any message we may wish to send over a chain of links. The mean of that random process at time $t$ is $Q_k(t)$; we illustrate the difference between $Q_1(t)$ and $Q_1(x, t)$ in Fig. 2. From the system model it is clear that $Q_k(x, t) = Q_k(t)$ whenever $x > t$, because no more than the first $t$ seconds of the message may have been transmitted over any link at time $t$. (This holds even in the limiting case where all links are in the ON state all the time.)

Fig. 2. Schematic illustration of the average queue sizes $Q_1(x, t)$ and $Q_1(t)$.

3. Estimating the mean transmission time $T_n(x)$

Consider the mean transmission time $T_n$ over $n$ links.

It follows from our system model that the mean transmission time of a message over $k$ links is less than or equal to the sum of the mean transmission time $T_{k-1}(x)$ of all message parts over the first $k-1$ links and the mean transmission time $T(x)$ (again, of all message parts) over the last, $k$th, link:

$$T_k(x) \le T_{k-1}(x) + T(x), \tag{1}$$

where $k = 1, 2, \ldots, n$.

The equality in Eq. (1) holds only if the transmission over the $k$th link cannot start until all of the message parts have been received in the $(k-1)$th node. This limiting case appears, for example, in the transmission of a totally unfragmented message; then this kind of dependency exists between all adjacent links, and $T_n(x)$ is the sum of the $T(x)$ values over individual links. Tiny messages that are much shorter than the expected length of one ON epoch are examples of messages that would typically be sent unfragmented.

This limiting case gives us the natural upper bound on the mean transmission time $T_n(x)$.

The trivial lower bound on $T_n(x)$ is given by another limiting case, where the $n$ links are fully synchronous, i.e. they go ON and OFF simultaneously. Here the mean transmission time over $n$ synchronous links equals that over a single link⁴:

$$T_n(x) = T(x). \tag{2}$$

Combining upper and lower bounds we have:

$$T(x) \le T_n(x) \le n\,T(x). \tag{3}$$

The upper bound in Eq. (3) implies that the growth of the mean transmission time with the number of links $n$ is at most linear.

In the rest of this section we derive an approximate formula for the mean transmission time in other cases (i.e. "in between" those bounds).

The transmission time $t_k(x)$ over $k$ links is the time $t_{k-1}(x)$ it takes to transmit the whole message over the first $k-1$ links, plus the time it takes to transmit the remaining data over the last, $k$th, link. Recall that $q_k(x, t)$ is the amount of data queued in node $k$ at time $t$ during transmission of a message of size $x$. The size of the remaining data in the $(k-1)$th node at the time $t_{k-1}(x)$ is $q_{k-1}(x, t_{k-1}(x))$, and therefore:

$$t_k(x) = t_{k-1}(x) + t\bigl(q_{k-1}(x, t_{k-1}(x))\bigr), \quad k = 2, 3, \ldots, n. \tag{4}$$

In our approximation of $T_n(x)$ we take $k = n$ and replace all variables with the respective means. The result is

$$T_n(x) \approx T_{n-1}(x) + T\bigl(Q_{n-1}(x, T_{n-1}(x))\bigr). \tag{5}$$

We then replace the mean queue size $Q_{n-1}(x, T_{n-1}(x))$ in (5) with the mean queue size $Q_{n-1}(T_{n-1}(x))$ in the case of saturated input, because the latter is easier to estimate. The result is a recursive formula that, together with the combined bounds of Eq. (3), characterizes $T_n(x)$:

$$T_n(x) \approx T_{n-1}(x) + T\bigl(Q_{n-1}(T_{n-1}(x))\bigr). \tag{6}$$

⁴ Note that the lower bound could be tightened, because our system model assumes that fragmentation units cannot be forwarded by an intermediate node before they are entirely received. This causes an additional systematic per-hop delay that is a function of the chosen fragmentation unit size $f$.

Intuitively, we would expect our estimate to be at least as high as the actual $T_n(x)$, because we are replacing $Q_{n-1}(x, T_{n-1}(x))$ with the potentially larger $Q_{n-1}(T_{n-1}(x))$.⁵ But what is the effect of replacing random variables by their means? In Appendix A we explain why our estimate is typically at least as high as the actual $T_n(x)$:

$$T_n(x) \le T_{n-1}(x) + T\bigl(Q_{n-1}(T_{n-1}(x))\bigr). \tag{7}$$

The main assumption needed for this inequality is that the mean transmission time of a fragmented message over a single link, $T(x)$, is a linear, increasing function of $x$. This is asymptotically true for any typical distribution of ON and OFF epochs with finite mean and variance (see Eq. (V.12) in [7]). We illustrate this fact in Fig. 3.

Fig. 3. Illustration of the asymptotic linearity of the fragmented message's mean transmission time $T(x)$ over a single link when the ON and OFF periods of the link are exponentially or uniformly distributed and have a mean of one second. The fragmentation unit sizes are $f = 10^{-3}$ s and $f = 1$ s. Each dot represents the mean transmission time, in the simulated environment, of 2000 messages having size $x$. (Two panels, "Exponentially distributed ON/OFF epochs" and "Uniformly distributed ON/OFF epochs"; horizontal axis: message size $x$ [s], 20 to 100; vertical axis: mean transmission time [s], 100 to 400; series: $f = 1$ [s] and $f = 1$ [ms].)

Another assumption needed for the inequality is that the function $Q_{n-1}(t)$ is concave; we will show in Appendix A that this is also typically the case after some initial time period.

Please note that variants of Eqs. (1)–(7) also hold in the case when the ON–OFF periods' statistics are not the same in all links.

We now proceed to the question of computing estimates based on Eq. (6). To compute this recursion we need to know how to estimate (i) the mean transmission time over a single link $T(x)$ (for any message size $x$), and (ii) the mean queue size $Q_k(t)$ in intermediate nodes.

We already know how to do (i): the transmission time over a single link $T(x)$ can be estimated using the results of our previous paper [7]. Please note, however, that there is a small difference between the models of this paper and paper [7]. In the latter it is assumed that transmission always starts at the beginning of an ON epoch, whereas in the current paper this assumption is not made. This difference is significant if $x$ is small, and we discuss it further later.

⁵ Remember that the mean size $Q_{n-1}(t)$ of the saturated-input queue always grows, while the mean size $Q_{n-1}(x, t)$ of the queue with limited input starts to decline after the source runs out of data to transmit.

For the derivations below we restrict ourselves to the cases where the ON and OFF epochs are either uniformly or exponentially distributed with a mean of one second in both cases.

Assuming that the mean is exactly one second is not a real restriction. If the mean contact time were something else, then we could use that value as the unit of time, instead of counting time in seconds. With this change of units all formulas in the following would still hold. But we have to be careful; please note that we also count message sizes (and fragmentation unit sizes) in seconds. For example, if the unit of time were 5 s, then the unit of message size would be the number of bits that can be transferred during 5 s at our constant link speed.
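The change of units can be illustrated with a minimal sketch. The function name `to_model_units` and its parameters are hypothetical, introduced only for this example; it assumes, as in the model, a constant link speed and sizes measured as transmission time.

```python
def to_model_units(size_bits, link_speed_bps, mean_epoch_s=1.0):
    """Express a message (or fragmentation unit) size in model time units.

    Hypothetical helper: the size is first converted to seconds of
    transmission at the constant link speed, then divided by the mean
    ON (or OFF) epoch length, which serves as the unit of time.
    """
    size_seconds = size_bits / link_speed_bps
    return size_seconds / mean_epoch_s


# With a 5 s mean epoch, 10 Mbit at 1 Mbit/s takes 10 s, i.e. 2 time units.
print(to_model_units(10e6, 1e6, mean_epoch_s=5.0))
```

The same conversion would apply to the fragmentation unit $f$, so that $x$ and $f$ stay in the same units.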

For the limiting case where the fragmentation unit $f$ is the smallest possible, we get the simple estimate $T(x) = 2x$, because the link is in the ON state half of the time (on average) and it is possible to utilize the whole of each ON epoch for transmitting message fragments. This simple approximation ignores the fact that when transmission starts during an OFF epoch, this first OFF epoch is expected to be longer than the average length of OFF epochs. (This is due to the well-known residual life paradox [9, Section 5.2].) But especially for longer messages, the effect of this error seems to be rather small.

For the cases where the fragmentation unit $f$ is larger, only part of the ON epochs can be utilized for transmission, and consequently $T(x)$ becomes longer. For the uniform distribution we obtain a relatively good approximation by simply including in the calculation only those ON epochs that are long enough to accommodate at least one fragmentation unit $f$. For example, in the case of $f = 0.1$ s the proportion of those epochs is $(2 - 0.1)/2 = 19/20$, and $T(x) = 2x/(19/20) = (40/19)x$; for $f = 1$ s we get $T(x) = 4x$.

This kind of straightforward approximation does not work as well for the exponential distribution, but slightly more complex reasoning (see the derivation of Eq. (V.17) in [7]) shows that the mean number of ON epochs needed for transmitting $x$ in this case is approximately $\lceil x/f \rceil \left(e^{f} - 1\right)$. Multiplying this by two seconds, the mean size of a single ON–OFF cycle, we get the following approximations for the same example fragment sizes: for $f = 0.1$ s we have $T(x) = 20x(e^{0.1} - 1)$ and for $f = 1$ s we have $T(x) = 2x(e - 1)$.

Note that in all of these cases, $T(x)$ is a linear function of $x$.

In Section 4 we will go into the question (ii) of estimating $Q_k(t)$.
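The single-link approximations of this section can be written down as a short sketch. The function names are hypothetical; the uniform case assumes epochs uniform on (0, 2) s, which is what the proportion $(2 - 0.1)/2$ implies, and both cases assume a mean epoch length of 1 s and a message size that is an integer multiple of $f$.

```python
import math


def mean_time_uniform(x, f):
    """Approximate T(x) for ON/OFF epochs uniform on (0, 2) s (mean 1 s).

    Only ON epochs of length at least f are counted as usable.
    """
    if not 0 < f < 2:
        raise ValueError("f must lie strictly between 0 and 2 seconds")
    usable_share = (2.0 - f) / 2.0
    return 2.0 * x / usable_share


def mean_time_exponential(x, f):
    """Approximate T(x) for exponential ON/OFF epochs with mean 1 s."""
    units = round(x / f)  # x is assumed to be an integer multiple of f
    mean_on_epochs = units * (math.exp(f) - 1.0)
    return 2.0 * mean_on_epochs  # one ON-OFF cycle lasts 2 s on average


for f in (0.1, 1.0):
    print(f, mean_time_uniform(1.0, f), mean_time_exponential(1.0, f))
```

For $x = 1$ the printed values are the slopes of $T(x)$ for the two example fragment sizes, i.e. $40/19$ and $4$ in the uniform case and $20(e^{0.1} - 1)$ and $2(e - 1)$ in the exponential case.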

4. Estimation of mean queue sizes $Q_k(t)$

In this section we give estimates of the queue sizes $Q_k(t)$ in intermediate nodes in the cases of exponential and uniform distributions for disruptions. We begin by summarizing the results in Section 4.1, then outline the estimation method in Section 4.2, and end with the derivation details in Section 4.3.

4.1. Summary of the results

Our approximation for $Q_k(t)$ builds on the following estimate for the queue in the first intermediate node:

$$Q_1(t) \approx c\sqrt{t}, \tag{8}$$

where the constant $c$ depends on the distribution of ON and OFF epochs. We will show how to derive $c$ for various distributions and also give explicit formulas for $c$ in cases where ON and OFF epochs are distributed exponentially or uniformly.

We further have

$$Q_k(t) \approx a_k Q_1(t), \quad k = 1, 2, \ldots, n,$$

with $a_1 = 1$ and the constants $a_k$ tabulated below for $k$ up to 10:

| $k$   | 1 | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   |
|-------|---|------|------|------|------|------|------|------|------|------|
| $a_k$ | 1 | 0.63 | 0.50 | 0.42 | 0.36 | 0.33 | 0.29 | 0.27 | 0.25 | 0.23 |

⁶ By the inequality (7) the estimated mean transmission time should be asymptotically above the actual one, but we see in Fig. 4 that as the number of links grows the estimated mean transmission time is less than the actual one. This is because inequality (7) holds for exact values of $T(x)$ and $Q_k(t)$ while we use simple approximation formulas for both $T(x)$ and $Q_k(t)$.

Our approximation model is built in such a way that these constants depend only on the means of the ON and OFF epoch distributions.

Combining these approximation formulas for the queue sizeswith the approximations for TðxÞ from the previous section, weget the following formulas:

TkðxÞ � Tk�1ðxÞ þ bk�1

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiTk�1ðxÞ

p; k ¼ 2; . . . ;n; ð9Þ

where $b_k$ are constants that depend on the distribution of disruptions. For the exponential distribution with a mean of 1 s we have

$$b_k = \begin{cases} 2a_k/\sqrt{\pi}, & f = 10^{-3}\ \text{(smallest unit)};\\ 20(e^{0.1} - 1)a_k/\sqrt{\pi}, & f = 10^{-1};\\ 2(e - 1)a_k/\sqrt{\pi}, & f = 1 \end{cases}$$

and for the uniform distribution (again, with a mean of 1 s) we have

$$b_k = \begin{cases} (2/3)\,2a_k/\sqrt{\pi}, & f = 10^{-3};\\ (2/3)(40/19)a_k/\sqrt{\pi}, & f = 10^{-1};\\ (2/3)\,4a_k/\sqrt{\pi}, & f = 1. \end{cases}$$

To confirm our analysis we have written a custom simulator in C; it implements transmission of a message over a chain of $n$ links according to our model. The above formulas are for the fragmentation unit sizes that we have implemented in the simulator. For general $f$ (in seconds) the formulas are as follows:

$$b_k = \frac{2}{f}\left(e^{f} - 1\right) a_k/\sqrt{\pi},$$

when the disruptions are exponentially distributed, and

$$b_k = \frac{2}{3}\cdot\frac{2}{1 - f/2}\; a_k/\sqrt{\pi},$$

when the disruptions are uniformly distributed. Both expressions reduce to the three cases listed above for $f = 10^{-3}$, $10^{-1}$ and 1.

We summarize next the simulation results in the 4-link and 10-link cases, for both exponential and uniform distributions of ON/OFF epochs, and three fragmentation unit sizes: $f = 10^{-3}$ s, $10^{-1}$ s, and 1 s. The mean value of the ON (or OFF) epochs is 1 s in all cases; the message size $x$ varies between 2 s and 48 s.
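As a rough illustration of how Eq. (9) and the constants $b_k$ fit together, the following Python sketch evaluates $b_k$ for a general fragmentation unit and iterates the recursion. It is not taken from the simulator: the names `A`, `b` and `chain_estimates` are hypothetical, the general-$f$ factors are written so that they reproduce the three tabulated cases ($f = 10^{-3}$, $10^{-1}$, 1), and the single-link estimate $T_1(x) = T(x)$ from the previous section is simply passed in as `t1`.

```python
import math

# Constants a_k for k = 1..10, copied from the table in Section 4.1.
A = {1: 1.0, 2: 0.63, 3: 0.50, 4: 0.42, 5: 0.36,
     6: 0.33, 7: 0.29, 8: 0.27, 9: 0.25, 10: 0.23}

def b(k, f, dist):
    """Constant b_k for fragmentation unit f (seconds); mean epoch length is 1 s."""
    if dist == "exp":
        factor = 2.0 * (math.exp(f) - 1.0) / f
    elif dist == "uniform":
        factor = (2.0 / 3.0) * 2.0 / (1.0 - f / 2.0)
    else:
        raise ValueError("dist must be 'exp' or 'uniform'")
    return factor * A[k] / math.sqrt(math.pi)

def chain_estimates(t1, n, f, dist):
    """Iterate Eq. (9): T_k = T_{k-1} + b_{k-1} * sqrt(T_{k-1}) for k = 2..n.

    t1 is the single-link estimate T_1(x), supplied by the caller.
    Returns the list [T_1, ..., T_n].
    """
    times = [t1]
    for k in range(2, n + 1):
        prev = times[-1]
        times.append(prev + b(k - 1, f, dist) * math.sqrt(prev))
    return times
```

Calling, for example, `chain_estimates(t1, 10, 1.0, "exp")` with a caller-supplied `t1` returns the ten estimates $T_1(x), \ldots, T_{10}(x)$; because every $b_k$ is positive, the sequence is increasing and, as in Fig. 4, concave in $k$.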

Each simulation run consisted of 100 single message transmissions. The 100 messages transmitted in a run are all of the same size $x$. The start of the first message's transmission in each run is chosen uniformly at random between zero and 20 s. Transmission of subsequent messages begins at a point chosen uniformly at random within 20 s from the end of the previous transmission (i.e. the time at which the previous message has been delivered to the destination). For each set of parameters we carry out 20 simulation runs with different random seeds. In this way we get a sample of 2000 transmission times for each message size.

The results for $x = 48$ s and the comparison to the estimates given earlier in this section are plotted in Fig. 4, left and middle parts. The data points corresponding to $f = 10^{-1}$ s are omitted from Fig. 4 for clarity. It can be seen that the mean transmission time is concave with respect to the number of links $k$. The approximation for $x = 48$ s is very good, except in the case of exponential distribution and $f = 1$ s. But we can see from Table 1 that even in this case the absolute value of the error is less than 5% for $k = 4$ (Table 1 top), and less than 10% for $k = 10$ (Table 1 bottom). The case of exponential distribution and $f = 1$ s is shown again in the right plot of Fig. 4 together with the inter-quartile range and the median of the actual transmission times: the approximated values stay within the inter-quartile range up to the sixth link in that case.

In Table 1 we give the biggest (in absolute value) relative error values for message sizes 2, 4, 8, 12 and 48 s when the message is transmitted over four and ten links. The relative errors were computed as the difference between our approximation for $T_k(x)$ and the actual (simulated) value of $T_k(x)$, divided by the actual value of $T_k(x)$. We see that: (i) The maximum absolute value of the error between our approximation and the actual $T_n(x)$ decreases as the message size grows, and it increases with the number of links. The (maximal) errors are in general bigger for exponentially distributed disruptions than for uniformly distributed disruptions (see footnote 6). (ii.a) When the number of links is 4, the relative error reaches the 10% threshold (in absolute value) between $x = 4$ s and $x = 8$ s in the exponential case, and between $x = 2$ s and $x = 4$ s in the uniform case. (ii.b) When the number of links is 10, the relative error reaches the 10% threshold (in absolute value) between $x = 12$ s and $x = 48$ s in the exponential case, and between $x = 8$ s and $x = 12$ s in the uniform case.

In summary, our approximation is good for large message sizes that are at least a few times bigger than the mean time between disruptions. The reason for this limitation is that the message transmission must last over several ON/OFF epochs before our random walk model for queue sizes in intermediate nodes (described below) begins to apply sufficiently well. Another limitation of our random walk model is apparent, e.g., in the case where $f = 1$ s while $x = 4$ s. Now we have only four message fragments in the system, which implies that, for the case of ten links and at any point in time, queues must be empty in most intermediate nodes. These examples illustrate the need for different approximation formulas in cases where either the total number of message fragments is small or the length of the whole message is small when compared to the mean of a single ON epoch.

We will provide such an alternative method in Section 5.

4.2. Estimation method

Recall that $Q_k(t)$ describes the transient behavior of the system with saturated input. (The input is saturated because the sender always has more data to send.) In some sense the only steady state of such a system is the trivial case of the initially empty system. For that reason, steady-state solutions of queueing networks are not directly applicable in our case.

We use the concept of a random walk in our estimations of $Q_k(t)$; this is commonly used in queuing theory [10].

A queuing process can be coupled with that of a one-dimensional 'walk' in the following way: The walker takes only a single step of size $\pm 1$ in each move. The amount of queued data is zero at the initial position of the walker. After the walk starts, the amount of queued data increases by a constant $l$ if the walker makes a move to the right (i.e. when the position of the walker increases by 1); and it decreases, again by $l$, if the walker makes a move to the left (i.e. when the position of the walker decreases by 1). However, there is an additional important rule: the amount of the queued data is always non-negative; e.g., if the queue is empty and the walker makes a move to the left, then the size of the queue remains zero.

[Figure 4: three panels. Left panel: uniformly distributed ON/OFF epochs. Middle and right panels: exponentially distributed ON/OFF epochs. Horizontal axis: distance (links) from the sender, 1 to 10. Vertical axis: mean transmission time [s] (left and middle panels) and transmission time [s] (right panel), 0 to 250. Curves: f = 1 [s] and f = 1 [ms] (left and middle panels); f = 1 [s] (right panel).]

Fig. 4. The left and middle plots show the actual and the approximated values of the mean transmission time $T_k(48\ \mathrm{s})$ for ten links and two fragmentation unit sizes $f$: $10^{-3}$ s and 1 s. The mean duration of ON and OFF epochs in all links is 1 s. In the right plot we repeat the approximation to the mean transmission time $T_k(48\ \mathrm{s})$ in the case of $f = 1$ s and exponentially distributed disruptions. Among the four cases shown in the left and middle plots, this is the one where our approximation diverges the most from the actual mean. In this plot we also add the inter-quartile range and the median of the actual transmission times $t_k(48\ \mathrm{s})$. The upper and the lower tips of each gray bar indicate, respectively, the 75% and the 25% quantiles; the bar's mid tick is the position of the median.

Table 1
Biggest (in absolute value) relative error, in %, between the actual and the estimated mean transmission times $T_k(x)$ over four and ten links.


Let us denote with $s_k$ the position of the walker relative to its starting point after $k$ moves, and with $m_n$ the minimum of $s_k$ after $n$ moves:

$$m_n = \min_{0 < k \le n}(s_k).$$

It can be shown from the coupling above that the amount of queued data is $l(s_n - m_n)$. The mean queue size is, therefore,

$$l \cdot (E[s_n] - M_n), \tag{10}$$

where $M_n$ denotes $E[m_n]$. Also, the number of time points when the corresponding queue is empty equals $-m_n$.

Fig. 5. Combined states between two consecutive links.

In our derivation below, we define a walk coupled with $q_1(t)$ that makes $n(t)$ moves up to time $t$, and compute an estimate of $E[s_{n(t)}]$ and $M_{n(t)}$ for that walk. An approximation of $Q_1(t)$ is then obtained from Eq. (10).
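The coupling between the queue and the walk can be checked numerically. The sketch below is only an illustration (the function name and parameters are hypothetical): it updates the queue directly with the non-negativity rule and, in parallel, tracks the walker position and its running minimum. One assumption is made explicit in the code: the running minimum is taken over the start position as well, so that it is never positive; with this convention the identity "queue $= l(s_n - m_n)$" holds exactly for every sample path.

```python
import random

def queue_and_walk(n_moves, l=1.0, seed=0):
    """Return (queue from the direct rule, l * (s_n - m_n)) for one random walk."""
    rng = random.Random(seed)
    s = 0   # walker position relative to the start
    m = 0   # running minimum of the position (assumption: start position included)
    q = 0   # queue size counted in moves of length l
    for _ in range(n_moves):
        step = 1 if rng.random() < 0.5 else -1
        s += step
        m = min(m, s)
        q = max(0, q + step)   # the queue can never become negative
    return l * q, l * (s - m)

if __name__ == "__main__":
    direct, via_walk = queue_and_walk(1000, l=2.0 / 3.0, seed=42)
    print(direct, via_walk)   # the two values coincide
```

The queue is counted in whole moves and multiplied by $l$ only at the end, which keeps the comparison free of floating-point rounding.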

This procedure is repeated for each subsequent queue in the intermediate nodes, up to $Q_{10}(t)$. The procedure for the subsequent queues $Q_2(t)$, $Q_3(t)$, etc. is more complex than for the first queue, because the walk is affected by the cases when the previous queue is empty. At these time points the succeeding queue cannot increase. We will go into details in the next section.


4.3. Derivation details

Whenever there is an ON epoch on the first link, then at least a part of the message can also be transferred. In other words, during the message transmission over the first link the queue in the sending node A behaves in a simple way: $q_0(x, t)$ is decreasing for all values of $t$ that belong to an ON epoch (of the first link) and $q_0(x, t)$ is stable for all values of $t$ that belong to an OFF epoch (of the first link).

Let us consider now $q_1(t)$, the length of the queue in the first intermediate node when we assume that the source node always has more data to transmit. In a certain sense, we have a relatively simple situation.

There are four possible cases for the behavior of $q_1(t)$, depending on whether each of the first and second links is ON or OFF. For values of $t$ where the combined state of the first two links is either ON/ON or OFF/OFF, the queue length $q_1(t)$ remains stable. For values of $t$ where we have ON/OFF, the queue becomes longer, while for points in time where we have OFF/ON, the queue becomes shorter (unless it is already empty).
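The four combined states can be turned into a small event-driven simulation of $q_1(t)$. The following Python sketch is an illustration under stated assumptions, not the simulator used for the results above: the function name is hypothetical, data is treated as a fluid (queue size measured in seconds of transmission time), both links have exponentially distributed ON and OFF epochs with mean 1 s, and each link starts in a random state. The sample mean can be compared with the estimate $Q_1(t) \approx l\sqrt{t/\pi}$ with $l = 1$ s that is derived later in this section for the exponential case.

```python
import math
import random

def simulate_q1(t_end, rng):
    """Queue in the first intermediate node at time t_end (seconds), fluid model."""
    on = [rng.random() < 0.5, rng.random() < 0.5]        # link states, True = ON
    nxt = [rng.expovariate(1.0), rng.expovariate(1.0)]   # next state-change times
    now, q = 0.0, 0.0
    while now < t_end:
        i = 0 if nxt[0] <= nxt[1] else 1     # link that changes state first
        until = min(nxt[i], t_end)
        dt = until - now
        if on[0] and not on[1]:              # ON/OFF: queue grows
            q += dt
        elif on[1] and not on[0]:            # OFF/ON: queue drains, never below zero
            q = max(0.0, q - dt)
        # ON/ON and OFF/OFF leave the queue unchanged
        now = until
        if now < t_end:
            on[i] = not on[i]
            nxt[i] = now + rng.expovariate(1.0)
    return q

if __name__ == "__main__":
    rng = random.Random(1)
    t = 200.0
    mean_q = sum(simulate_q1(t, rng) for _ in range(2000)) / 2000
    print(f"simulated mean {mean_q:.2f} s, sqrt(t/pi) = {math.sqrt(t / math.pi):.2f} s")
```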

Next we show how the concept of a combined state can be turned into a random walk model. Let us first exclude the possibility that two consecutive links would change their states at exactly the same moment in time. If the states have a continuously distributed duration, then this exclusion is not necessary (because the probability of this kind of coincidence would become zero). We are ultimately dealing with discrete distributions, but it is clear (at least when the granularity of time is fine enough) that this exclusion does not introduce a big error.

A sequence of ON/OFF states in two consecutive links is illustrated in Fig. 5. We notice that every second combined state (time) interval keeps the queue size $q_1(t)$ unchanged. Of the other combined state intervals, typically every second (altogether every fourth) would increase the queue size, while the rest (altogether also every fourth) would decrease the queue size. This motivates grouping four consecutive combined state intervals together. The expected length of such a group is two seconds in case the mean of each epoch is one second.

Fig. 6. Illustration of the queue $q_k(t)$, and the related random walk with negative drift.

Now the behavior of the random variable $q_1(t)$ can be modeled by a random walk where we would take a random walk move every 2 s. The length of the random walk move depends on the length difference between the 'increased queue' time interval and the 'decreased queue' time interval. The model is based on the assumption that these differences for two consecutive groups (each consisting of four consecutive combined states) do not depend on each other too much.

The distribution of the above length difference depends on the distributions of ON and OFF patterns. In this respect, uniform and exponential distributions behave differently. If we have two i.i.d. variables $U$ and $V$ whose mean is one second and the distribution is uniform (resp. exponential), then (it is easy to calculate that) the absolute difference $|U - V|$ has the mean of $2/3$ s (resp. 1 s). Although the combined state intervals do not have exactly uniform (resp. exponential) distributions, this nevertheless gives a pretty good approximation for the purposes of our model.
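For completeness, the two means can be verified directly. The short derivation below is a sketch that assumes, as in Section 5.1 where $F(y) = y/2$, that the uniform distribution with mean 1 s is the uniform distribution on $[0, 2]$ s, and that the exponential distribution has rate 1 per second.

```latex
% Uniform case: U, V i.i.d. uniform on [0,2], joint density 1/4 on [0,2]^2.
\begin{align*}
E|U-V| &= \int_0^2\!\!\int_0^2 \tfrac14\,|u-v|\,\mathrm{d}v\,\mathrm{d}u
        = 2\cdot\tfrac14\int_0^2\!\!\int_0^{u}(u-v)\,\mathrm{d}v\,\mathrm{d}u
        = \tfrac12\int_0^2 \tfrac{u^2}{2}\,\mathrm{d}u
        = \tfrac23\ \text{s}.
\end{align*}
% Exponential case: by memorylessness, |U-V| is again exponential with rate 1,
% i.e. P(|U-V| > z) = e^{-z}.
\begin{align*}
E|U-V| &= \int_0^\infty P(|U-V| > z)\,\mathrm{d}z
        = \int_0^\infty e^{-z}\,\mathrm{d}z = 1\ \text{s}.
\end{align*}
```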

The model for $q_1(t)$ can be summarized as follows. We have a random walk model for the queue size $q_1(t)$ in which one move is taken every 2 s to the left or to the right with equal probability; $n(t)$ is approximately $t/(2\ \mathrm{s})$; the length of the move is 1 s in the case of exponential distribution of ON/OFF epochs, and it is $2/3$ s in the case of uniform distribution. We denote this length with $l$. As discussed in the previous section, an additional constraint is imposed by the natural fact that the queue size is never negative. Random walk theory now provides formulas for the mean queue size $Q_1(t)$ (and also for frequencies of empty queue instances).
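The model summarized above is simple enough to evaluate by Monte Carlo. The sketch below is only illustrative (function name, number of trials and seed are arbitrary choices): it performs one move of length $l$ every 2 s, keeps the queue non-negative, and compares the sample mean with the closed form $l\sqrt{t/\pi}$ that random walk theory gives for large $t$. A small gap is to be expected, because the closed form is an asymptotic expression.

```python
import math
import random

def mean_queue_q1(t, l, trials=4000, seed=1):
    """Monte Carlo mean of the random-walk model of q_1 at time t (seconds)."""
    rng = random.Random(seed)
    n = int(t / 2)                  # one move every 2 s
    total = 0
    for _ in range(trials):
        q = 0                       # queue counted in moves of length l
        for _ in range(n):
            q = max(0, q + (1 if rng.random() < 0.5 else -1))
        total += q
    return l * total / trials

if __name__ == "__main__":
    t = 400.0
    for name, l in (("exponential", 1.0), ("uniform", 2.0 / 3.0)):
        sim = mean_queue_q1(t, l)
        closed = l * math.sqrt(t / math.pi)
        print(f"{name}: simulated {sim:.2f} s, l*sqrt(t/pi) = {closed:.2f} s")
```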

We return to these formulas in a moment, but first we note that the situation is slightly different for the next queue $q_2(t)$. Whenever the first queue is empty, i.e. $q_1(t) = 0$, the queue size $q_2(t)$ cannot increase. This implies a negative drift in the walk related to $q_2(t)$. The same is true for $q_3(t)$, etc. Moreover, the negative drift implies smaller average queue sizes $Q_2(t)$ (when compared to queue sizes $Q_1(t)$).

Also, the queue in the second intermediate node is more often empty than the queue in the first intermediate node, which in turn implies an even bigger negative drift for the random walk related to $q_3(t)$, etc.

As a conclusion of this reasoning, it seems that in the intermediate nodes $Q_k(t) > Q_{k+1}(t)$. (Please recall here that we defined $Q_k(t)$ in such a way that the queue in the source node is never empty.) We show in Appendix A that the inequality $Q_k(t) > Q_{k+1}(t)$ indeed holds for every $k \ge 1$ and every $t > 0$.

Now we turn our attention to the actual formulas for queue sizes $Q_k(t)$. For the simple random walk the mean position of the walker relative to its starting point after $n$ moves, $E[s_n]$, is zero, and the ratio between the expected minimum $M_n$ and $\sqrt{n}$ tends to the constant $-\sqrt{2/\pi}$ as $n$ grows (see footnote 7). Thus $M_n \approx -\sqrt{2n/\pi}$. In our case $n(t) \approx t/(2\ \mathrm{s})$, and so $M_{n(t)}$ is approximately $-\sqrt{t/\pi}$, where $\pi$ has the dimension of seconds. Therefore, by Eq. (10)

$$Q_1(t) \approx l\sqrt{t/\pi}.$$

The constant $c$ in Eq. (8) is $l/\sqrt{\pi}$ and has the dimension $\mathrm{s}^{1/2}$.

The value $\sqrt{t/\pi}$ is also an estimate of the expected number of time points when the queue in the first intermediate node $q_1(t)$ is empty. As explained above, this fact gives us the means to estimate the negative drift that affects the mean queue size in the second intermediate node, $Q_2(t)$. The drift is referred to by $-d\sqrt{t}$ in Fig. 6.

Unfortunately, the drift is a nonlinear function of $t$, and therefore $q_2(t)$ cannot be modeled directly by a (proper) random walk. Instead, we model $q_2(t)$ with a sum of a simple random walk (like the one in the model of $q_1(t)$) and the nonlinear negative drift function $-\sqrt{t/\pi}$.

7 By symmetry considerations $-M_n$ equals the expected maximum of the simple random walk. The limit for the ratio between the expected maximum and $\sqrt{n}$ is given in, e.g., [11] p. 235, Eq. (4.8.23).

The expected end point $E[s_{n(t)}]$ of this combined walk is $-\sqrt{t/\pi}$, and by formula (10):

$$Q_2(t) \approx l \cdot \left(-\sqrt{t/\pi} - M_{n(t)}\right). \tag{11}$$

It is a little bit tricky to estimate $M_{n(t)}$, because it depends on the nonlinear drift component. On the one hand, the nonlinear negative drift naturally pushes the time point, say $c$, where the minimum $m_{n(t)}$ is reached closer to the end point $t$, compared to the case where $c$ is determined by the random walk component alone. It can be shown that without the drift component

$$C = E[c] \approx t/2.$$

But it is natural to expect that $C > t/2$ when the walk has a negative drift.

On the other hand, this kind of bias in the "selection" of the minimum point $c$ restricts the amount that the random walk component contributes to the queue size $q_2(t)$. Please notice here that the minimum point $c$ is also the last point in time when the queue is empty. (If the minimum point depended only on the drift then, of course, we would have $c = t$, and consequently $q_2(t) = 0$, hence the contribution of the random walk component would be zero.) The component of the minimum $m_{n(t)}$ contributed by the negative drift is marked as $-b\sqrt{t}$ in Fig. 6.

We try to approach these two aspects sequentially, starting with an estimate for $c$. Based simply on the observation that $c$ is more likely to be closer to $t$ than to 0, we use a triangular approximation for the probability density function of $c$ between 0 and $t$ that grows linearly from 0 to $t$. In other words, we assign a density of 0 to the time point $s = 0$ and increase the density function linearly towards $2/t$ at the final time point $s = t$. Now it can be computed that $C \approx (2/3)t$.
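The value $C \approx (2/3)t$ follows from a one-line integration. As a worked step, writing $p(s)$ for the triangular density (the symbol $p$ is introduced here only for this illustration):

```latex
% Triangular density on [0,t] rising linearly from 0 at s = 0 to 2/t at s = t.
\begin{align*}
p(s) &= \frac{2s}{t^{2}}, \qquad 0 \le s \le t,
& \int_0^t p(s)\,\mathrm{d}s &= 1, \\
C &\approx \int_0^t s\,p(s)\,\mathrm{d}s
   = \frac{2}{t^{2}}\int_0^t s^{2}\,\mathrm{d}s
   = \frac{2}{3}\,t .
\end{align*}
```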

Using this estimate for $C$ we next try to estimate how big the deviation of $M_{n(t)}$ from the negative drift function is at point $C$. The closer $C$ is to $t$, the smaller is the absolute value of this expected deviation.

Again, we try to find a simple estimate for this deviation. A natural choice for an estimate is given by the value $-\sqrt{2(1 - C/t)\cdot(t/\pi)}$: the deviation corresponding to $C = t/2$ can be approximated by the expected minimum of the random walk component, $-\sqrt{t/\pi}$; and as $C$ increases towards $t$, the absolute value of the approximated deviation decreases proportionally towards zero.

These two simple estimates can now be put together to find an estimate for the expected minimum $M_{n(t)}$ of the combined walk, and subsequently also for $Q_2(t)$: at $C = (2/3)t$ the drift, as well as the deviation, equal $-\sqrt{(2/3)(t/\pi)}$ each. Therefore, $M_{n(t)} \approx -2\sqrt{(2/3)(t/\pi)}$, or $-1.63\sqrt{t/\pi}$. By Eq. (11),

$$Q_2(t) \approx l\left(0.63\sqrt{t/\pi}\right).$$

For the next node and the queue size $Q_3(t)$ we repeat similar calculations. The negative drift $-d\sqrt{t}$ is now bigger (namely, $-1.63\sqrt{t/\pi}$) and therefore $C$ should be closer to $t$.

We try to find an estimate for $C$ for this case by fine-tuning the triangular approximation to the probability density of $c$ a little bit. This fine-tuning is illustrated in Fig. 7: now it is even less likely that the minimum point $c$ would be close to the starting point of the walk, and therefore the probability density function for very small positive values of $c$ is approximated by zero.

More specifically, we assign a density of 0 to all points $s \in [0, t]$ where $1.63\sqrt{s} = d\sqrt{s} < \sqrt{t/\pi}$. (This choice is somewhat justified by the fact that at these points the magnitude of the negative drift $d\sqrt{s}$ is still smaller than the magnitude of the random walk component's expected minimum $\sqrt{t/\pi}$.) After this adjustment, the probability density function of $c$ is 0 for $s \le s_0 = \left(1 - 1/(\sqrt{\pi}\,d)\right)^2 t$, and the density function increases linearly from 0 at $s_0$ towards $2/(t - s_0)$ at time point $t$.

Now it can be computed that $C \approx 0.717t$. The rest of the calculation is similar to the one already described above in the $Q_2(t)$ case, resulting in the estimates $M_{n(t)} \approx -2.14\sqrt{t/\pi}$, and

$$Q_3(t) \approx l\left(0.50\sqrt{t/\pi}\right).$$

For the subsequent queue sizes we can repeat similar calculations as were done for $Q_3(t)$, increasing the parameter $s_0$ for each queue. These calculations lead to the end results summarized earlier in Section 4.1.
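The arithmetic behind the factors 0.63 and 0.50 can be retraced with a few lines of Python. This is a sketch of the calculation as described above, not a general algorithm: the function name is hypothetical, all quantities are expressed in units of $\sqrt{t/\pi}$, and the values of $C/t$ ($2/3$ and $0.717$) are taken from the text rather than recomputed.

```python
import math

def queue_factor(d, c_ratio):
    """One step of the estimate, in units of sqrt(t/pi).

    d       -- magnitude of the negative drift at time t
    c_ratio -- C/t, the expected relative position of the minimum
    Returns (a, m): the factor a in Q ~ l * a * sqrt(t/pi) and the
    expected minimum m of the combined walk.
    """
    drift_at_c = -d * math.sqrt(c_ratio)              # drift function evaluated at C
    deviation = -math.sqrt(2.0 * (1.0 - c_ratio))     # random walk contribution
    m = drift_at_c + deviation                        # expected minimum
    end_point = -d                                    # expected end point of the walk
    return end_point - m, m

a2, m2 = queue_factor(1.0, 2.0 / 3.0)   # second queue: drift -sqrt(t/pi), C = (2/3) t
a3, m3 = queue_factor(-m2, 0.717)       # third queue: drift |m2|, C = 0.717 t
print(round(a2, 2), round(m2, 2), round(a3, 2), round(m3, 2))
```

The drift used for the third queue is the magnitude of the expected minimum found for the second queue, which is how the value $1.63$ enters the calculation of $Q_3(t)$.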

One point worth noticing is that our approximation formulas for queue sizes do not depend on the fragmentation unit $f$. We have also verified this with the simulation data: the queue sizes are indeed approximately the same for different values of $f$.

Fig. 7. Simple triangular approximation to the probability density function of the minimum position $c$ of the walk up to time $t$. In the walk associated with the second queue $q_2(t)$, the triangle starts at $s_0 = 0$; in the walk associated with the third queue $q_3(t)$, the triangle starts at $s_0 = 0.12t$; and in general, $s_0$ increases as we move to the right along the chain.


5. Further estimates for the mean transmission time of small messages

As discussed earlier at the end of Section 4.1, it is clear from Table 1 that our approximation does not perform well in cases where $f = 1$ s and the message is short, e.g., $x \le 12$ s. In these cases the approximation may underestimate the transmission time by even more than 20% (cf. Table 1 bottom part). It was also explained earlier why it is not surprising that our approximation fails in these cases (i.e. it is based on queues in intermediate nodes, and a small number of fragments cannot create big queues).

In Section 5.1 we provide an alternative estimation method for cases where the number of fragments is small. The new estimate is also based on a random walk model, but now it is not the queue sizes that follow such a model; instead we consider distances, measured in the number of links, between different fragments.

In Section 5.2 we study the case where the number of fragments is big (while the fragmentation unit is tiny). We derive a lower bound estimate for the transmission time.

5.1. Small messages with few fragments

Let us begin by taking a closer look at the case where $f = 1$ s and $x = 2$ s, i.e. we can have at most two fragments. We shall assume in our derivation that already at the sending node the message is split into fragments of size $f = 1$ s each. Now the distance between the two fragments, measured in the number of links, behaves like a random walk with the exception that it can never be negative. This restriction follows from the fact that the second fragment can never go ahead of the first one.

When the two fragments are in parallel waiting for a chance to be transmitted over two different links, one of the following four cases would occur:

(1, 2) exactly one of the fragments proceeds (2 cases), or
(3) both fragments proceed, or
(4) neither of the two proceeds.

In the first two cases the distance between the two fragments either increases by one or decreases by one. In the latter two cases the distance remains the same. Because the links are independent and statistically identical, each case is equally probable.

Using an argument similar to the one provided in Section 4 for the model of the $q_1(t)$ queue size, we may now estimate the average distance between the two fragments at the time when the first fragment has been transmitted over $n$ links. The random walk model provides the estimate

$$\text{mean distance} \approx \sqrt{2n/\pi}.$$

The total time needed before the first fragment has traveled over $n$ links is $T_n(1\ \mathrm{s}) = nT(1\ \mathrm{s})$, where $T(1\ \mathrm{s})$ is the expected amount of time needed to get one fragment of length one second over one link.

Now our new estimate is ready:

$$T_n(2\ \mathrm{s}) \approx \left(n + \sqrt{2n/\pi}\right) T(1\ \mathrm{s}).$$

In our original approximation we used simple formulas for $T(x)$. Those formulas are adequate for bigger values of $x$, but for small $x$ they are not appropriate, because the transmission is assumed to start at a random point in time and this has an effect on $T(x)$. Next we present fairly straightforward calculations that give us accurate values for $T(1\ \mathrm{s})$ when $f = 1$ s and the distribution of ON/OFF epochs is either uniform or exponential with a mean of one second.

Table 2
Relative error, in %, between the actual and the estimated mean transmission times $T_{10}(x)$ with $f = 1$ s of smallish messages over ten links.

x [s]    1      2      4      8      12
Exp.     0.4   -3.3   -0.1   11.6   21.2
Uni.     0.1   -3.8   -1.8    5.8   12.5

For the uniform distribution we proceed as follows. We have two cases, depending on whether the random starting point is during an ON epoch or an OFF epoch. Let us first consider starting during an OFF epoch. The time remaining in the OFF epoch is on average half of the expected length of the starting epoch. According to formula (5.16) in [9] this expected remaining time can be calculated as:

$$E[Y]/2 + \mathrm{Var}[Y]/(2E[Y]) = 1/2 + (1/3)/2 = 2/3\ [\mathrm{s}].$$

Thus, we first "lose" $2/3$ [s], after which we start transmission tries from the beginning of the next ON epoch. The chance of getting the fragment through during the first ON epoch is exactly $1/2$; if this does not happen, then the ON epoch was too short. Obviously the average length of the "lost" ON epoch is $1/2$ [s], and after "losing" another (full) OFF epoch of mean length 1 [s] we can try again from the same situation of being at the start point of a new ON epoch.

These facts imply that the time $T$ needed to get the message through (starting from the beginning of the first ON epoch) satisfies the following equation:

$$T = (1/2)\cdot 1\ [\mathrm{s}] + (1/2)\cdot\left(1/2\ [\mathrm{s}] + 1\ [\mathrm{s}] + T\right).$$

The solution of this linear equation is $T = 5/2$ s. Altogether, the expected time needed to get the message through is now $2/3 + 5/2 = 19/6$ s.

Let us look next at the other case, where the starting point is during an ON epoch. First we calculate the probability that the length of the remaining time in the ON epoch is at least one second. For this purpose we may use formula (5.10) of [9], which gives the probability density function of the remaining time as $(1 - F(y))/E[Y]$, where $F(y)$ is the cumulative distribution function of the length $Y$. In our uniform distribution case $F(y) = y/2$, hence the wanted probability can be computed as:

$$\int_1^2 (1 - y/2)\,dy = 1/4.$$

Under the assumption that the remaining length is less than one second (which thus happens with probability $3/4$), we can calculate the expected remaining time as follows:

$$\frac{\int_0^1 y\,(1 - y/2)\,dy}{3/4} = \frac{1/2 - 1/6}{3/4} = 4/9\ [\mathrm{s}].$$

In other words, the expected "lost" length of the ON epoch is $4/9$ [s]. Next we "lose" the time for a full OFF epoch, on average one second, and we are again in the situation discussed before: the time needed to get the message through from the start of a full ON epoch is $5/2$ s. Thus, the total time in this case can be calculated as

$$(1/4)\cdot 1 + (3/4)\cdot(4/9 + 1 + 5/2) = 77/24\ [\mathrm{s}].$$

Finally we can combine these two cases to obtain $T(1\ \mathrm{s}) = (1/2)(19/6 + 77/24) = 153/48$ s.

Similar steps are needed in the calculation for the exponential distribution. The time needed from the start of a fresh new ON epoch has been calculated in [7]: that is

$$T = 2(e - 1)\ [\mathrm{s}].$$

What we still need to calculate in the case of a start during an OFF epoch is the remaining expected "lost" time in that OFF epoch. We may use the same formula as in the uniform case and we obtain

$$E[Y]/2 + \mathrm{Var}[Y]/(2E[Y]) = 1/2 + 1/2 = 1\ [\mathrm{s}].$$

This result follows also from the fact that the exponential distribution is (in a certain sense) "memoryless". Now the total time gets the form $1 + 2(e - 1) = 2e - 1$.

In the case where we start from the ON epoch we first need to calculate the probability that the remaining time is at least one second. Following the same reasoning as in the uniform case, this can be calculated as:

$$\int_1^\infty \left(1 - (1 - e^{-y})\right) dy = 1/e.$$

Next we compute the expected length of the "lost" ON epoch under the assumption that the remaining time of the ON epoch is less than one second:

$$\frac{\int_0^1 y e^{-y}\,dy}{1 - 1/e} = \frac{e - 2}{e - 1}.$$

Thus, the total time in the case of a start during an ON epoch is:

$$\frac{1}{e}\cdot 1 + \frac{e - 1}{e}\cdot\left(\frac{e - 2}{e - 1} + 1 + 2(e - 1)\right) = 2(e - 1)\ [\mathrm{s}].$$

Again, this result could alternatively be derived from the fact that the exponential distribution is memoryless.

Combining finally the two cases, we obtain

$$T(1\ \mathrm{s}) = (1/2)(2e - 1) + (1/2)(2e - 2) = 2e - 3/2\ [\mathrm{s}].$$

Summarizing, for the exponential distribution (with a mean of one second) we have

$$T(1\ \mathrm{s}) \approx 3.936\ [\mathrm{s}]$$

and for the uniform distribution we have

$$T(1\ \mathrm{s}) \approx 3.187\ [\mathrm{s}].$$
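The two values of $T(1\ \mathrm{s})$ can be rechecked mechanically. The Python sketch below simply replays the steps of the calculation above (all variable names are illustrative); exact fractions are used in the uniform case so that the result $153/48$ can be compared without rounding.

```python
import math
from fractions import Fraction as F

# Uniform ON/OFF epochs on [0, 2] s (mean 1 s), fragment size f = 1 s.
# From a fresh ON epoch: T = a + b*T with a = (1/2)*1 + (1/2)*(1/2 + 1), b = 1/2.
a, b = F(1, 2) * 1 + F(1, 2) * (F(1, 2) + 1), F(1, 2)
t_fresh_uni = a / (1 - b)                                  # 5/2 s
start_off_uni = F(2, 3) + t_fresh_uni                      # residual OFF time first
start_on_uni = F(1, 4) * 1 + F(3, 4) * (F(4, 9) + 1 + t_fresh_uni)
t1_uniform = (start_off_uni + start_on_uni) / 2            # ON and OFF start equally likely

# Exponential ON/OFF epochs with mean 1 s, fragment size f = 1 s.
t_fresh_exp = 2 * (math.e - 1)                             # value quoted from [7]
start_off_exp = 1 + t_fresh_exp                            # residual OFF time is 1 s
p_fit = 1 / math.e                                         # remaining ON time >= 1 s
lost_on = (math.e - 2) / (math.e - 1)                      # mean lost ON time otherwise
start_on_exp = p_fit * 1 + (1 - p_fit) * (lost_on + 1 + t_fresh_exp)
t1_exp = (start_off_exp + start_on_exp) / 2

print(t1_uniform, float(t1_uniform))   # exact fraction and its decimal value
print(t1_exp)
```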

Let us then consider cases where $f = 1$ s and $2\ \mathrm{s} < x \le 12$ s. We assume again that already at the sending node the message is split into fragments of size $f = 1$ s each. The distance between the second fragment and the third fragment does not fully follow the random walk model before the first fragment is out of the way. Indeed, because the second fragment may be blocked from proceeding by the first fragment, it is more likely that the distance decreases than that it increases. However, this bias is removed after the first fragment has been transmitted over the last link, and therefore we use the same estimate as above for the distance between the second and third fragments at the time point when the second fragment is transmitted over the last link. Hence, we have the estimate

$$T_n(3\ \mathrm{s}) \approx \left(n + 2\sqrt{2n/\pi}\right) T(1\ \mathrm{s}).$$

Mainly for the sake of keeping our approximation simple we use the same reasoning, ignoring the blocking of a fragment by those ahead of it, for distances between subsequent fragments. In this way we get the estimate

$$T_n(x) \approx \left(n + (x - 1)\sqrt{2n/\pi}\right) T(1\ \mathrm{s}). \tag{12}$$

The numerical value of 1 s is actually not relevant here, but we need to assume that the block size $f$ is close to the typical ON epoch: $f \approx$ mean ON. If $f$ is smaller, then several fragments are likely to pass through a link when it is ON and the random walk model would no longer be appropriate. Thus, the formula can be generalized for other values of $f$:

$$T_n(x) \approx \left(n + \left(\frac{x}{f} - 1\right)\sqrt{\frac{2n}{\pi}}\right) T(f) \tag{13}$$

under the conditions that $f \approx$ mean ON epoch and the number of blocks $\lceil x/f \rceil$ into which the source node quantizes the message is relatively small, say up to ten.
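Eq. (13) is straightforward to evaluate. The helper below is a minimal sketch with a hypothetical name; the single-fragment time $T(f)$ is passed in by the caller (for $f = 1$ s it can be taken as $2e - 3/2$ s in the exponential case or $153/48$ s in the uniform case, as computed above).

```python
import math

def small_message_estimate(n, x, f, t_f):
    """Eq. (13): T_n(x) ~ (n + (x/f - 1) * sqrt(2n/pi)) * T(f).

    n   -- number of links
    x   -- message size (seconds)
    f   -- fragmentation unit (seconds), assumed close to the mean ON epoch
    t_f -- T(f), the mean time to get one fragment over one link
    """
    return (n + (x / f - 1.0) * math.sqrt(2.0 * n / math.pi)) * t_f

if __name__ == "__main__":
    t_exp = 2 * math.e - 1.5     # T(1 s), exponential case
    for x in (1, 2, 4, 8, 12):
        print(x, round(small_message_estimate(10, x, 1.0, t_exp), 1))
```

With $f = 1$ s the function reduces to Eq. (12), and for a single fragment ($x = f$) it gives $nT(f)$, the time for one fragment to cross $n$ links.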

We have applied these estimates to the cases where $f = 1$ s and $x = 1$, 2, 4, 8, and 12 s for both exponential and uniform distributions, and the results over ten links are summarized in Table 2, where we give the relative error in $T_{10}(x)$. The relative errors were computed as the difference between our new approximation for $T_{10}(x)$ and the actual (simulated) value of $T_{10}(x)$, divided by the actual value of $T_{10}(x)$.

Fig. 8. Relative error % at distances from one to ten links from the source node. The disruptions are exponentially distributed, $x = 4$ s and $f = 1$ s. Black bars are the errors of Eq. (12); gray bars are the errors of Eq. (6).

8 Another way of handling this issue would be to include additional $x$ into (14), but this would cause other complications because transmission of the message over the next link may very well begin before transmission has ended over the previous link.

Comparing with our original approximation errors in Table 1 wesee that the new approximation performs very well for x 6 4 s andhas advantage over the original approximation in the case x ¼ 8 sbut not anymore in the case x ¼ 12 s.

In Fig. 8 we illustrate the relative error percentage at differentdistances from the sender when the disruptions are distributedexponentially, x ¼ 2 s and f ¼ 1 s. We show the errors of Eq. (12)and the errors of Eq. (6) in this case. Here, as well as in the othersmall message cases that we have looked into, the accuracy ofthe new approximation increases as we move farther from thesource node along the chain. In contrast, the accuracy of Eq. (6)in general decreases with distance from the source node.

It is quite natural that this new random walk model behavesbetter when the number of links increases because then the ‘‘ran-dom walk’’ becomes longer. It also follows that the new approxi-mation begins to outperform the original one when the numberof links n becomes bigger. For all cases with x 6 8 s this happensat latest when n > 5. (In the original approximation it is the in-creased time that makes the ‘‘random walk’’ longer in each individ-ual queue in the intermediate nodes but increasing number of linkshas only an indirect effect.)

Eq. (12) cannot easily be generalized to smaller values of $f$, because then we can no longer assume that different fragments cannot pass through a link during the same ON epoch. This assumption is crucial for the random walk model.

5.2. Small messages with many fragments

The approximation presented above in Section 5.1 is based on the assumption that the number of fragments is relatively small. This is of course a quite reasonable assumption when the message itself is small. However, in order to complete our analysis, we now study the case where: (i) the size of the message is small, say, at most equal to the mean contact time; (ii) the number of fragments is still big, i.e. $f$ is much smaller than $x$.

Our construction leads to a lower bound approximation formula for transmission times. This new lower bound improves over our earlier trivial lower bound of Eq. (3). Since we are anyway aiming for a lower bound, it makes sense to restrict our discussion to the case where the fragment size is as small as possible. If $f$ is actually bigger, our approximation still serves as a lower bound.

The idea of this lower bound is the following. We try to find a formula for estimating $T_n(x)$ from below. When $x$ is about the same size as the mean length of an ON epoch, the probability that the whole message fits into the next ON epoch is fairly big. If this indeed happens, then we are able to reduce the computation of the transmission time to (a slightly simpler case of) $T_{n-1}(x)$.

Please cite this article in press as: P. Ginzboorg et al., Message fragmentation for a chain of disrupted links, Comput. Commun. (2014), http://dx.doi.org/10.1016/j.comcom.2014.03.015

On the other hand, there is also a fairly big probability that the whole message does not fit into the next ON epoch. If this happens, then the message is broken into two parts: the first part needs to be transmitted over $n - 1$ links, while the latter part remains in the source node (and still needs to be transmitted over $n$ links). Now we make a simplifying assumption that leads to a lower bound approximation: we ignore the first part (in other words, we assume it manages to stay ahead of the remaining part all the time) and compute only the transmission time for the remaining part. This simplification reduces the computation to $T_n(x')$ where $x' < x$.

We can combine the two cases into one formula by multiplying each case by its probability. This results in a recursive inequality formula where $T_n(x)$ is expressed with the help of $T_{n-1}(x)$ and $T_n(x')$. Because there is a decrease in either the number of links or the size of the message, the recursion ultimately comes to an end, i.e. when we reach the cases $n = 0$ or $x' \le f$.

It is possible to formulate the lower bound for a general distribution of contact times, but we focus on the two specific distributions for which we also have simulation data.

For the uniform distribution (with a mean of one second), the probability that a message of size $x$ fits into a single ON epoch can be expressed as $1 - x/2$. If the starting time occurs in the middle of an ON epoch, this probability is actually smaller, but because we are aiming for a lower bound anyway, we may ignore this complication. Now the probability that $x$ does not fit into the first ON epoch is naturally $x/2$. (Note that these discussions make sense only if $x \le 2$; otherwise $x$ can never fit into a single ON epoch.)

We still have to increase $T_n(x)$ by the expected length of an initial OFF period. By a discussion similar to the one at the beginning of this section, we can conclude that the expected length of the initial period is 2/3 s if the start falls in the middle of an OFF period. Because there is a 50% chance of this, the expected length of the initial OFF period is 1/3 s.
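As a cross-check of the values 2/3 s and 1/3 s, the standard mean-residual-life formula from renewal theory can be used. The calculation below is a sketch under two assumptions that the text implies but does not spell out here: the OFF duration $Y$ is uniform on $[0, 2]$ s, and the starting instant is a uniformly random point in time.

```latex
\[
E[\text{remaining OFF time} \mid \text{start in OFF}]
  = \frac{E[Y^2]}{2\,E[Y]}
  = \frac{\frac{1}{2}\int_0^2 y^2\,dy}{2 \cdot 1}
  = \frac{4/3}{2}
  = \frac{2}{3}\ \text{s},
\qquad
\frac{1}{2}\cdot\frac{2}{3} = \frac{1}{3}\ \text{s}.
\]
```

The factor 1/2 in the last step is the probability of starting inside an OFF period, which holds because ON and OFF durations have the same mean.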

If the message does not fit into an ON epoch then it breaks into two parts, and the breaking point could be any point in the message with equal probability (because we have a uniform distribution). This means that the length of the remaining part has a mean of $x/2$. In addition, we consume an expected time of $x/2$ for transmitting the first part of the message. After considering this, we are still not quite in the position where the remaining time would be $T_n(x')$ (with $x' = x/2$), because we start counting from the beginning of the next OFF epoch rather than from an arbitrary point. The expected length of the initial OFF epoch is 1 s in the first case, whereas it is 1/3 s in the latter case. Hence, we still have to add the difference of 2/3 s. Now the recursive lower bound formula (for the uniform distribution) becomes

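\[
T_n(x) \ge \frac{1}{3} + \left(1 - \frac{x}{2}\right) T_{n-1}(x) + \frac{x}{2}\left(\frac{x}{2} + \frac{2}{3} + T_n\!\left(\frac{x}{2}\right)\right). \tag{14}
\]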

Let us now discuss the end points of the recursion. Let us begin with the case $n = 0$. It would appear natural to define $T_0(x) = 0$, because the message need not be transmitted over any links. However, the logic of our lower bound formula is not compatible with this thinking. For a closer look, set $n = 1$ in Eq. (14). Now the term $(1 - x/2)\,T_0(x)$ on the right-hand side corresponds to the case where the whole message is transmitted over the (single) link during the first ON epoch. Here $(1 - x/2)$ is the probability of the event, while $T_0(x)$ corresponds to the transmission time in this case. But the latter is of course not zero; instead it is equal to $x$. Therefore, we should have the technical agreement (see footnote 8) that

\[
T_0(x) = x.
\]

The other end point is the case $x \le f$. Here we obtain a lower bound by the following simple reasoning. For each link, there is potentially an initial OFF period whose expected length is 1/3 s. These OFF periods for different links cannot overlap in time, because (the first bit of) the message has to be transmitted in between. Now we can conclude that $T_n(f) \ge n/3$. After the (potential) initial OFF period for the last link, we still need (at least) the time $f$ for transmission of the message over the last link. Therefore, we can slightly improve the recursion end point to

\[
T_n(f) \ge n/3 + f.
\]
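The recursion of Eq. (14) with its two end points can be evaluated numerically. The following is a minimal sketch, not the authors' implementation: the function name `lower_bound_uniform` is hypothetical, all quantities are in units of the mean ON duration (1 s), and for a remainder strictly smaller than $f$ the sketch uses $n/3 + x$, which coincides with the end point $n/3 + f$ when $x = f$ and is otherwise an assumption made here to stay on the conservative side.

```python
from functools import lru_cache


def lower_bound_uniform(n: int, x: float, f: float) -> float:
    """Evaluate the recursive lower bound of Eq. (14) for uniform ON/OFF epochs.

    n: number of links, x: message length, f: fragment length,
    all measured in units of the mean ON duration.
    """
    if n < 0 or f <= 0 or not 0 < x <= 2:
        raise ValueError("need n >= 0, f > 0 and 0 < x <= 2")

    @lru_cache(maxsize=None)
    def bound(links: int, size: float) -> float:
        if links == 0:
            return size                      # technical agreement T_0(x) = x
        if size <= f:
            return links / 3 + size          # end point (assumption for size < f)
        fits = 1 - size / 2                  # probability that the message fits
        return (1 / 3
                + fits * bound(links - 1, size)
                + (size / 2) * (size / 2 + 2 / 3 + bound(links, size / 2)))

    return bound(n, x)


# One link, x = 1, f = 0.5: a single recursion step gives 11/6.
one_link = lower_bound_uniform(1, 1.0, 0.5)
```

Memoization keeps the number of evaluated states at roughly $n \log_2(x/f)$, because halving $x$ always produces the same sequence of sizes.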

This ends the discussion of the uniform distribution.

The case of the exponential distribution is handled in a similar way. Now the probability that $x$ fits into a single ON epoch is $e^{-x}$ and, under the condition that this does not happen, the expected length of the first part (that gets through) is

\[
h(x) = \frac{1 - e^{-x}(x + 1)}{1 - e^{-x}}.
\]
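The expression for $h(x)$ can be recovered by a short integration by parts. The sketch below assumes, as the surrounding text suggests, that the ON duration $Y$ is exponentially distributed with mean 1 s, so that the part getting through has length $Y$ conditioned on $Y < x$.

```latex
\[
h(x) = E[Y \mid Y < x]
     = \frac{\int_0^x y\,e^{-y}\,dy}{1 - e^{-x}},
\qquad
\int_0^x y\,e^{-y}\,dy
     = \bigl[-y\,e^{-y}\bigr]_0^x + \int_0^x e^{-y}\,dy
     = 1 - e^{-x}(x + 1).
\]
```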

(Elementary calculations, similar to the ones presented earlier, are needed to obtain this formula.) The exponential distribution is memoryless; therefore the expected length of the initial OFF epoch is 1/2 s. Now the lower bound formula for the exponential distribution becomes

\[
T_n(x) \ge \frac{1}{2} + e^{-x}\,T_{n-1}(x) + \left(1 - e^{-x}\right)\left(h(x) + \frac{1}{2} + T_n\bigl(x - h(x)\bigr)\right). \tag{15}
\]

The end points for the recursion are similar to those in the uniform case:

\[
T_0(x) = x, \qquad T_n(f) \ge \frac{n}{2} + f.
\]

Please note that the recursion indeed does converge (almost) as fast towards $T_n(f)$ as in the uniform case, because for small values of $x$ we have $h(x) \approx x/2$.
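The exponential recursion of Eq. (15) can be evaluated in the same way as the uniform one. This is a minimal sketch with hypothetical function names `h` and `lower_bound_exponential`; times are in units of the mean ON duration, and for a remainder strictly smaller than $f$ the sketch uses $n/2 + x$, an assumption that agrees with the stated end point when $x = f$.

```python
import math
from functools import lru_cache


def h(x: float) -> float:
    """Expected length of the part that gets through, given it is shorter than x."""
    return (1.0 - math.exp(-x) * (x + 1.0)) / -math.expm1(-x)


def lower_bound_exponential(n: int, x: float, f: float) -> float:
    """Evaluate the recursive lower bound of Eq. (15) for exponential epochs."""
    if n < 0 or f <= 0 or x <= 0:
        raise ValueError("need n >= 0, f > 0 and x > 0")

    @lru_cache(maxsize=None)
    def bound(links: int, size: float) -> float:
        if links == 0:
            return size                      # technical agreement T_0(x) = x
        if size <= f:
            return links / 2 + size          # end point (assumption for size < f)
        fits = math.exp(-size)               # probability that the message fits
        first = h(size)                      # expected size of the part sent
        return (0.5
                + fits * bound(links - 1, size)
                + (1.0 - fits) * (first + 0.5 + bound(links, size - first)))

    return bound(n, x)


# h(x) is close to x/2 for small x, e.g. h(0.01) is approximately 0.005.
small_x_check = h(0.01)
```

Because $x - h(x) < x$ for every $x > 0$, each recursive call on the same number of links works on a strictly shorter remainder, so the recursion terminates.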

We compared the lower bounds obtained from the above Eqs. (14) and (15) to the transmission times from the simulation data. For the cases $x = 1$ s and $x = 0.5$ s, both combined with $f = 10^{-3}$ s, our lower bounds give estimates that are all at most 18% below the actual figures for transmission times over five links. Similarly, the lower bounds are all at most 23% below the actual figures for transmission times over ten links. There are no big differences between the uniform and exponential distributions in this respect. For contrast, it is worth mentioning that, for the trivial lower bound of Eq. (3), the corresponding figures are 71% and 84%.

6. Discussion of the results

Two types of results characterizing the mean transmission time $T_n(x)$ of a fragmented message over a chain of disrupted links were obtained in this paper. The first includes the simple upper and lower bounds in inequalities (1)–(3), and the more refined upper bound (7). The second comprises the two approximations for estimating $T_n(x)$ in Eqs. (6) (Section 3) and (13) (Section 5). In deriving these approximations we have used the concept of random walk. The sending node needs to know the distribution of the time between link disruptions to estimate the mean transmission time with these methods. Assuming that knowledge in the sender is reasonable in the case of the homogeneous chain that we address in this paper, because the distribution of the time between disruptions in the first link of the chain can be directly estimated by the sender from accumulated past observations, and the distributions in the following links are the same as in the first.


For discussing how these approximations apply, we will loosely classify the magnitude of message sizes into "big", "small" and "tiny", based on what can typically be transmitted within a single contact time. Our fundamental unit of length is the mean ON duration: a big message size can be defined as roughly ten times the mean ON duration or more; a small message is roughly of the same order of magnitude as the mean ON duration; tiny messages are significantly smaller than the mean ON duration.

Another quantity relevant to applying our approximations is the number of blocks of size $f$ into which the source node quantizes the message before sending, i.e. $\lceil x/f \rceil$.

As explained earlier, it can be argued that tiny messages need not be fragmented; therefore we have the transmission time estimate for them: $T_n(x) \approx n\,T(x)$.

We will outline next how our approximations cover the remaining combinations of message size and number-of-blocks classes. The cases of a big message quantized into a big number of blocks ($f$ is small compared to $x$) and of a big message quantized into a small number of blocks are covered by the approximation in Eq. (6). (The latter case is likely to result in transmission failures.) The case of a small message ($x$ is of the same order of magnitude as the mean ON duration) quantized into a small number of blocks is covered by our approximation in Eq. (13).

Neither of these two approximations applies very well to the remaining, last case of small messages quantized into a large number of blocks. Still, the transmission time in this case may be roughly estimated from above by the transmission time in the case of a small message of the same size (headers included) quantized into a small number of blocks. This upper bound is given by Eq. (13) in this paper. On the other hand, we derived in Section 5.2 the recursive lower bound formulas (14) and (15) (for the uniform and exponential distributions, respectively).
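The case analysis above can be summarized as a small selection helper. This is only a sketch: the paper classifies sizes loosely, so the numeric thresholds `big_factor`, `tiny_factor` and `many_blocks` below are hypothetical choices made for illustration, and the function names are not from the paper.

```python
import math


def classify_message(x: float, mean_on: float = 1.0,
                     big_factor: float = 10.0, tiny_factor: float = 0.1) -> str:
    """Loose size class of a message of length x (same time unit as mean_on)."""
    ratio = x / mean_on
    if ratio >= big_factor:      # hypothetical threshold for "big"
        return "big"
    if ratio <= tiny_factor:     # hypothetical threshold for "tiny"
        return "tiny"
    return "small"


def block_count(x: float, f: float) -> int:
    """Number of blocks of size f into which the message is quantized."""
    return math.ceil(x / f)


def suggest_estimate(x: float, f: float, mean_on: float = 1.0,
                     many_blocks: int = 10) -> str:
    """Name the estimate that the discussion assigns to this (x, f) combination."""
    size = classify_message(x, mean_on)
    if size == "tiny":
        return "T_n(x) ~ n*T(x), no fragmentation"
    if size == "big":
        return "Eq. (6)"
    if block_count(x, f) < many_blocks:   # hypothetical cut-off for "many blocks"
        return "Eq. (13)"
    return "Eq. (13) as upper estimate; Eqs. (14)/(15) as lower bounds"
```

For example, with a mean ON duration of 1 s, `suggest_estimate(1.0, 0.5)` selects Eq. (13), while `suggest_estimate(1.0, 0.01)` falls into the last case, where only bounds are available.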

For reference, we summarize methods for estimating the mean transmission time over a chain of disrupted links in Table 3. The leftmost column in that summary table contains, when appropriate, the number of the section in which the corresponding expression has been introduced and the equation number.

When using the table, it is good to remember our discussion in Section 3 about time units. If the mean contact time is something other than one second, then that value can be used as the unit of time in all formulas. But then it is important to notice that the message length and the fragmentation unit length must also be measured in that unit of time.
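The change of time unit can be sketched as two small conversion helpers. The names `to_contact_units` and `from_contact_units` are hypothetical, and the numbers in the usage lines are illustrative only; lengths are expressed as transmission times, as elsewhere in the paper.

```python
def to_contact_units(x_s: float, f_s: float, mean_contact_s: float) -> tuple[float, float]:
    """Express message length and fragment length in units of the mean contact time."""
    if mean_contact_s <= 0:
        raise ValueError("mean contact time must be positive")
    return x_s / mean_contact_s, f_s / mean_contact_s


def from_contact_units(t_units: float, mean_contact_s: float) -> float:
    """Convert a transmission time from mean-contact-time units back to seconds."""
    return t_units * mean_contact_s


# Illustrative: mean contact time 4 s, message of 8 s, fragments of 2 s.
x_units, f_units = to_contact_units(8.0, 2.0, 4.0)
seconds = from_contact_units(3.5, 4.0)
```

The formulas are then applied to `x_units` and `f_units`, and the resulting estimate is converted back with `from_contact_units`.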

7. Conclusion and future work

We have shown how to estimate mean transmission times of a fragmented message over a chain of disrupted links, where the link disruptions are identically and independently distributed. We hope that this will be a stepping stone for more complex models that capture even more aspects of DTNs.

But already from our simple model we can learn something about fragmented message transmission in more complex scenarios. A natural unit for fragmentation questions is the mean contact duration: a message is "small" or "large" depending on whether or not it can be transmitted in a typical contact. Making the fragmentation unit size $f$ bigger than the typical contact time should be avoided in practice, because it leads to transmission bottlenecks. Our simulations and calculations confirm that the dynamics of fragmented message transmission for small and large messages are substantially different. This is why small and large messages seem to need different ways to estimate their transmission time over a chain of disrupted links.

We have derived our approximation formulas for a general class of disruptions and validated them by simulating the cases of uniform and exponential distributions. From those simulations we conclude that our approximation is suitable for large message sizes that are at least a few times bigger than what can typically be transmitted within a single contact time; its accuracy increases with the message size and decreases with the distance from the source node. We also created alternative approximation formulas for the cases where the message is small and contains only a few fragments.

Table 3. Summary of results on mean transmission time for a chain of disrupted links.

Future work topics on the multi-link chain scenario include estimating the variance of the mean transmission time, verifying our approximation with additional distributions of contact and inter-contact times, and analyzing the case where the statistics of those times for different links are not the same. Based upon those, our target is (as mentioned in the introduction) investigating fragmented message transmission over multiple paths in mobile opportunistic networks.


Acknowledgment

This work was supported by TEKES as part of the Future Internet program of TIVIT (Finnish Strategic Centre for Science, Technology and Innovation in the field of ICT), and by the EU SCAMPI project IST-2001-32404.

Appendix A. Justification for the inequality (7)

Lemma. Let us use the notations of Sections 2 and 3. Suppose $T(x)$ is a linear increasing function and $Q_{n-1}(t)$ is a concave function. Then the inequality (7) holds, i.e.

\[
T_n(x) \le T_{n-1}(x) + T\bigl(Q_{n-1}(T_{n-1}(x))\bigr).
\]


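Table A.4. Contribution to the derivative of $Q_k(t)$ by $q_{k-1}(t_0)$ and $q_k(t_0)$ combinations.

| Case # | $q_{k-1}(t_0)$ | $q_k(t_0)$ | Contribution to $Q'_k(t)$ |
|--------|----------------|------------|---------------------------|
| 1.     | >0             | >0         | 0                         |
| 2.     | 0              | 0          | 0                         |
| 3.     | 0              | >0         | <0                        |
| 4.     | >0             | 0          | >0                        |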


Proof of the lemma. Let us set $k = n$ in Eq. (4) and replace the queue size $q_{n-1}(x, t_{n-1}(x))$ at the time $t_{n-1}(x)$ when the transmission of $x$ over the penultimate, $(n-1)$st, link ends, with the potentially larger size of the saturated-input queue $q_{n-1}(t_{n-1}(x))$ at that same time. This creates the inequality

\[
t_n(x) \le t_{n-1}(x) + t\bigl(q_{n-1}(t_{n-1}(x))\bigr).
\]

The total transmission time $t_n(x)$ of the message $x$ over $n$ links on the left hand side of the inequality is conditioned on three different random variables: $t_{n-1}(x)$, $q_{n-1}(t_{n-1}(x))$, and $t(q_{n-1}(t_{n-1}(x)))$; and we write these dependencies into the left hand side:

\[
t_n\bigl(x \mid t_{n-1}(x),\; q_{n-1}(t_{n-1}(x)),\; t(q_{n-1}(t_{n-1}(x)))\bigr) \le t_{n-1}(x) + t\bigl(q_{n-1}(t_{n-1}(x))\bigr).
\]

To get rid of the conditioning we will apply the expectation operation to the inequality three times in a row. In step 1 we will average over the upper bound on the random transmission time through the last link, $t(q_{n-1}(t_{n-1}(x)))$; the values of $t_{n-1}(x)$ and $q_{n-1}(t_{n-1}(x))$ are considered fixed (constant) in that step. In step 2 we will average the resulting inequality over the upper bound $q_{n-1}(t_{n-1}(x))$ on the random size of the data remaining in the $(n-1)$th node at the time $t_{n-1}(x)$; the value of $t_{n-1}(x)$ is considered fixed in that step. In the final step 3 we will average over the random transmission time $t_{n-1}(x)$ of the message $x$ over the first $(n-1)$ links. These steps, which we will now take, result in inequality (7).

Step 1: Taking the expected value of the conditioned total transmission time does not affect the fixed $t_{n-1}(x)$ and $q_{n-1}(t_{n-1}(x))$ on the right hand side of the above inequality:

\[
T_n\bigl(x \mid t_{n-1}(x),\; q_{n-1}(t_{n-1}(x))\bigr) \le t_{n-1}(x) + T\bigl(q_{n-1}(t_{n-1}(x))\bigr).
\]

Step 2: To remove the dependency on $q_{n-1}(t_{n-1}(x))$ we take the expected value of both sides of the last inequality:

\[
E\bigl[T_n(x \mid t_{n-1}(x),\; q_{n-1}(t_{n-1}(x)))\bigr] \le E\bigl[t_{n-1}(x) + T(q_{n-1}(t_{n-1}(x)))\bigr].
\]

We can move the expectation operator inside the second term on the right hand side due to our premise that $T(x)$ is a linear function:

\[
\begin{aligned}
T_n(x \mid t_{n-1}(x)) &\le t_{n-1}(x) + E\bigl[T(q_{n-1}(t_{n-1}(x)))\bigr] \\
&= t_{n-1}(x) + T\bigl(E[q_{n-1}(t_{n-1}(x))]\bigr) \\
&= t_{n-1}(x) + T\bigl(Q_{n-1}(t_{n-1}(x))\bigr).
\end{aligned}
\]

Step 3: The mean transmission time on the left is still conditioned on the random quantity $t_{n-1}(x)$. We therefore apply the expectation operator one more time to both sides, and again move the expectation operation from the outside of the function $T(\cdot)$ into that function's argument, while keeping the inequality sign between the left and the right hand sides. This is justified by our assumption that the transmission time $T(x)$ over a single link is a linear, increasing function:

\[
\begin{aligned}
T_n(x) = E\bigl[T_n(x \mid t_{n-1}(x))\bigr] &\le T_{n-1}(x) + E\bigl[T(Q_{n-1}(t_{n-1}(x)))\bigr] \\
&= T_{n-1}(x) + T\bigl(E[Q_{n-1}(t_{n-1}(x))]\bigr).
\end{aligned}
\]

We will next move the expectation operation from the outside of the function $Q_{n-1}(\cdot)$ into that function's argument. We have assumed that $Q_{n-1}(\cdot)$ is a concave function. This property and Jensen's inequality imply that

\[
E\bigl[Q_{n-1}(t_{n-1}(x))\bigr] \le Q_{n-1}\bigl(E[t_{n-1}(x)]\bigr) = Q_{n-1}(T_{n-1}(x)).
\]

Now the above inequality and our assumption that $T(x)$ is an increasing function imply, in turn, that

\[
T_n(x) \le T_{n-1}(x) + T\bigl(E[Q_{n-1}(t_{n-1}(x))]\bigr) \le T_{n-1}(x) + T\bigl(Q_{n-1}(T_{n-1}(x))\bigr),
\]

which is the claimed inequality (7). This completes the proof of the lemma. $\square$
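The Jensen step of the proof can be illustrated numerically. The sketch below is not taken from the paper: it uses `math.sqrt` as a hypothetical stand-in for the concave function $Q_{n-1}(\cdot)$ and exponentially distributed samples as a stand-in for the random time $t_{n-1}(x)$.

```python
import math
import random
from typing import Callable, Sequence


def jensen_gap(q: Callable[[float], float], samples: Sequence[float]) -> float:
    """Return q(mean of samples) minus mean of q(samples).

    For a concave q this difference is non-negative (Jensen's inequality).
    """
    mean_t = sum(samples) / len(samples)
    mean_q = sum(q(t) for t in samples) / len(samples)
    return q(mean_t) - mean_q


random.seed(0)  # fixed seed so that the run is reproducible
samples = [random.expovariate(1.0) for _ in range(10_000)]
gap = jensen_gap(math.sqrt, samples)  # hypothetical concave stand-in for Q_{n-1}
```

A non-negative `gap` corresponds to $E[Q_{n-1}(t_{n-1}(x))] \le Q_{n-1}(E[t_{n-1}(x)])$; for a linear function the gap would vanish up to rounding, which is why linearity of $T(\cdot)$ permits moving the expectation inside with equality.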


Justifications for assumptions of the lemma.

We have already given a justification for the assumption that $T(x)$ is a linear increasing function. What is still needed is a justification for the other assumption of the lemma.

More specifically, we will now explain why $Q_{n-1}(t)$ is typically concave after some initial time period. Our argument is based on the homogeneity of the chain of links.

Let us begin by showing that $Q_{k-1}(t) > Q_k(t)$ for all $t$ and all $k$. This follows from the fact that there is a one-to-one mapping between the ON–OFF patterns in the first $k$ links and those in the $k$ links starting from the second link. If we denote by $q_{k-1}(t)$ the (final) queue size in the first case and by $\tilde{q}_{k-1}(t)$ the (final) queue size in the second case, then the only difference between these two values stems from the size of the queue in the source node: in the first case the source node always has a queue (remember that we assume here that the message size $x$ is bigger than any $t$ we are interested in), while in the second case there is a chance that the queue in the source node is empty. Now it follows that $\tilde{q}_{k-1}(t) \le q_{k-1}(t)$. We also know that for certain ON–OFF patterns the source node in the second case indeed sometimes has an empty queue, and the difference carries over to the final node. This implies that there are cases where the strict inequality $\tilde{q}_{k-1}(t) < q_{k-1}(t)$ holds, and it follows that

\[
E[\tilde{q}_{k-1}(t)] < E[q_{k-1}(t)] = Q_{k-1}(t).
\]

On the other hand, $E[\tilde{q}_{k-1}(t)] = Q_k(t)$ and our first claim follows.

Let us denote by $u_k(t)$ the probability that the $k$th queue is empty at time $t$:

\[
u_k(t) = \Pr(q_k(t) = 0).
\]

With the same argumentation as above we can show that $u_{k-1}(t) < u_k(t)$ (for all $t$ and all $k$). Indeed, in the setting above, $q_{k-1}(t) = 0$ implies $\tilde{q}_{k-1}(t) = 0$ but not always vice versa.

The next step is to show that $Q_k(t)$ is an increasing function of $t$ (for all $k$).

If $t_2 > t_1$ then the distributions of ON–OFF patterns are identical during the two time intervals $[0, t_1]$ and $[t_2 - t_1, t_2]$. Hence, there is a one-to-one mapping from each individual ON–OFF pattern leading to $q_k(t_1)$ to another ON–OFF pattern that leads to $\tilde{q}_k(t_2)$, where the starting point in time is $t_2 - t_1$ (instead of 0) in the latter case. Now all differences between $q_k(t_1)$ and $\tilde{q}_k(t_2)$ stem from the fact that the former case begins with empty queues in every node (except the source node), while in the latter case some of the queues may be non-empty at the time point $t_2 - t_1$. Because making some queue longer cannot make any other queue shorter in the future, it follows that $q_k(t_1) \le \tilde{q}_k(t_2)$. We also know that there must be cases where the strict inequality holds. Therefore, we also have a strict inequality for the expected values: $Q_k(t_1) < Q_k(t_2)$.

Again, as a counterpart to the previous result, we can show that $u_k(t)$ is decreasing (as a function of $t$). This follows from the fact that, in the setting of the previous paragraph, $\tilde{q}_k(t_2) = 0$ implies $q_k(t_1) = 0$ but not always vice versa.

Let us take yet another point of view on the increasing nature of $Q_k(t)$. This can be studied by looking at individual $q_k(t)$ cases at an arbitrary time point $t = t_0$. (Cf. Table A.4.)


(1) The first case (which is also the most common case) is such that $q_k(t_0) > 0$, $q_{k-1}(t_0) > 0$ (and we assume there is no change in the state of either the preceding or the succeeding link at exactly time $t_0$). We see that $q_k(t)$ is increasing if the joint state is ON/OFF and decreasing if the joint state is OFF/ON. In addition, if the joint state is either ON/ON or OFF/OFF, then $q_k(t)$ is constant. Because all of these joint states are equally probable, these cases cancel each other out when considering $Q_k(t)$, and $Q_k(t)$ would be constant at $t_0$ if only these cases were counted in the expected value.

(2) If $q_k(t_0) = 0 = q_{k-1}(t_0)$, then the queue can neither increase nor decrease; hence these cases contribute a constant portion to $Q_k(t)$.

(3) Next we consider the cases where $q_k(t_0) > 0$ but $q_{k-1}(t_0) = 0$. Now it is clear that $q_k(t)$ is either decreasing or constant. This depends on whether the succeeding link is ON or OFF. Both cases are equally probable and these cases altogether contribute a decreasing portion to $Q_k(t)$.

(4) Finally we consider the cases where $q_k(t_0) = 0$ but $q_{k-1}(t_0) > 0$. Now it is clear that $q_k(t)$ is either increasing or constant. This depends on whether the preceding link is ON or OFF. Both cases are equally probable and these cases altogether contribute an increasing portion to $Q_k(t)$. (Note that we assume here that a queue in a node is not considered empty if both the preceding link is ON and the queue in the preceding node is not empty, i.e. if there is a flow of input into the node, then the queue consists of at least one bit even if the next link is also ON.)

Because we showed above that $u_{k-1}(t) < u_k(t)$, it follows that the increasing portion is bigger than the decreasing portion and, in total, $Q_k(t)$ is increasing (as a function of $t$).

We also see that the rate of increase (i.e. the derivative of $Q_k(t)$) is proportional to the difference $u_k(t) - u_{k-1}(t)$. Hence, what still needs to be shown is that this difference is getting smaller when $t$ grows (in addition to the value $u_k(t)$ itself being decreasing).

For the case $k = 1$ the difference $u_k(t) - u_{k-1}(t)$ is indeed decreasing, because $u_0(t) = 0$ (as we assume that the source node never becomes empty). Now we have shown that $Q_1(t)$ is concave.

On the other hand, the difference between the growth rate of $Q_2(t)$ and that of $Q_1(t)$ stems from the cases where $q_1(t) = 0$. In other words, the derivative of $Q_2(t)$ is proportional to the derivative of $Q_1(t)$ multiplied by $1 - u_1(t)$. But we saw above that the derivative of $Q_1(t)$ is proportional to $u_1(t)$, hence the derivative of $Q_2(t)$ is proportional to $(1 - u_1(t))\,u_1(t) = u_1(t) - u_1(t)^2$. Because $u_1(t)$ is a decreasing function, its derivative is negative. Now the second derivative of $Q_2(t)$ is proportional to the derivative of $u_1(t) - u_1(t)^2$, which is equal to $u'_1(t) - 2u_1(t)\,u'_1(t) = u'_1(t)\,(1 - 2u_1(t))$. This value is negative whenever $u_1(t) < 1/2$. For typical ON and OFF epoch distributions that may be useful in practice this would be the case after some initial time interval, and therefore our argument shows that for all these distributions $Q_2(t)$ is concave after the initial time period during which $u_1(t) \ge 1/2$, i.e. during which it is at least as probable that the queue in the first link is empty as that it is not empty.
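To make the sign argument concrete, consider a purely hypothetical emptiness probability $u_1(t) = e^{-t}$, which starts at 1 and decreases, as the argument requires. This choice is an assumption for illustration only and is not derived from the model.

```latex
\[
\frac{d}{dt}Q_2(t) \propto u_1(t) - u_1(t)^2 = e^{-t} - e^{-2t},
\qquad
\frac{d^2}{dt^2}Q_2(t) \propto u'_1(t)\,\bigl(1 - 2u_1(t)\bigr) = -e^{-t}\bigl(1 - 2e^{-t}\bigr).
\]
```

The second expression is negative exactly when $e^{-t} < 1/2$, i.e. for $t > \ln 2 \approx 0.69$, so with this stand-in $Q_2(t)$ would be concave after an initial period of about 0.69 time units.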

Similarly, we can derive that the derivative of $Q_k(t)$ is proportional to the derivative of $Q_1(t)$ multiplied by $1 - u_{k-1}(t)$, and it follows that the derivative of $Q_k(t)$ is proportional to $(1 - u_{k-1}(t))\,u_1(t)$. Then the second derivative of $Q_k(t)$ is proportional to

\[
-u'_{k-1}(t)\,u_1(t) + (1 - u_{k-1}(t))\,u'_1(t) = u'_1(t) - u'_{k-1}(t)\,u_1(t) - u_{k-1}(t)\,u'_1(t).
\]

Again, for typical ON/OFF distributions that may be useful in practice, after some initial period both $u_1(t)$ and $u_{k-1}(t)$ become small and the first term, $u'_1(t)$, which is negative, becomes dominant. Therefore, $Q_k(t)$ is also concave after some initial time period.

References

[1] K. Fall, A delay-tolerant network architecture for challenged internets, in: SIGCOMM ’03, ACM, New York, NY, USA, 2003, pp. 27–34.

[2] K. Scott, S. Burleigh, Bundle Protocol Specification, RFC 5050, 2007.

[3] M. Pitkänen, A. Keränen, J. Ott, Message fragmentation in opportunistic DTNs, in: Proceedings of the Second WoWMoM Workshop on Autonomic and Opportunistic Communications (AOC), IEEE, 2008.

[4] J. McCann, S. Deering, J. Mogul, Path MTU discovery for IP version 6, RFC 1981, 1996.

[5] P.R. Jelenkovic, J. Tan, Dynamic packet fragmentation for wireless channels with failures, in: Proceedings of MobiHoc’08, ACM, Hong Kong, China, 2008, pp. 73–82.

[6] J. Nair, M. Andreasson, L. Andrew, S. Low, J. Doyle, File fragmentation over an unreliable channel, in: Proceedings of INFOCOM 2010, IEEE, San Diego, CA, USA, 2010.

[7] P. Ginzboorg, V. Niemi, J. Ott, Message fragmentation for disrupted links, in: Proceedings of European Wireless 2011, VDE VERLAG GMBH, Vienna, Austria, 2011, pp. 415–424.

[8] P. Ginzboorg, V. Niemi, J. Ott, Message fragmentation algorithms for disrupted links, Comput. Commun. 36 (2013) 279–290, http://dx.doi.org/10.1016/j.comcom.2012.10.001.

[9] L. Kleinrock, Queueing Systems, Theory, vol. I, Wiley Interscience, 1975.

[10] N. Prabhu, Stochastic Storage Processes: Queues, Insurance Risk, Dams, and Data Communication, vol. 15, Springer Verlag, 1998.

[11] A. Rényi, Foundations of Probability, Dover, 1998.
