Retargeting LT Codes using XORs at the Relay
Shirish Karande
Algorithms and Optimization Group, Systems Research Lab,
Tata Research Development and Design Center
Hadapsar, Pune – 411013, India
Abstract— We consider a network where multiple sources
communicate via a single relay to sinks with non-uniform and unequal demands. The LT distributions employed at the sources
may be ill-matched to the demands of the sinks connected to the relay. We consider a probabilistic "morphing" of two or more
fountain-encoded streams into a single stream better suited for the demand patterns downstream of the relay. The relay observes
symbols generated from two distinct fountain codes, and can decide to forward a symbol from one source, or the other source,
or the XOR of the two symbols. We propose a linear
programming based design of Generalized LT codes, which with appropriate substitutions is utilized to design the probabilities for
the relay. Simulation results show that the designs obtained on the basis of the proposed optimization problem, when compared with multiplexing or simple mixing, reduce the number of
symbols that have to be downloaded to guarantee a desired probability of successful decoding.
I. INTRODUCTION
Consider a network, such as the one shown in Figure 1, where the sinks asynchronously download multiple files from distinct sources via a common relay. It has been shown (e.g. [1]) that, in a rateless download of multiple files, network coding [2] can provide delay and throughput gains. It can be verified that, in the Y-network shown in Figure 1, random network coding (RNC) at the relay reduces the delay in completely downloading the source files. However, the decoding complexity associated with RNC is, in general, high. Hence, one seeks low-complexity alternatives that can facilitate the download of large files.
LT codes [3] are low-complexity, capacity-approaching rateless codes ([4]-[5]), which have found utility for asynchronous broadcast (e.g. [6]) over channels with heterogeneous and unknown behavior. LT encoding lends itself to distributed implementation, and hence can be used to improve communication efficiency, e.g. by enabling parallel downloads from multiple mirror sites. Nevertheless, in networks with more than one hop, LT coding cannot be easily combined with other modern information-theoretic techniques such as network coding [2]. The decoding efficiency and time complexity of an LT code are sensitive to the degree distribution, and hence an intermediate relay cannot naively employ an operation which significantly perturbs the effective distribution.
Motivated by the above observations, Puducheri et al. [7] investigated the problem of constructing a Distributed LT (DLT) code. Specifically, they consider a network where
multiple sources communicate to the sinks via a single relay. Traditionally, the operation at the relay would be limited to forwarding the LT encoded symbols received from the sources. However, selective XOR-ing at the relay can create a statistical effect of constructing an LT code on the concatenation of all source files. Such a DLT code has been shown to reduce the total download time [7].
The design methods proposed in [7] have been tailored for the Robust Soliton Distribution (RSD) [3] and are limited to the case where all sources have equal priority and all the sinks have an equal demand. In this work we consider a significantly more general scenario. We highlight that, while the degree distributions suggested by Luby [3] were optimized for complete recovery of the message word, in recent years multimedia and sensor network applications have motivated several researchers to design LT distributions that enable partial and prioritized data recovery (e.g. [8]-[16]).
In the network analyzed in this work, we assume that the LT degree distributions at the sources are optimized for the non-uniform¹ demands of the “local” sinks. Nevertheless, these sources are further subscribed to by sinks connected to a common relay. The demands of the sinks connected to the relay can be non-uniform as well as unequal². Furthermore, the non-uniform demands of the local sinks and those connected to the relay may differ, thus creating a mismatch. We seek to design a selective mixing strategy at the relay which has the effect of retargeting the effective degree distribution.
¹ Non-uniform demand implies that different sets of sinks are interested in downloading distinct fractions of the source file. The downloaded fraction may determine the quality of media. Consequently, the non-uniformity can be seen as emerging from a rate-distortion tradeoff.
Figure 1 Network consisting of two sources [1,2] and a common relay. Each source employs an LT encoded fountain to meet the local non-uniform demands (e.g. half of the local sinks connected to source 1 desire to recover only 60% of the file, while the other half desires to recover 80%). An additional set of sinks with unequal and non-uniform demands subscribes to the sources via the common relay.
2012 IEEE Information Theory Workshop
978-1-4673-0223-4/12/$31.00 ©2012 IEEE 597
The output of the relay can be modeled by a Generalized LT (GLT) distribution, parameterized by the mixing probabilities at the relay. GLT codes use vector degrees to explicitly indicate the sampling of input symbols from distinct data segments. The use of ripple based analysis of GLT codes [14] is the key to our design strategy. We propose a linear program (LP) based design of GLT codes for non-uniform and unequal demands. With suitable substitutions, the LP can be extended to design the mixing probabilities for the relay.
Talari et al. [16] have also considered the design of DLT codes capable of providing prioritized recovery. They employ AND-OR tree analysis for the design. Nevertheless, they require a limiting relaxation: the probability of combining two encoded symbols at the relay must be independent of the degrees of the received encoding symbols. We show that such a relaxation affects the efficiency of combining LT codes.
The remainder of the paper is organized as follows. Section II provides a primer on GLT codes and establishes the network model. Section III describes an LP-based design of GLT codes. Section IV describes the design of mixing probabilities. Section V reports the numerical results.
II. PRELIMINARIES
A. Generalized LT Codes
The encoding/decoding of GLT codes is similar to that of
traditional LT codes. Consequently, we only provide a brief
description here, and refer the reader to [3], [14] for details.
Suppose we want to transmit a message comprising $k$ input symbols, partitioned into $m$ segments such that, $\forall i \in [1, m]$, segment-$i$ contains $k_i = \alpha_i k$ symbols. Then the GLT code is defined by a generalized degree polynomial
$$\beta(x_1, \ldots, x_m) = \sum_{d_1, \ldots, d_m} \beta_{(d_1, \ldots, d_m)} \, x_1^{d_1} \cdots x_m^{d_m} \tag{1}$$
where $\beta_{(d_1, \ldots, d_m)}$ represents the probability of choosing a vector degree $d = [d_1, \ldots, d_m]$. The encoding symbol is generated by XOR-ing $d_i$ randomly chosen input symbols from segment-$i$.
Prior to decoding, a receiver downloads a sufficient number of encoding symbols, denoted here by $\gamma k$, to permit the recovery of $z_i k_i$ input symbols from each segment-$i$. The collection of all encoding symbols with a single undecoded neighbor is said to form a ripple. Since each symbol in the ripple is connected to a single input symbol, the ripple can be partitioned into multiple colored sub-ripples by associating a color $i$ with each data segment-$i$. Similar to traditional LT decoding, in each step, the decoder utilizes an encoding symbol from any of the colored ripples to decode an input symbol, thus making other input symbols decodable. The decoding stops either when the desired fractions are recovered or when all the colored ripples become empty.
² The term unequal demand describes a scenario where a sink desires to download distinct proportions of different source files.
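The encoding and peeling steps described above can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: the function names `glt_encode_symbol` and `glt_peel`, the representation of a symbol as a pair (neighbor set, XOR value), and the use of small integers as input symbols are all assumptions made for the example. The decoder here stops only when every colored ripple is empty; stopping once the desired fractions are recovered would be a small extension.

```python
import random


def glt_encode_symbol(segments, beta, rng):
    """Draw one GLT encoding symbol.

    segments: list of lists of ints, one list of input symbols per segment.
    beta: dict mapping a vector degree (d_1, ..., d_m) to its probability.
    Returns (neighbors, value), where neighbors is a set of (segment, index).
    """
    degrees = list(beta)
    weights = [beta[d] for d in degrees]
    degree = rng.choices(degrees, weights=weights, k=1)[0]
    neighbors, value = set(), 0
    for seg, d_i in enumerate(degree):
        # XOR d_i distinct, uniformly chosen input symbols of segment `seg`.
        for idx in rng.sample(range(len(segments[seg])), d_i):
            neighbors.add((seg, idx))
            value ^= segments[seg][idx]
    return neighbors, value


def glt_peel(symbols):
    """Peel until every colored ripple is empty; return {(segment, index): value}."""
    pending = [[set(nbrs), val] for nbrs, val in symbols]
    decoded = {}
    while True:
        # Ripple: encoding symbols with exactly one undecoded neighbor.
        # The color of a ripple symbol is the segment of that neighbor.
        ripple = [p for p in pending if len(p[0]) == 1]
        if not ripple:
            return decoded
        (key,) = ripple[0][0]
        decoded[key] = ripple[0][1]
        # Remove the decoded input symbol from all symbols that contain it.
        for p in pending:
            if key in p[0]:
                p[0].discard(key)
                p[1] ^= decoded[key]
```

The quadratic scan for the ripple keeps the sketch short; a practical decoder would maintain the colored ripples incrementally.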
B. Network Model
We employ a network model similar to the one used in [7].
As shown in Figure 2, we consider a network where two
disjoint sources communicate to the sinks via a common relay.
We assume that the communication takes place in terms of
epochs. Each epoch consists of an LT encoded transmission
from each source to the relay A, followed by a single broadcast
from the relay. The coding at the sources is determined by the LT distributions
$$\beta_1(x_1) = \sum_{d_1=1}^{k_1} \beta_{1,d_1} \, x_1^{d_1}, \qquad \beta_2(x_2) = \sum_{d_2=1}^{k_2} \beta_{2,d_2} \, x_2^{d_2}.$$
Meanwhile, the relay is assumed to have storage capacity of
one symbol per source and processing capability that is limited
to simple XORs. At each epoch the relay receives a symbol
from the individual sources. The relay probabilistically chooses
to either forward one of the received symbols or to transmit an
XOR-ed combination of the received symbols. Upon
transmission, the relay discards the received symbols. The
probabilistic decisions taken by the relay can be described by
“mixing rules” $\Lambda = \{\Lambda^{(1)}_{(d_1,d_2)}, \Lambda^{(2)}_{(d_1,d_2)}, \Lambda^{(1,2)}_{(d_1,d_2)}\}$, where each rule represents a conditional probability distribution:
Mixing Rule $\Lambda$: Given degrees $d_1$, $d_2$ of the encoding symbols transmitted by sources 1 and 2, respectively,
$$\begin{aligned}
\Lambda^{(1)}_{(d_1,d_2)} &= \text{probability of forwarding the symbol from source 1} \\
\Lambda^{(2)}_{(d_1,d_2)} &= \text{probability of forwarding the symbol from source 2} \\
\Lambda^{(1,2)}_{(d_1,d_2)} &= \text{probability of transmitting an XOR}
\end{aligned}$$
where
$$1 \ge \Lambda^{(1)}_{(d_1,d_2)}, \Lambda^{(1,2)}_{(d_1,d_2)}, \Lambda^{(2)}_{(d_1,d_2)} \ge 0 \quad \text{and} \quad \Lambda^{(1)}_{(d_1,d_2)} + \Lambda^{(1,2)}_{(d_1,d_2)} + \Lambda^{(2)}_{(d_1,d_2)} = 1 \tag{2}$$
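[Figure: sources 1 and 2 hold input symbols $1, \ldots, k_1$ and $k_1 + 1, \ldots, k$ (with $k = k_1 + k_2$) and feed relay A, which broadcasts to the sinks T.]
Figure 2 Network consisting of two sources [1,2] and a relay A. Network operation is governed by $\beta_1(x_1)$, $\beta_2(x_2)$ and $\Lambda_{(d_1,d_2)}$.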
We say that the relay employs a multiplexing policy if $\Lambda^{(1)}_{(d_1,d_2)} = \Lambda^{(2)}_{(d_1,d_2)} = 0.5$ and $\Lambda^{(1,2)}_{(d_1,d_2)} = 0$. The analysis in [16] requires that the mixing probabilities be independent of the degrees of the encoded symbols received at the relay, i.e. $\Lambda^{(1)}_{(d_1,d_2)} = p_1$, $\Lambda^{(2)}_{(d_1,d_2)} = p_2$ and $\Lambda^{(1,2)}_{(d_1,d_2)} = p_3 = 1 - p_1 - p_2$. We refer to such a constrained strategy as “simple mixing”.
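One relay epoch under a given set of mixing rules can be sketched as below. This is an illustrative sketch under stated assumptions: the relay is assumed to know the degree of each received symbol, symbols are modeled as (degree, value) pairs with integer values, and the names `relay_step`, `multiplexing_rules` and `simple_mixing_rules` are hypothetical helpers introduced only for this example.

```python
import random


def multiplexing_rules(degrees1, degrees2):
    """Forward from source 1 or source 2 with probability 0.5 each; never XOR."""
    return {(d1, d2): (0.5, 0.5, 0.0) for d1 in degrees1 for d2 in degrees2}


def simple_mixing_rules(degrees1, degrees2, p1, p2):
    """Degree-independent rules (p1, p2, p3) with p3 = 1 - p1 - p2."""
    return {(d1, d2): (p1, p2, 1.0 - p1 - p2)
            for d1 in degrees1 for d2 in degrees2}


def relay_step(sym1, sym2, rules, rng):
    """One epoch at the relay.

    sym1, sym2: (degree, value) pairs received from sources 1 and 2.
    rules: dict mapping (d1, d2) to (Lambda1, Lambda2, Lambda12).
    Returns (label, value) with label in {"1", "2", "1,2"}.
    """
    (d1, v1), (d2, v2) = sym1, sym2
    lam1, lam2, lam12 = rules[(d1, d2)]
    # Each rule must be a probability distribution, as required by Eq. (2).
    if min(lam1, lam2, lam12) < -1e-9 or abs(lam1 + lam2 + lam12 - 1.0) > 1e-9:
        raise ValueError("mixing rule is not a probability distribution")
    u = rng.random()
    if u < lam1:
        return "1", v1           # forward the symbol from source 1
    if u < lam1 + lam2:
        return "2", v2           # forward the symbol from source 2
    return "1,2", v1 ^ v2        # transmit the XOR of both symbols
```

Degree-dependent rules use the same `relay_step`; only the dictionary passed as `rules` changes.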
Furthermore, we assume that the data broadcast by the relay is subscribed to by $m$ sets of sinks $T_1, \ldots, T_m$. Each set may contain an arbitrarily large number of sinks. The set $T_j$ has a demand described by the pair $(z_{1,j}, z_{2,j})$, which implies that a sink in set $T_j$ desires to recover $z_{i,j} k_i$ message symbols from source $i$. Let $\gamma_j$ be a parameter associated with each sink set such that, when a sink receives $\gamma_j k$ symbols from the relay, it is able to recover the desired fraction of message symbols. The design problem explored in the paper can thus be stated as:
Problem 1 (P1):
$$\Lambda^{*} = \arg\min_{\Lambda} \max_{j} \frac{\gamma_j k}{z_{1,j} k_1 + z_{2,j} k_2} \quad \text{given } \{\beta_1(x_1), \beta_2(x_2)\}$$
P1 states that, for the given source degree distributions, we seek the mixing probabilities that minimize the worst-case ratio of the number of encoding symbols received by a sink to the number of message symbols recovered.
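For a fixed set of rules, the objective of P1 can be evaluated as sketched below. The function name `worst_case_ratio` is hypothetical, $k = k_1 + k_2$ is assumed (as indicated in Figure 2), and the $\gamma_j$ values in any call are placeholders: in practice they would come from simulation or from the ripple analysis, not from this function.

```python
def worst_case_ratio(gammas, demands, k1, k2):
    """Worst ratio of received encoding symbols to recovered message symbols.

    gammas: list of gamma_j, one per sink set.
    demands: list of (z_1j, z_2j) pairs, one per sink set.
    """
    k = k1 + k2  # assumption: total message length is k1 + k2
    return max(g * k / (z1 * k1 + z2 * k2)
               for g, (z1, z2) in zip(gammas, demands))
```

A ratio of 1 would mean that no overhead is incurred; P1 asks for the rules that bring the largest ratio over all sink sets as close to 1 as possible.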
III. DESIGN OF GENERALIZED LT CODES
We design GLT codes using the following fluid approximations [14] for the size of the colored ripples.
$$r_1(t_1, t_2) = (1 - t_1)\left[\frac{\gamma}{\alpha_1}\,\beta^{(1,0)}(t_1, t_2) + \log(1 - t_1)\right] \tag{3}$$
$$r_2(t_1, t_2) = (1 - t_2)\left[\frac{\gamma}{\alpha_2}\,\beta^{(0,1)}(t_1, t_2) + \log(1 - t_2)\right] \tag{4}$$
In the above equations, $k_i r_i(t_1, t_2)$ gives the size of the $i$th ripple when $k_i t_i$ input symbols have been decoded from the $i$th source.
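The fluid approximations in Eq. (3) and (4) can be evaluated numerically as sketched below. The sketch assumes that $\beta^{(1,0)}$ and $\beta^{(0,1)}$ denote the partial derivatives of $\beta(x_1, x_2)$ with respect to $x_1$ and $x_2$, and that $\log$ is the natural logarithm; the dictionary representation of $\beta$ and the function names are choices made for this example.

```python
import math


def d_dx1(beta, x1, x2):
    """Partial derivative of beta(x1, x2) with respect to x1."""
    return sum(p * d1 * x1 ** (d1 - 1) * x2 ** d2
               for (d1, d2), p in beta.items() if d1 > 0)


def d_dx2(beta, x1, x2):
    """Partial derivative of beta(x1, x2) with respect to x2."""
    return sum(p * d2 * x1 ** d1 * x2 ** (d2 - 1)
               for (d1, d2), p in beta.items() if d2 > 0)


def ripple_fractions(beta, gamma, alpha1, alpha2, t1, t2):
    """Normalized colored-ripple sizes (r1, r2) at decoding state (t1, t2).

    beta: dict mapping a vector degree (d1, d2) to its probability.
    Requires 0 <= t1 < 1 and 0 <= t2 < 1.
    """
    r1 = (1 - t1) * (gamma / alpha1 * d_dx1(beta, t1, t2) + math.log(1 - t1))
    r2 = (1 - t2) * (gamma / alpha2 * d_dx2(beta, t1, t2) + math.log(1 - t2))
    return r1, r2
```

Multiplying `r1` by $k_1$ (and `r2` by $k_2$) gives the approximate ripple size in symbols.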
It is worth noting that the symbols from a particular segment can be decoded iff the corresponding ripple is non-empty. For example, a decoder can transit from state $(t_1, t_2)$ to $(t_1 + 1/k_1, t_2)$ iff ripple-1 is not empty. Therefore, it is possible to recover a demand $(z_{1,i}, z_{2,i})$ if and only if there exists a sequence of feasible transitions (i.e. a feasible path [14]) from $(0, 0)$ to $(z_{1,i}, z_{2,i})$. Analyzing all the possible paths increases the complexity of the design step. Hence, as shown in Figure 3, we employ a relaxation and seek degree distributions which would make the straight segment from $(0, 0)$ to $(z_{1,i}, z_{2,i})$ a feasible path.
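The problem of designing a GLT distribution for a single demand $(z_1, z_2)$ can be stated as:
Problem 2 (P2): $(\beta^{*}, \Delta^{*}) = \arg\max_{\beta, \Delta} \Delta$
subject to, $\forall t \in [0, z_1)$ and with $t_2 = \frac{z_2}{z_1} t$,
$$\beta^{(1,0)}(t, t_2) + \frac{\alpha_1 \Delta}{\alpha_1 z_1 + \alpha_2 z_2}\left[\log(1 - t) - \sqrt{\frac{t}{k_1 (1 - t)}}\right] \ge 0$$
$$\beta^{(0,1)}(t, t_2) + \frac{\alpha_2 \Delta}{\alpha_1 z_1 + \alpha_2 z_2}\left[\log(1 - t_2) - \sqrt{\frac{t_2}{k_2 (1 - t_2)}}\right] \ge 0$$
with $\sum_{d_1, d_2} \beta_{(d_1, d_2)} = 1$, $\beta_{(d_1, d_2)} \in [0, 1]$ $\forall (d_1, d_2)$, and $\Delta \ge 0$.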
The constraints in the above LP are obtained by rearranging the terms in Eq. (3) and (4). Note that we have employed the substitution $\gamma = (\alpha_1 z_1 + \alpha_2 z_2)/\Delta$ and a heuristic that requires $r_j(t_1, t_2) \ge \sqrt{t_j (1 - t_j)/k_j}$. This heuristic, motivated by the analysis in [8], employs a lower bound on the ripple size in order to account for the variance.
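The rearrangement mentioned above can be written out for the first ripple as follows. This is a worked sketch, assuming that the heuristic lower bound reads $r_j(t_1, t_2) \ge \sqrt{t_j(1 - t_j)/k_j}$ and that $\Delta > 0$; the second constraint follows in the same way from Eq. (4).

```latex
\begin{align*}
% Start from Eq. (3) and the assumed ripple lower bound.
(1 - t_1)\left[\frac{\gamma}{\alpha_1}\,\beta^{(1,0)}(t_1, t_2) + \log(1 - t_1)\right]
  &\ge \sqrt{\frac{t_1 (1 - t_1)}{k_1}} \\
% Divide by (1 - t_1) > 0.
\frac{\gamma}{\alpha_1}\,\beta^{(1,0)}(t_1, t_2) + \log(1 - t_1)
  - \sqrt{\frac{t_1}{k_1 (1 - t_1)}} &\ge 0 \\
% Substitute gamma = (alpha_1 z_1 + alpha_2 z_2) / Delta, multiply by
% alpha_1 Delta / (alpha_1 z_1 + alpha_2 z_2) > 0, and restrict to the
% straight segment t_1 = t, t_2 = (z_2 / z_1) t.
\beta^{(1,0)}\!\left(t, \tfrac{z_2}{z_1} t\right)
  + \frac{\alpha_1 \Delta}{\alpha_1 z_1 + \alpha_2 z_2}
    \left[\log(1 - t) - \sqrt{\frac{t}{k_1 (1 - t)}}\right] &\ge 0
\end{align*}
```

Since $\beta^{(1,0)}$ is linear in the coefficients $\beta_{(d_1, d_2)}$ and the bracketed term does not depend on them, the final inequality is linear in the unknowns $(\beta, \Delta)$, which is what allows the design to be posed as an LP.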
In order to extend the above LP to the case of multiple non-uniform demands, we highlight that, for a given degree distribution, a non-uniform demand can be feasibly supported if the paths from $(0, 0)$ to $(z_{1,i}, z_{2,i})$ are all feasible with their respective parameters $\gamma_i$. The LP for the single-demand case can thus be extended to multiple non-uniform demands with the substitution $\gamma_i = (\alpha_1 z_{1,i} + \alpha_2 z_{2,i})/\Delta$.
Figure 3 Relaxation checks feasibility only along the straight segment.
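Problem 3 (P3): $(\beta^{*}, \Delta^{*}) = \arg\max_{\beta, \Delta} \Delta$
subject to, $\forall i$, $\forall t \in [0, z_{1,i})$,
$$\beta^{(1,0)}\!\left(t, \frac{z_{2,i}}{z_{1,i}} t\right) + \frac{\alpha_1 \Delta}{\alpha_1 z_{1,i} + \alpha_2 z_{2,i}}\, \mu(k_1, t) \ge 0$$
$$\beta^{(0,1)}\!\left(t, \frac{z_{2,i}}{z_{1,i}} t\right) + \frac{\alpha_2 \Delta}{\alpha_1 z_{1,i} + \alpha_2 z_{2,i}}\, \mu\!\left(k_2, \frac{z_{2,i}}{z_{1,i}} t\right) \ge 0$$
with $\sum_{d_1, d_2} \beta_{(d_1, d_2)} = 1$, $\beta_{(d_1, d_2)} \in [0, 1]$ $\forall (d_1, d_2)$, and $\Delta \ge 0$,
where we define $\mu(x, t) = \log(1 - t) - \sqrt{t / (x (1 - t))}$.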
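Whether a candidate pair $(\beta, \Delta)$ satisfies the constraints of P3 can be checked on a grid, as sketched below. This is not an LP solver and not the authors' procedure: it only samples each straight segment at finitely many points, so it is a necessary check rather than a proof of feasibility. The sketch assumes $\mu(x, t) = \log(1 - t) - \sqrt{t/(x(1 - t))}$, that $\beta^{(1,0)}$ and $\beta^{(0,1)}$ are partial derivatives, and that every demand has $0 < z_{1,i} \le 1$; all function names are hypothetical.

```python
import math


def d_dx1(beta, x1, x2):
    """Partial derivative of beta(x1, x2) with respect to x1."""
    return sum(p * d1 * x1 ** (d1 - 1) * x2 ** d2
               for (d1, d2), p in beta.items() if d1 > 0)


def d_dx2(beta, x1, x2):
    """Partial derivative of beta(x1, x2) with respect to x2."""
    return sum(p * d2 * x1 ** d1 * x2 ** (d2 - 1)
               for (d1, d2), p in beta.items() if d2 > 0)


def mu(x, t):
    """Assumed form: log(1 - t) - sqrt(t / (x (1 - t)))."""
    return math.log(1 - t) - math.sqrt(t / (x * (1 - t)))


def segment_feasible(beta, delta, demands, alpha1, alpha2, k1, k2, grid=200):
    """Check both P3 constraints at `grid` points of every straight segment."""
    for z1, z2 in demands:
        scale = delta / (alpha1 * z1 + alpha2 * z2)
        for n in range(grid):
            t = z1 * n / grid   # sample of [0, z1)
            s = z2 * t / z1     # matching second coordinate on the segment
            c1 = d_dx1(beta, t, s) + alpha1 * scale * mu(k1, t)
            c2 = d_dx2(beta, t, s) + alpha2 * scale * mu(k2, s)
            if c1 < 0 or c2 < 0:
                return False
    return True
```

Because the constraints are linear in $(\beta, \Delta)$, the same sampled inequalities could instead be handed to any LP solver as rows of a constraint matrix; the grid size then controls the number of rows.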
A. The generalized-degree terms
The generalized-degree terms in the GLT distributions, for example the terms $0.2 x_1 x_2$ and $0.2 x_1^2 x_2^3$ in the distribution $\beta(x_1, x_2) = 0.4 x_1 + 0.2 x_2 + 0.2 x_1 x_2 + 0.2 x_1^2 x_2^3$, lead to the creation of encoding symbols formed by XORing input symbols from different segments. In the absence of these terms, the generalized distribution can be viewed as a simple multiplexing of two conventional LT distributions operating on distinct message segments. Hence it is appropriate to ask when mixing is most advantageous, i.e. if one were to restrict the design to distributions without any generalized-degree term, when is the performance loss greatest? In order to answer this question we solved P2 for various demands at $\alpha_1 = \alpha_2 = 0.5$ and $k_1, k_2 \to \infty$. It was observed that, in the asymptotic regime, when the demand is uniform, mixing provides no advantage. However, in the case of unequal demand, mixing can provide gains even in the asymptotic regime. The results of our enumeration are shown in Figure 4. It can be observed that the absence of mixing can increase the overhead by up to 12%.
IV. DESIGN OF MIXING PROBABILITIES
In order to design the mixing rules, observe that if the sources employ degree distributions $\beta_1(x_1)$, $\beta_2(x_2)$ and the relay employs the rules $\Lambda$, then the output of the relay is described by the following GLT distribution:
$$\lambda(x_1, x_2) = \sum_{d_1, d_2} \beta_{1,d_1} \beta_{2,d_2} \left( \Lambda^{(1)}_{(d_1,d_2)}\, x_1^{d_1} + \Lambda^{(2)}_{(d_1,d_2)}\, x_2^{d_2} + \Lambda^{(1,2)}_{(d_1,d_2)}\, x_1^{d_1} x_2^{d_2} \right) \tag{5}$$
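The coefficients of the relay output distribution in Eq. (5) can be computed as sketched below. The dictionary representations and the function name `relay_output_distribution` are assumptions made for this illustration; a forwarded symbol from source 1 is recorded with vector degree $(d_1, 0)$, one from source 2 with $(0, d_2)$, and an XOR with $(d_1, d_2)$.

```python
def relay_output_distribution(beta1, beta2, rules):
    """Coefficients of lambda(x1, x2) as a dict {(e1, e2): probability}.

    beta1: {d1: probability} for source 1; beta2: {d2: probability} for source 2.
    rules: dict mapping (d1, d2) to (Lambda1, Lambda2, Lambda12).
    """
    lam = {}
    for d1, b1 in beta1.items():
        for d2, b2 in beta2.items():
            p1, p2, p12 = rules[(d1, d2)]
            weight = b1 * b2  # the two sources draw their degrees independently
            for degree, p in (((d1, 0), p1), ((0, d2), p2), ((d1, d2), p12)):
                if p > 0:
                    lam[degree] = lam.get(degree, 0.0) + weight * p
    return lam
```

Every coefficient is linear in the entries of $\Lambda$, so substituting $\lambda$ for $\beta$ in the P3 constraints keeps them linear in the unknowns.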
The LP for designing the mixing rules is obtained as follows
by substituting Eq. (5) in problem P3.
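Problem 4 (P4): $(\Lambda^{*}, \Delta^{*}) = \arg\max_{\Lambda, \Delta} \Delta$
subject to, $\forall i$, $\forall t \in [0, z_{1,i})$,
$$\lambda^{(1,0)}\!\left(t, \frac{z_{2,i}}{z_{1,i}} t\right) + \frac{\alpha_1 \Delta}{\alpha_1 z_{1,i} + \alpha_2 z_{2,i}}\, \mu(k_1, t) \ge 0$$
$$\lambda^{(0,1)}\!\left(t, \frac{z_{2,i}}{z_{1,i}} t\right) + \frac{\alpha_2 \Delta}{\alpha_1 z_{1,i} + \alpha_2 z_{2,i}}\, \mu\!\left(k_2, \frac{z_{2,i}}{z_{1,i}} t\right) \ge 0$$
with, $\forall (d_1, d_2)$: $\sum_{(\cdot)} \Lambda^{(\cdot)}_{(d_1,d_2)} = 1$, $\Lambda^{(\cdot)}_{(d_1,d_2)} \in [0, 1]$, and $\Delta \ge 0$.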
V. NUMERICAL RESULTS
We considered a network with $k_1 = 750$ and $k_2 = 1250$, where source 1 employs an LT code for a local non-uniform demand of [0.6, 0.8], while source 2 employs an LT code for a local demand of [0.3, 0.7]. Thus, the distributions employed at the sources are given by:
$$\beta_1(x_1) = 0.1 x_1 + 0.89 x_1^2 + 0.01 x_1^3 \tag{6}$$
$$\beta_2(x_2) = 0.57 x_2 + 0.43 x_2^2 \tag{7}$$
We consider that the relay is subscribed to by two sets of sinks with demands $z_1 = (0.95, 0.95)$ and $z_2 = (0.8, 0.6)$.
The optimal mixing rules for the network were found to be:
$$\Lambda^{(1)}_{(1,1)} = 1, \quad \Lambda^{(1)}_{(2,1)} = 0.0215$$
$$\Lambda^{(1,2)}_{(1,2)} = 1, \quad \Lambda^{(1,2)}_{(2,1)} = 0.9785, \quad \Lambda^{(1,2)}_{(2,2)} = 0.0265, \quad \Lambda^{(1,2)}_{(3,1)} = 0.8532$$
$$\Lambda^{(2)}_{(3,1)} = 0.1468, \quad \Lambda^{(2)}_{(2,2)} = 0.9735, \quad \Lambda^{(2)}_{(3,2)} = 1$$
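As a consistency check on the values reported above, the sketch below verifies that the rules for each degree pair sum to one and computes how often the relay forwards from source 1, forwards from source 2, or transmits an XOR. It assumes that every rule entry not listed above is zero; the variable and function names are hypothetical.

```python
# Source degree distributions as reported for sources 1 and 2.
beta1 = {1: 0.10, 2: 0.89, 3: 0.01}
beta2 = {1: 0.57, 2: 0.43}

# (d1, d2) -> (Lambda1, Lambda2, Lambda12); unlisted entries assumed zero.
rules = {
    (1, 1): (1.0, 0.0, 0.0),
    (1, 2): (0.0, 0.0, 1.0),
    (2, 1): (0.0215, 0.0, 0.9785),
    (2, 2): (0.0, 0.9735, 0.0265),
    (3, 1): (0.0, 0.1468, 0.8532),
    (3, 2): (0.0, 1.0, 0.0),
}


def output_shares(beta1, beta2, rules):
    """Probabilities of (forward source 1, forward source 2, XOR) per epoch."""
    shares = [0.0, 0.0, 0.0]
    for d1, b1 in beta1.items():
        for d2, b2 in beta2.items():
            for n, lam in enumerate(rules[(d1, d2)]):
                shares[n] += b1 * b2 * lam
    return tuple(shares)


# Each rule must be a probability distribution over the three actions.
rows_ok = all(abs(sum(r) - 1.0) < 1e-9 for r in rules.values())
share_1, share_2, share_xor = output_shares(beta1, beta2, rules)
print(rows_ok, share_1, share_2, share_xor)
```

Under this reading the rules are strongly degree dependent: whether a pair is forwarded or XOR-ed is decided almost deterministically by $(d_1, d_2)$, which a degree-independent policy cannot express.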
If the operation of the relay is restricted to simple mixing, then the optimal parameters are $p_1 = 0.12$, $p_2 = 0.47$ and $p_3 = 0.41$. In order to compare the various designs, we define the “failure to recover rate” as the probability with which a sink fails to recover its desired demand. Figure 5 provides a comparison of the proposed morphing designs with simple mixing and multiplexing. It can be clearly observed that mixing at the relay provides significant gains. Both simple mixing and the generalized morphing proposed in this paper significantly outperform the forwarding-based multiplexing strategy. Furthermore, we can observe that generalized morphing, when compared with simple mixing, reduces the overhead by ~10% for sink set 1 and by ~7% for sink set 2. Hence it can be concluded that the degree-sensitive XORing conducted by morphing leads to improved designs compared to the probabilistic mixing proposed in [16].
Figure 4 The percentage overhead without the generalized-degree terms as a function of the demand (z1,z2).
REFERENCES
[1] A. Eryilmaz, A. Ozdaglar and M. Medard, “On the Delay and Throughput Gains of Coding in Unreliable Networks”, IEEE Trans. on
Information Theory, vol. 54, no. 12, pp. 5511-5524, 2008.
[2] C. Fragouli and E. Soljanin, “Network Coding Fundamentals”, Foundations and Trends in Networking. Hanover, MA: now Publishers Inc., vol. 2, no. 1, pp. 1-133, 2007.
[3] M. Luby, “LT codes”, 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002.
[4] A. Shokrollahi, “Raptor Codes”, IEEE Trans. on Information Theory, vol. 52, no. 6, pp. 2551-2567, June 2006.
[5] P. Maymounkov, “Online Code”, NYU Technical Report TR2003-883,
2002.
[6] J. W. Byers, M. Luby and M. Mitzenmacher, “A Digital Fountain Approach to Asynchronous Reliable Multicast”, IEEE J-SAC, 20 (8),
pp. 1528-1540, October 2002.
[7] S. Puducheri, J. Kliewer and T. Fuja, "The design and performance of distributed LT codes", IEEE Trans. on Information Theory, vol. 53, no. 10, pp. 3740-3754, October, 2007.
[8] B. Hajek, “Connections between network coding and stochastic network theory", Stochastic Networks Conference, June 19-24, 2006, Urbana.
[9] S. Sanghavi, “Intermediate Performance of Rateless Codes”, Information Theory Workshop 2007, Tahoe.
[10] A. Kamra, J. Feldman, V. Misra and D. Rubenstein, “Growth Codes:
Maximizing Sensor Network Data Persistence”, ACM Sigcomm, 2006.
[11] N. Rahnavard, B. N. Vellambi, and F. Fekri, “Rateless codes with unequal error protection property”, IEEE Trans. on Information
Theory, vol. 53, no. 4., pp. 1521-1532, April 2007.
[12] A. G. Dimakis, J. Wang and K. Ramchandran, "Unequal Growth Codes: Intermediate Performance and Unequal Error Protection for Video Streaming", MMSP, October 2007.
[13] S. S. Karande, “Network Channel Coding: Importance of in-network processing and side-information”, Ph.D. dissertation, Dept. of Elec.
Eng., Michigan State Univ., East Lansing, 2007.
[14] S. S. Karande, K. Misra, S. Soltani and H. Radha, “Design and analysis of Generalized LT codes using colored ripples”, IEEE Symposium on Information Theory, 2008.
[15] Y. Li and E. Soljanin, "Rateless Codes for Single-Server Streaming to Diverse Users", Allerton Conference, 2009.
[16] A. Talari and N. Rahnavard, "Distributed Rateless Codes with UEP Property", IEEE Symposium on Information Theory, 2010.
Figure 5 The failure to recover rate of (a) the sink set with demand [0.95,0.95] and (b) the sink set with demand [0.8,0.6].