


564 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 44, NO. 2, MARCH 1998

Systematic Lossy Source/Channel Coding
Shlomo Shamai (Shitz), Fellow, IEEE, Sergio Verdú, Fellow, IEEE, and Ram Zamir, Member, IEEE

Abstract—The fundamental limits of “systematic” communication are analyzed. In systematic transmission, the decoder has access to a noisy version of the uncoded raw data (analog or digital). The coded version of the data is used to reduce the average reproduced distortion D below that provided by the uncoded systematic link and/or to increase the rate of information transmission. Unlike the case of arbitrarily reliable error correction (D → 0) for symmetric sources/channels, where systematic codes are known to do as well as nonsystematic codes, we demonstrate that the systematic structure may degrade the performance for nonvanishing D. We characterize the achievable average distortion and we find necessary and sufficient conditions under which systematic communication does not incur a loss of optimality. The Wyner–Ziv rate-distortion theorem plays a fundamental role in our setting. The general result is applied to several scenarios. For a Gaussian bandlimited source and a Gaussian channel, the invariance of the bandwidth–signal-to-noise ratio (SNR, in decibels) product is established, and the optimality of systematic transmission is demonstrated. Bernoulli sources transmitted over binary-symmetric channels and over certain Gaussian channels are also analyzed. It is shown that if a nonnegligible bit-error rate is tolerated, systematic encoding is strictly suboptimal.

Index Terms—Gaussian channels and sources, rate-distortion theory, source/channel coding, systematic transmission, uncoded side information, Wyner–Ziv rate distortion.

I. INTRODUCTION

THE advent of digital communications in radio and television broadcasting poses the following scenario. Historically, a certain bandwidth has been allocated to the analog transmission of an information source. Then, additional channel bandwidth becomes available over which it is possible to transmit digitally coded information. This “digital” channel can be used to transmit additional information bandwidth, boost the received fidelity of the original information source, or both. For the sake of backward compatibility with existing equipment it may not be possible to convert the “analog” channel into a “digital” one, and the analog uncoded transmission of the original source must be preserved. The principal question addressed in this paper is under what conditions this restriction incurs no loss of capacity. Analogously, in systems where a direct satellite broadcast signal coexists with a terrestrial analog signal (e.g., DirecTV and DSS

Manuscript received August 1, 1996; revised June 15, 1997. This work was supported by the U.S.–Israel Binational Science Foundation under Grant 92-00202, the NSF under Grant NCR95-23805, and the UC Micro program.

S. Shamai is with the Electrical Engineering Department, Technion–Israel Institute of Technology, Haifa 32000, Israel (e-mail: [email protected]).

S. Verdú is with the Electrical Engineering Department, Princeton University, Princeton, NJ 08544 USA (e-mail: [email protected]).

R. Zamir is with the Department of Electrical Engineering–Systems, Tel Aviv University, Tel Aviv 69978, Israel (e-mail: [email protected]).

Publisher Item Identifier S 0018-9448(98)00838-4.

Digital Satellite System) it is wasteful to simply discard the existing analog transmission in digital broadcast receivers. Through adequate signal processing and coding/decoding, the analog transmission can be used to lower the digital channel bandwidth required to achieve a given degree of fidelity. Such a reduction in bandwidth requirements can be quantified using the results of this paper.

In the context of error-correcting channel coding, systematic codes are those where each codeword contains the uncoded information source string plus a string of parity-check symbols; thus the multiplexing of coded and uncoded symbols takes place in the time domain. By extension, we call systematic those source/channel codes which transmit the raw uncoded source in addition to the encoded version. The aforementioned compatible analog/digital broadcasting systems are examples of systematic source/channel codes where the multiplexing of coded and uncoded versions takes place in the frequency domain.

Notwithstanding the above analog/digital motivation, the model of this paper has a wide scope. For example, as a consequence of our results, we characterize the bit-error rate achievable by systematic codes for the binary-symmetric channel, and we show that when the source entropy is higher than the channel capacity, the bit-error rate is strictly higher than that achievable by nonsystematic codes. This is in contrast to the well-known fact [1], [2] that for any rate below capacity, systematic (linear or nonlinear) codes achieve the random coding error-exponent of the binary-symmetric channel.

Another motivation for the model of systematic transmission discussed in this paper arises when a source is transmitted over a channel with unknown signal-to-noise ratio (SNR). In that setting, the systematic part of the transmission enables graceful degradation in reconstruction quality as the SNR decreases, while the digital part makes it possible to boost the fidelity of the reconstruction whenever the SNR is sufficiently high.

The information-theoretic problem we consider is depicted in Fig. 1. It is desired to reproduce the source with a certain fidelity at the output of the decoder using the outputs of the uncoded channel (“analog”) and the coded channel (“digital”). A separation theorem for source/channel coding with side information is given in Section II characterizing the sources/channels for which communication subject to a prescribed fidelity criterion is possible. For the sake of clarity and conciseness we limit our discussion to memoryless sources and channels as well as additive distortion measures. The achievability part of the separation theorem is a corollary of the Wyner–Ziv direct source coding theorem with side information [4], [5]. The capacity of channels with side information found in [3] corresponds to the zero-distortion



Fig. 1. Information transmission with uncoded side information.

special case of the result in Section II. The Slepian–Wolf theorem [6] on separate coding of correlated sources is used in [3] to show that if the capacity of channel D is larger/smaller than the conditional entropy of the source given the output of channel A, then arbitrarily reliable transmission of the source is feasible/impossible. The separation theorem of Section II replaces the conditional entropy by the Wyner–Ziv rate-distortion function evaluated at the prescribed distortion level. From this separation theorem, we derive the necessary and sufficient condition for the optimality of systematic encoding: coding of channel A is superfluous to achieve the optimal distortion/information rate tradeoff if and only if the following three conditions are satisfied.

1) The source maximizes the input–output mutual information of channel A.

2) Having the output of channel A as side information at the transmitter is superfluous for encoding the source at the desired distortion.

3) The output of channel A is maximally useful1 at the decoder in order to reduce the rate required to encode the source at the desired distortion.

Note that due to the prohibition of coding in one of the channels (channel A), the setting of this paper is not encompassed by the conventional frameworks of broadcast channels and multiple descriptions [7].

Section III applies the results of Section II to the special case of Gaussian sources and channels with a mean-square-error fidelity criterion. In the setting where the channel bandwidth is equal to the source bandwidth and all spectra are flat, it is well known [8] that it is optimal not to encode the Gaussian source in any way and to decode the source with a simple attenuator. As a practical matter, this means that single-sideband (SSB) modulation of a Gaussian source is optimal for white Gaussian noise channels [9], [10]. When the channel bandwidth exceeds the source bandwidth, coding of the source obviously becomes necessary in order to take advantage of the additional channel bandwidth for the sake of improved fidelity. However, systematic encoding turns out to be optimal. This implies that, for Gaussian sources and channels, backward compatibility incurs no penalty in the fidelity with which the source can be reproduced by a receiver which observes both the analog and digital channels. For a given bandlimited channel, the higher the bandwidth of the source, the lower the SNR achievable at the output of the decoder.

1 In the sense formalized in Section II.

Under the assumption that the allowed transmission power in any given frequency band is proportional to its bandwidth, we find that for the ideally bandlimited Gaussian channel, the tradeoff between information bandwidth and output SNR is particularly simple: the product of the output SNR in decibels times the transmissible information bandwidth is a constant equal to the channel bandwidth times the SNR in decibels achievable with uncoded transmission.

Sections IV and V deal with the important case of binary information sources and the bit-error rate as the distortion criterion. Binary-symmetric channels are studied in Section IV, while Section V considers antipodal modulation in Gaussian channels. Systematic encoding is shown to be strictly suboptimal for both binary-symmetric channels and binary-input Gaussian channels. Wyner’s [13] interpretation of Slepian–Wolf coding is extended by the explicit construction (proposed in Section IV) of Wyner–Ziv codes for the binary source/channel problem based on channel error-correcting codes.

Section V also considers the case where the uncoded and coded channels are mutually interfering. In this alternative scenario, it is assumed that no additional bandwidth or power is available to boost performance. Instead, a percentage of the power originally assigned to the transmission of a binary uncoded source through a Gaussian channel is devoted to a coded transmission superposed on the uncoded transmission. Increasing the power of the coded signal enhances the performance of an optimal decoder while it degrades the performance of a receiver for the uncoded transmission. The tradeoff given in Section V generalizes the case of arbitrarily reliable optimum decoding treated in [3].

II. SEPARATION THEOREM FOR LOSSY SOURCE/CHANNEL CODING WITH SIDE INFORMATION

Consider the situation depicted in Fig. 1 where an information source is transmitted to a decoder via two independent channels, only one of which (channel D) is allowed to be preceded by an encoder. The objective is to reproduce the source at the output of the decoder within some prescribed distortion. Had we allowed an encoder preceding channel A, then the conventional separation theorem for lossy source/channel coding [7] states that a distortion is achievable/not achievable if the rate-distortion function of the source evaluated at that distortion lies below/above the sum of the capacities of the respective channels.
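A compact way to state this fully coded benchmark, with R(D) denoting the rate-distortion function of the source and C_A, C_D the capacities of channels A and D (notation chosen here for concreteness), is the following sketch:

```latex
% Fully coded benchmark (both channels preceded by encoders):
% distortion D is achievable if the rate-distortion function fits below
% the total capacity, and not achievable if it exceeds it.
R(D) < C_A + C_D \;\Longrightarrow\; D \text{ achievable},
\qquad
R(D) > C_A + C_D \;\Longrightarrow\; D \text{ not achievable}.
```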


In the case of noiseless (arbitrarily reliable) transmission, the separation theorem of [3] states that reliable transmission of the source in the setting of Fig. 1 is possible/not possible if its entropy lies below/above the sum of the capacity of channel D and the input–output mutual information of channel A. The essential part of the proof of the achievability part of this result from [3] is the Slepian–Wolf theorem [6] (and its ergodic extension [7]), which enables the decoupling of the encoder into a Slepian–Wolf source encoder and a channel encoder (and likewise at the decoder). The converse result of [3] shows that such a decoupling entails no loss of optimality. It follows that in the distortionless case the condition for optimality of systematic encoding is that the source maximize the mutual information of channel A, i.e., that it be a capacity-achieving input for channel A.

In order to state the separation theorem for lossy source/channel coding with side information we will need a few definitions.

Definition: For a joint distribution of the source and the side information, define the minimum possible estimation error of the source given the side information by

(2.1)

where d(·,·) is a distortion measure.

Definition: Let the Wyner–Ziv rate-distortion function of the source, given that the decoder observes the side information, be [11]

(2.2)

where the constraint in the minimization requires that the auxiliary random variable and the side information be conditionally independent given the source.
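In a notation chosen here for concreteness (source X, side information Y, auxiliary variable W, decoder function f), the two definitions take the following standard forms; the symbols in the original displays may differ:

```latex
% (2.1) Minimum possible estimation error of X given Y, for distortion measure d:
\mathcal{E}(X \mid Y) \;=\; \inf_{g(\cdot)} E\, d\bigl(X,\, g(Y)\bigr)

% (2.2) Wyner--Ziv rate-distortion function of X when only the decoder observes Y:
R^{WZ}_{X|Y}(D) \;=\;
  \min_{\substack{W \,\leftrightarrow\, X \,\leftrightarrow\, Y \\ f:\; E\, d(X,\, f(W,Y)) \,\le\, D}}
  \bigl[\, I(X;W) - I(Y;W) \,\bigr]
```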

Theorem 2.1: Let the source and channels A and D be memoryless.

a) If

(2.3)

then blocklength-n encoders/decoders exist for all sufficiently large n such that the source can be reproduced at the output with distortion

(2.4)

b) If the inequality in (2.3) is reversed, then no such coding scheme can exist.

Remarks: As usual in separation theorems, the case where capacity and rate-distortion function are equal is not included since, in that case, the feasibility of transmission depends on the source and channel. Extension of Theorem 2.1 to an information-stable channel is straightforward.
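In the assumed notation (Wyner–Ziv rate-distortion function with the output of channel A as side information, capacity of channel D, both expressed per source sample), conditions (2.3) and (2.4) can be read as in the following sketch of the statement, not a verbatim reproduction:

```latex
% Theorem 2.1 (sketch).
% (2.3) achievability condition, rates per source sample:
R^{WZ}_{X|Y}(D) \;<\; C_D
% (2.4) the resulting reproduction then satisfies, for large enough blocklength n,
\frac{1}{n}\sum_{i=1}^{n} E\, d\bigl(X_i, \hat X_i\bigr) \;\le\; D ,
% and conversely, if R^{WZ}_{X|Y}(D) > C_D, no such scheme exists.
```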

Proof:

1) The proof of achievability follows immediately by considering a Wyner–Ziv code (independent of channel D) followed by a transmission code for channel D (independent of the source). The achievability parts of the Wyner–Ziv theorem for general memoryless sources [5] and the conventional channel coding theorem [7] imply that a sequence of optimal codes exists as long as (2.3) is satisfied.

2) For the converse, consider any encoding/decoding scheme, and consider the source word of length n; its encoded version, transmitted over channel D; the corresponding output of channel D; the output of channel A due to the source word; and the output of the decoder (see Fig. 1).

Furthermore, define

Note that the following Markov properties hold:

(2.5)

(2.6)

Consider the following chain of inequalities:

(2.7a)

(2.7b)

(2.7c)

(2.7d)

(2.7e)

(2.7f)

(2.7g)

(2.7h)

(2.7i)

(2.7j)

(2.7k)

which are justified by

(2.7a) the channel coding theorem;
(2.7b) the data processing theorem and (2.5);
(2.7c) (2.5);
(2.7d, e, f) the chain rule of mutual information;
(2.7g) both the source and channel A are memoryless;
(2.7h) (2.2) and (2.6);
(2.7i) the Wyner–Ziv rate-distortion function is decreasing, and further conditioning decreases the estimation error (cf. (2.1));
(2.7j) (2.1) and the fact that the decoder output is a function of the outputs of the two channels;
(2.7k) convexity of the Wyner–Ziv rate-distortion function [4] and (2.4).

We proceed to derive the condition for optimality of systematic coding from Theorem 2.1. Suppose that a given distortion is the best distortion achievable in a fully coded system (where channel A is also preceded by an encoder). From the conventional separation theorem for lossy source/channel coding, this distortion is the solution to

(2.8)


where the rate-distortion function of the source is

(2.9)

We will add and subtract the conditional rate-distortion function (which corresponds to the case where the side information is available also at the encoder [12]), the Wyner–Ziv rate-distortion function, and the mutual information of channel A on the right-hand side of (2.8) to yield

(2.10)

We will now show that each of the four terms in parentheses on the left side of (2.10) is nonnegative. Therefore, (2.8) is satisfied if and only if every term is equal to zero. The first term is nonnegative by definition (not having side information at the encoder cannot reduce the required rate); the second term is nonnegative because of Theorem 2.1; the third term is nonnegative because the capacity of channel A is equal to its maximal input–output mutual information; the fourth term in (2.10) is nonnegative as can be seen as follows. The conditional rate-distortion function is given by (see Appendix I and [7])

(2.11)

Using (2.11) and the chain rule for mutual information, weobtain

(2.12)

(2.13)

where the inequality follows by lower-bounding the penalty function and enlarging the optimization set. Thus we conclude that side information at both encoder and decoder can reduce the rate required to encode the source by at most the mutual information between the source and the side information. We note that inequality (2.13) can be found in [12]. Equality is achieved in (2.13) if and only if the reproduction and the side information are conditionally independent given the source for the optimizing distribution in (2.12).
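One consistent way to write the decomposition (2.10) and the bound (2.13), in the notation assumed above (ordinary, conditional, and Wyner–Ziv rate-distortion functions R, R_{X|Y}, R^{WZ}_{X|Y}, and D the distortion achieved by the systematic scheme), is the following sketch; the original displays may group the terms differently:

```latex
% (2.10): add and subtract R_{X|Y}(D), R^{WZ}_{X|Y}(D), and I(X;Y) in (2.8).
% Each bracketed term is nonnegative, so (2.8) holds iff all four vanish.
\bigl[R^{WZ}_{X|Y}(D) - R_{X|Y}(D)\bigr]
+ \bigl[C_D - R^{WZ}_{X|Y}(D)\bigr]
+ \bigl[C_A - I(X;Y)\bigr]
+ \bigl[I(X;Y) - R(D) + R_{X|Y}(D)\bigr]
\;=\; C_A + C_D - R(D)

% (2.12)-(2.13): side information at encoder and decoder saves at most I(X;Y):
R_{X|Y}(D) \;\ge\; R(D) - I(X;Y)
```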

We have thus established a necessary and sufficient condition on the source and channel for the optimality of systematic lossy source/channel encoding, which consists of the following three simultaneous conditions.

C1. The source maximizes the mutual information of the uncoded channel, i.e.,

(2.14)

C2. The output of the channel due to the uncoded source is not needed at the source encoder, i.e.,

(2.15)

C3. The output of the channel due to the uncoded source is maximally useful at the source decoder, i.e.,

(2.16)
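Written out in the assumed notation (source X, output of the uncoded channel Y, channel A capacity C_A), conditions (2.14)–(2.16) correspond to the following equalities; this sketch paraphrases the original displays:

```latex
% C1 (2.14): the source is a capacity-achieving input for channel A.
I(X;Y) \;=\; C_A

% C2 (2.15): side information at the encoder would not help.
R^{WZ}_{X|Y}(D) \;=\; R_{X|Y}(D)

% C3 (2.16): the side information is worth its full mutual information at the decoder.
R_{X|Y}(D) \;=\; R(D) - I(X;Y)
```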

It is interesting to note that even when condition C2 is not satisfied, the associated rate loss is typically small, as shown in [11]. In the special case of zero distortion treated in [3], conditions C2 and C3 are always satisfied.

In the next section we show an important special case where the conditions for optimality of systematic encoding are satisfied.

III. GAUSSIAN SOURCE THROUGH GAUSSIAN CHANNEL

Suppose that a Gaussian continuous-time source, with flat power spectral density strictly bandlimited to a given bandwidth, is transmitted (uncoded) through a channel with flat spectral response and additive white Gaussian noise of given single-sided power spectral density. Suppose that the transmitted signal's single-sided power spectral density is constrained to a prescribed level. In this uncoded scenario, the receiver that minimizes the mean-square error (MSE) is an attenuator

(3.1)

with

(3.2)

The resulting output signal-to-noise ratio is

SNR

(3.3)

It is well known [8]–[10] that if the bandwidth of the channel coincides with the bandwidth of the source, then the signal-to-noise ratio obtained in (3.3) cannot be improved by coding the input signal. This can be immediately checked by noticing that the conventional source–channel separation theorem requires that the rate-distortion function not exceed the channel capacity, i.e.,

(3.4)
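A sketch of (3.1)–(3.4) under the stated flat-spectrum assumptions, in notation chosen here (P transmitted spectral level, N_0 noise spectral level, W the common source/channel bandwidth, SNR the ratio of source power to reconstruction MSE); the original displays may differ in detail:

```latex
% (3.1)-(3.2): MMSE receiver for the uncoded link is a simple attenuator
% (sketch assumes the source is transmitted directly at spectral level P):
\hat S(t) \;=\; \beta\, Y(t), \qquad \beta \;=\; \frac{P}{P + N_0}

% (3.3): resulting output signal-to-noise ratio of the uncoded system
\mathrm{SNR}_0 \;=\; 1 + \frac{P}{N_0}

% (3.4): separation-theorem check when channel and source bandwidths coincide
W \log_2 \mathrm{SNR} \;\le\; W \log_2\Bigl(1 + \frac{P}{N_0}\Bigr)
\quad\Longrightarrow\quad \mathrm{SNR} \le \mathrm{SNR}_0 .
```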

thus the uncoded output SNR in (3.3) is the largest SNR for which (3.4) is satisfied.

Suppose now that we are given additional bandwidth and we want to explore the enhancement of the signal-to-noise ratio achievable thanks to this bandwidth expansion. As before, we constrain the power spectral density transmitted through the channel to the same level. We will explore two possibilities:

1) Nonsystematic: the original uncoded system is scrapped and the full bandwidth of the channel is occupied by the encoded signal.

2) Systematic: the uncoded system is retained (the source is sent directly through the channel) and only the additional bandwidth is used to transmit the encoded information.


In scenario 1) it is easy to compute the achievable output signal-to-noise ratio by equating the rate-distortion function and the channel capacity via the conventional separation theorem (cf. [8])

(3.5)

i.e.,

(3.6)

which admits the simple interpretation that the improvement of the signal-to-noise ratio in decibels is a multiplicative factor equal to the bandwidth expansion factor, i.e., if the bandwidth is doubled so is the signal-to-noise ratio figure in decibels (recall the value of the uncoded SNR in (3.3)).
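With W_e denoting the additional channel bandwidth (symbol ours), (3.5)–(3.6) can be sketched as follows:

```latex
% (3.5): equate the rate-distortion function of the bandwidth-W source with the
% capacity of the full channel of bandwidth W + W_e (flat PSD constraint P):
W \log_2 \mathrm{SNR} \;=\; (W + W_e)\,\log_2 \mathrm{SNR}_0

% (3.6): the achievable output SNR grows exponentially in the expansion factor
\mathrm{SNR} \;=\; \mathrm{SNR}_0^{\,(W+W_e)/W}
\quad\Longleftrightarrow\quad
\mathrm{SNR}\,[\mathrm{dB}] \;=\; \frac{W+W_e}{W}\;\mathrm{SNR}_0\,[\mathrm{dB}] .
```

For instance, if the uncoded system delivers 25 dB over bandwidth W, an equal amount of extra bandwidth (W_e = W) raises the attainable output SNR to 50 dB.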

The analysis of the systematic scenario 2) will lead to the same conclusion via the results in Section II. In particular, we will check that the conditions for optimality of systematic encoding are satisfied in the present scenario:

C1. It is clear that the source maximizes the input–output mutual information of channel A among all sources of the allowed power (consider, for example, the equivalent discrete-time channel sampled at the Nyquist rate)

(3.7)

C2. According to [5], in the case of a Gaussian memoryless source with side information equal to the source plus Gaussian white noise, the Wyner–Ziv rate-distortion function is given by

(3.8)

C3. The side information is maximally useful at the decoder

(3.9)

The conclusion is that in order to maximize the output fidelity it is optimal to transmit the uncoded Gaussian source directly through the channel and devote only the excess bandwidth to the transmission of the encoded information. The signal-to-noise ratio enhancement is a function of the bandwidth expansion given by (3.6). We emphasize that the burden to achieve such full bandwidth utilization falls exclusively on the digital encoder and on the decoder for the analog–digital receiver; the existing analog transmitters/receivers are unaffected. In existing analog broadcast systems for radio and television, the analog channel does not quite lend itself to full exploitation of its bandwidth (through suitable coding of the additional channel) because the transmitted signals do not follow the flat-spectrum Gaussian model assumed in this work, and, of course, vestigial sideband (television) and double sideband (radio) are inherently wasteful of bandwidth/power. The computation of the Wyner–Ziv rate-distortion function for a double-sideband-modulated flat Gaussian source is simple:

wasting half of the bandwidth buys a 3-dB enhancement of the analog channel SNR. This follows from the fact that in the Gaussian case parallel independent side-information channels are equivalent to a single side-information channel whose SNR is equal to the sum of the individual SNR's.

It is interesting to note that the Wyner–Ziv encoding in the Gaussian case can be implemented in a simple way, using a Gaussian source codebook at the encoder, Slepian–Wolf encoding/decoding, and a linear transformation at the decoder. This follows from the analysis of this case in [5] and the section on universal quantization with side information in [11]. Specifically, the encoder uses a codebook (implementable with an entropy-coded dithered quantizer [11]) which is optimal for encoding the Gaussian source with a target signal-to-noise ratio equal to the ratio of the final output SNR given in (3.6) to the uncoded SNR of (3.3). (Note that the final output SNR is at least the uncoded SNR.) At the decoder, the Slepian–Wolf code is decoded (given the side information), the result is scaled, and then an appropriately scaled version of the side information is subtracted.

The scenario where the digital channel is used for SNR enhancement without bandwidth boosting arises in satellite-enhanced reception, which may be of interest in improving terrestrial broadcast reception quality in remote or mountainous regions where the received SNR of the analog system is low. This requires the consideration of a more general case where the SNR's of the analog and digital channels are not necessarily equal. This is very easy to incorporate into the analysis by, once again, equating the Wyner–Ziv rate-distortion function with the digital channel capacity. Shannon's formula states that the capacity of the ideally bandlimited digital channel is its bandwidth times the base-2 logarithm of the digital-channel SNR figure, where that figure is equal to the SNR at the channel output plus one (which, according to (3.3), is equal to the optimum SNR of an estimator of the channel input from the channel output). Thus we can write

(3.10)

where the combined SNR is the achievable output SNR of a receiver that processes both channels jointly. Thus (3.10) results in the conclusion that the improvement of the output SNR in decibels due to the digital channel is proportional to its bandwidth and to its achievable SNR in decibels.

As a point of comparison, note that if the desired output SNR is equal to the SNR of the digital channel, then a stand-alone digital system would require a bandwidth equal to that of the analog channel, whereas by taking advantage of the analog channel we only need a digital channel bandwidth equal to the analog bandwidth times the ratio of the SNR improvement (in decibels) to the digital-channel SNR (in decibels).
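A sketch of (3.10) and of the bandwidth comparison that follows it, again in the assumed notation (W analog bandwidth, W_D digital channel bandwidth, SNR_D the digital channel output SNR plus one, SNR_c the combined receiver's output SNR):

```latex
% (3.10): Wyner--Ziv rate of the analog reception = capacity of the digital channel
W \log_2 \frac{\mathrm{SNR}_c}{\mathrm{SNR}_0} \;=\; W_D \log_2 \mathrm{SNR}_D
\quad\Longleftrightarrow\quad
\mathrm{SNR}_c\,[\mathrm{dB}] \;=\; \mathrm{SNR}_0\,[\mathrm{dB}]
 + \frac{W_D}{W}\,\mathrm{SNR}_D\,[\mathrm{dB}]

% Bandwidth comparison when the target equals the digital channel SNR
% (SNR_c = SNR_D): a stand-alone digital system needs bandwidth W, whereas
% with the analog channel the required digital bandwidth is only
W_D \;=\; W\;\frac{\mathrm{SNR}_D\,[\mathrm{dB}] - \mathrm{SNR}_0\,[\mathrm{dB}]}
                  {\mathrm{SNR}_D\,[\mathrm{dB}]} .
```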


Fig. 2. Systematic transmission of a Gaussian source through a Gaussian channel.

The foregoing results allow us to deal with the more general setting where the additional channel bandwidth is used not exclusively to enhance the signal-to-noise ratio but also to transmit additional information bandwidth. We are interested in finding the tradeoff between the transmissible source bandwidth and its decoded signal-to-noise ratio, as a function of the additional bandwidth and the overall bandwidth of the channel. If the target signal-to-noise ratio is equal to the original uncoded one, then the maximum excess bandwidth that can be transmitted is equal to the additional channel bandwidth. As we saw before, this can be accomplished with no coding whatsoever. If the target equals the fully coded SNR of (3.6), then no excess bandwidth can be transmitted. If the target lies strictly between these two values, then the achievable excess bandwidth is computed as follows.

Let us consider a communication scheme as depicted in Fig. 2, where the additional channel bandwidth is divided into

(3.11)

where the first portion is devoted to transmitting an encoded version of the original information bandwidth for the sake of its signal-to-noise ratio enhancement, and the second portion is devoted to the encoded transmission of the excess information bandwidth. Let the target signal-to-noise ratio lie between the uncoded SNR of (3.3) and the fully coded SNR of (3.6). Then, as we saw in (3.5), the necessary bandwidth to support the target SNR is

(3.12a)

or, equivalently,

(3.12b)

The remaining channel bandwidth can support the following information bandwidth at the target signal-to-noise ratio:

(3.13)

Using (3.11) and (3.13) we obtain the sought-after tradeoff:

(3.14)

Equation (3.14) implies that the bandwidth reduction factor is equal to the ratio of the signal-to-noise ratios in decibels. This generalizes the conclusion obtained in (3.6): the product of the information bandwidth times the signal-to-noise ratio in decibels remains constant. Again, this can be achieved using systematic encoding of the original information bandwidth.

What if we are willing to tolerate a signal-to-noise ratio worse than the original uncoded SNR for the sake of bandwidth expansion beyond that offered by the channel? Then, systematic coding of any subband of the information source is strictly suboptimal. The conventional theory leads to the conclusion that the product of the transmitted signal bandwidth times the decoded signal-to-noise ratio (in decibels) must be equal to the channel bandwidth times the uncoded signal-to-noise ratio (in decibels), as in (3.14):

(3.15)

In Fig. 3, (3.15) is depicted in terms of the decoded signal-to-noise ratio (in decibels) versus the transmitted information bandwidth; the region of optimality of systematic transmission is explicitly indicated.
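In the assumed notation, with B = W + W_e the total channel bandwidth and B_I the transmitted information bandwidth, the tradeoff (3.14)–(3.15) is the constancy of the bandwidth × SNR(dB) product; a sketch:

```latex
% (3.14)-(3.15): invariance of the information-bandwidth times SNR(dB) product;
% systematic transmission achieves it only for W <= B_I <= B.
B_I \cdot \mathrm{SNR}_c\,[\mathrm{dB}] \;=\; B \cdot \mathrm{SNR}_0\,[\mathrm{dB}],
\qquad B = W + W_e .
```

For example, with B = 2W and an uncoded SNR of 30 dB, one may transmit information bandwidth 1.5W at 40 dB or 3W at 20 dB, the latter only with nonsystematic coding.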

IV. BERNOULLI SOURCE THROUGH BINARY CHANNEL

In this section, we examine a special case of practical and theoretical interest within the general framework developed in Section II (Fig. 1). The source is a Bernoulli symmetric


Fig. 3. Tradeoff of output SNR versus output information bandwidth for the Gaussian source and channel.

source; the distortion criterion is the bit-error rate (Hamming distance); channel A (see footnote 2) is a binary-symmetric channel (BSC) with a given crossover probability; and channel D is a BSC with another given crossover probability.

The conventional source–channel separation theorem implies that the minimum bit-error rate with full encoding of both channels is determined by (cf. (2.8))

(4.1)

where the capacity of a BSC equals one minus the binary entropy (in bits) of its crossover probability, and the rate-distortion function of the Bernoulli symmetric source with Hamming distortion equals one minus the binary entropy of the distortion.

According to Theorem 2.1, when the systematic uncoded part is transmitted over channel A, the minimum achievable bit-error rate is determined by equating the Wyner–Ziv rate-distortion function with the capacity of channel D, where the relevant Wyner–Ziv rate-distortion function is that of a Bernoulli source with a BSC side-information channel. This function was obtained by Wyner and Ziv in [4, Sec. II] (the source/side-information pair is referred to therein as the doubly symmetric binary source):

(4.2a)

(4.2b)

where the asterisk stands for binary convolution and the critical distortion is the solution of the equation

(4.2c)
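Sketches of (4.1) and (4.2), writing h(·) for the binary entropy function, p_A, p_D for the crossover probabilities of channels A and D, and ∗ for binary convolution (symbols ours; (4.1) assumes one use of each channel per source bit, with other bandwidth ratios the capacities are weighted accordingly):

```latex
% (4.1): fully coded benchmark, Bernoulli(1/2) source, Hamming distortion
1 - h(D_{\min}) \;=\; C_A + C_D, \qquad C_A = 1 - h(p_A), \;\; C_D = 1 - h(p_D)

% (4.2): Wyner--Ziv rate-distortion function, doubly symmetric binary source
% (side-information channel = BSC(p_A)):
R^{WZ}(D) \;=\;
\begin{cases}
  h(p_A \ast D) - h(D), & 0 \le D \le d_c \\[4pt]
  \dfrac{p_A - D}{p_A - d_c}\,\bigl[h(p_A \ast d_c) - h(d_c)\bigr], & d_c \le D \le p_A
\end{cases}
% (4.2c): d_c is the point where the straight line to (p_A, 0) is tangent to
% h(p_A * D) - h(D), i.e., where the two branches have equal slope.
```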

2 We maintain here the notation channel A and channel D, though in the present case digital uncoded information is transmitted through channel A.

A simple derivation of the binary Wyner–Ziv rate-distortion function is given in [11] in terms of the additive-noise rate-distortion function.

The special case where both channels are identical is considered in Fig. 4, which shows the achievable bit-error rate for systematic and nonsystematic rate-1/2 coding as a function of the crossover probability of the channels. It is noted that for reliable communication (vanishing bit-error rate), no loss is incurred by systematic encoding, as is already known [2], [3]. However, we see in Fig. 4 that for any nonzero bit-error rate in the range of interest, systematic encoding is strictly suboptimal. For any pair of crossover probabilities (not necessarily equal), two of the three required equalities are satisfied, namely those corresponding to conditions C1 and C3. Only the fact that the Wyner–Ziv rate-distortion function strictly exceeds the conditional rate-distortion function at every intermediate distortion accounts for the suboptimality of systematic coding.

We now propose a constructive approach to achieve the Wyner–Ziv rate-distortion function in (4.2). This approach (for which we do not provide a full proof) is a substantive generalization of Wyner's construction of Slepian–Wolf codes from linear block channel codes [13]. We will assume that channel D is noiseless, with a given number of channel uses (on the average) per source sample, for otherwise conventional channel codes can be employed to convey the output of the source encoder to the receiver with arbitrary reliability.

We will focus on bit-error rates below the critical distortion of (4.2c); for larger target bit-error rates the strategy that follows should be time-shared with no coding. Choose two linear codes, Code 1 and Code 2, defined by parity-check matrices such that

(4.3)


Fig. 4. Minimum bit-error rate achievable at rate 1/2, above the capacity of the binary-symmetric channel.

and Code 2 is a subcode of Code 1. Thus the parity-check matrices of the two codes satisfy

(4.4)

Every codeword of Code 1 satisfies

(4.5a)

where the superscript T stands for the transpose operation. If, in addition,

(4.5b)

then the codeword is also a codeword of Code 2. The decoders for these codes are defined by functions of the corresponding syndromes

(4.6)

computed with modulo-2 addition. According to well-known properties of optimal linear codes and classical random linear coding arguments, it follows that the codebooks and decoding functions of Codes 1 and 2 can be chosen so that

(4.7)

for most realizations of a Bernoulli process with the design quantization parameter, and for most realizations of a Bernoulli process with the combined noise parameter.

The Wyner–Ziv encoding of the binary source word consists of two steps.

1) Among the codewords in Code 1, select the one closest to the source word in Hamming distance, and form the quantization error (the modulo-2 difference between the source word and the selected codeword).

2) Output the vector of syndrome bits that identifies the coset of Code 2 (within Code 1) to which the selected codeword belongs.

Note that the output rate of the Wyner–Ziv encoder is

(4.8)

The decoder receives this syndrome and the output of the BSC due to the source word,

(4.9)

Since the selected codeword belongs to Code 1, its Code 1 syndrome vanishes and

(4.10)

The decoder outputs the codeword

(4.11)

If the quantization error were a Bernoulli process, the output of the decoder would, with high probability, equal the selected codeword,

(4.12)


in which case the decoder obtains a distorted version of the input within the desired distortion (bit-error rate). By assumption, the channel noise is Bernoulli and is independent of the quantization error. Furthermore, for increasingly long blocklengths, the distribution of the quantization error will resemble that of independent binary trials with the design parameter. This can be expected from the backward-channel interpretation of the rate-distortion function and the asymptotic rate-distortion optimality of linear block codes for the binary/Hamming case [14, Sec. VII.3], which imply that the quantization error resembles the noise process of the backward channel (a BSC in this case).
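The nested-code construction lends itself to a small brute-force illustration. The sketch below is ours, not the authors' code: it takes Code 1 to be the (7,4) Hamming code, Code 2 the (7,1) repetition subcode, and it transmits the index of the coset of Code 2 inside Code 1 (the role played by the extra syndrome bits) over an assumed noiseless channel D.

```python
# Toy brute-force sketch of the nested-linear-code Wyner-Ziv scheme described above.
# Code 1 = (7,4) Hamming code (quantizer); Code 2 = (7,1) repetition code, a subcode
# of Code 1 (channel code against the BSC side-information noise).  Illustrative only.
import itertools, random

N = 7
G1 = [  # generator of the (7,4) Hamming code
    (1, 0, 0, 0, 0, 1, 1),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 1, 1, 0),
    (0, 0, 0, 1, 1, 1, 1),
]

def span(gen):
    """All binary linear combinations of the generator rows."""
    words = set()
    for coeffs in itertools.product((0, 1), repeat=len(gen)):
        words.add(tuple(sum(c * g[i] for c, g in zip(coeffs, gen)) % 2 for i in range(N)))
    return sorted(words)

code1 = span(G1)                      # 16 codewords, covering radius 1
code2 = [(0,) * N, (1,) * N]          # repetition code; the all-ones word is in code1

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

# Partition Code 1 into cosets of Code 2; the coset index plays the role of the
# extra syndrome bits sent over the (noiseless) channel D: 3 bits per 7 source bits.
cosets, index_of = [], {}
for c in code1:
    if c not in index_of:
        coset = [tuple((x + z) % 2 for x, z in zip(c, w)) for w in code2]
        for v in coset:
            index_of[v] = len(cosets)
        cosets.append(coset)

def encode(x):
    """Quantize x with Code 1, then send only the coset index of the chosen codeword."""
    c = min(code1, key=lambda cw: hamming(x, cw))
    return index_of[c]

def decode(idx, y):
    """Inside the signalled coset, pick the codeword closest to the side information y."""
    return min(cosets[idx], key=lambda cw: hamming(y, cw))

random.seed(1)
p0, trials, errs = 0.05, 2000, 0
for _ in range(trials):
    x = tuple(random.randint(0, 1) for _ in range(N))
    y = tuple(b ^ (random.random() < p0) for b in x)   # BSC(p0) side information
    errs += hamming(x, decode(encode(x), y))
print("empirical bit-error rate:", errs / (trials * N))
```

With these toy parameters the encoder spends 3 bits per 7 source bits and, for small crossover probabilities, the reconstruction error stays near the 1/8 quantization distortion of the Hamming code; approaching (4.2) would require long, capacity-approaching nested codes.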

V. BERNOULLI SOURCE THROUGH GAUSSIAN CHANNEL

In this section, channel A is an additive Gaussian noise discrete-time channel (of given noise variance) with binary antipodal inputs, and channel D is an additive Gaussian noise discrete-time channel (of given noise variance) with a power-constrained (continuous) input, operating at a given number of channel uses per source information bit. The distortion measure remains the bit-error rate.

The minimum bit-error rate achievable when channel A is coded is given by the conventional source/channel separation theorem

(5.1)

where the two signal-to-noise ratios are those of channels A and D, and the two capacities are those of the Gaussian channel with binary inputs and with average-power-constrained inputs, respectively,

(5.2)

(5.3)
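For reference, sketches of the two capacities in (5.2)–(5.3); the binary-input expression is one standard form (notation ours), with unit noise variance and s = √SNR:

```latex
% (5.3): capacity of the power-constrained (continuous-input) Gaussian channel
C_G(\mathrm{SNR}) \;=\; \tfrac12 \log_2\!\bigl(1 + \mathrm{SNR}\bigr)
\quad \text{bits per channel use}

% (5.2): capacity of the same channel with antipodal (binary) inputs,
% with s = \sqrt{\mathrm{SNR}} and N \sim \mathcal{N}(0,1):
C_{BI}(\mathrm{SNR}) \;=\; 1 - E\bigl[\log_2\bigl(1 + e^{-2 s (s+N)}\bigr)\bigr]
\quad \text{bits per channel use}
```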

If channel A is connected directly to the source, then the achievable bit-error rate is given by (Theorem 2.1)

(5.4)

where the right-hand side of the inequality is the capacity per source information bit of channel D. The left-hand side is the Wyner–Ziv rate-distortion function of a Bernoulli source where the side information is given by the outputs of a binary-input Gaussian channel driven by the uncoded source with the signal-to-noise ratio of channel A. In Appendix II, it is shown that this function is equal to the lower convex envelope3

(5.5)

3An alternative description is given in terms of an auxiliary time-sharingrandom variable. See [11] and Appendix II (II.7).

of the function which is defined in parametric form, with a scalar parameter, via

(5.6a)

(5.6b)

Note that the distortion achievable at zero rate is the bit-error rate of the uncoded channel (the Gaussian tail function evaluated at the square root of the SNR of channel A), and thus the Wyner–Ziv function vanishes at that distortion.

In Fig. 5, we illustrate (5.1) and (5.4) in the case where the signal-to-noise ratios of channels A and D are equal.

As we would expect, systematic coding is strictly suboptimal. In this case, only one of the three required equalities in (2.10) is satisfied, namely condition C1, as equiprobable inputs maximize the capacity of the binary-input Gaussian channel. Thus even in the hypothetical case in which the uncoded channel output were available at the encoder, the performance would still be suboptimal since condition C3 is not satisfied.

In the second part of this section we examine the case of overlaid communication over a single Gaussian channel, where neither additional bandwidth nor additional power is allocated to boost performance; see [3]. Degradation of communication performance over the existing channel is traded off for enhanced performance of a receiver which accesses the outputs of both channels A and D. In this scenario channels A and D are, in fact, induced by the same physical channel over which the overlaid communication is operated.

The block diagram of the system is shown in Fig. 6, where the original power is divided into a part assigned to the overlaid coded system and a part left for the use of the uncoded system. The coding on channel D consists of Wyner–Ziv source encoding followed by optimal channel coding.

The overlaid communication link has been analyzed in [3], where the goal was to provide arbitrarily reliable communication for the enhanced system at the expense of degrading the performance of the uncoded link. It was shown in [3] that minimizing the effect on the uncoded channel while preferring arbitrarily reliable communication at the highest possible rate


Fig. 5. Minimum bit-error rate for transmission of a Bernoulli source via a Gaussian channel.

Fig. 6. Overlaid encoded communication of a binary source on a Gaussian channel.

requires that the power assigned to the coded system be chosen as the smallest value satisfying

(5.7)

and the corresponding performance of the uncoded link is degraded from its original error probability to

(5.8)

We will extend the result of [3] on overlaid communication by considering an ad hoc scheme which trades off degradation of the error probability of the uncoded link for improvement of the error probability of the coded link. The optimal tradeoff between these error probabilities with the overlaid communication scheme, where no additional power or bandwidth is available and therefore the encoder must share resources with the uncoded raw information, is unknown. This is why we adhere to a specific system which offers a reasonable tradeoff.

In the extreme case, where the error probability of the coded system vanishes, our result particularizes to [3], whereas in the other extreme, where no power is allocated to the coded system, we get

(5.9)

The performance in [3] was achieved by an overlaid look-ahead communication scheme which guarantees no interference from the uncoded part to the coded part. This is no longer feasible when distortion is allowed since the receiver cannot replicate the source noiselessly. In this case, we devise a more involved version of look-ahead encoding, as described in the following.

We now examine two different strategies. The first, which is somewhat easier to describe, is usually useful only when a minuscule degradation in the performance of the uncoded link is allowed. In this case the channel-coded part (channel D coding only) is first decoded reliably and canceled out. Since reliable decoding of the channel D codes is possible as long as the capacity of channel D is not exceeded, the interference from the coded part to the uncoded systematic transmission can be eliminated when we consider the uncoded channel as


Fig. 7. Bernoulli source through Gaussian channel. Dotted/solid line is strategy 1/2.

the side-information channel. The coded transmission, however, suffers from the interference due to the uncoded communication. Thus the signal-to-noise ratios over the uncoded and coded channels are given, respectively, by

(5.10)

The capacity of channel D is lower-bounded by

(5.11)

where the lower bound is obtained by replacing the binary interference by the worst case Gaussian interference with the same power.

If we assume a Gaussian-distributed random codebook for the channel D codes and if the Euclidean decoding metric is adopted, then, as shown in [15], the right-hand side of (5.11) constitutes indeed the highest possible reliably transmitted rate under these mismatched metric conditions and random coding. Since the uncoded transmission for the standard existing receiver is contaminated by the coded transmission, the resulting bit-error probability increases to

(5.12)

Note that the coded transmission resembles Gaussian noise, as a “Gaussian” codebook is used. The degradation in performance, as compared to the error probability of the original full-power uncoded system, follows from the signal power reduction on one hand and the noise enhancement due to the coded part on the other. The achievable distortion of the upgraded system, which decodes the information on the basis of both the coded and systematic parts of the transmission, is characterized by Theorem 2.1:

(5.13)

where channel A—the side-information channel—enjoys the post-cancellation signal-to-noise ratio. The tradeoff parameter is the power allocated to the coded transmission. The curve of distortion versus uncoded-link bit-error probability for this strategy is shown in Fig. 7, where it is produced in parametric form, with the coded power as the parameter.
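One consistent reading of (5.10)–(5.12) for this first strategy, under the assumed power split (P' of the total power P re-assigned to the coded signal, noise variance σ²; symbols ours), is the following sketch; the original displays may differ:

```latex
% SNRs in strategy 1 (coded part decoded first and cancelled at the enhanced receiver):
\mathrm{SNR}_A \;=\; \frac{P - P'}{\sigma^2}
\;\;(\text{side-information channel, after cancellation}),
\qquad
\mathrm{SNR}_D \;=\; \frac{P'}{\sigma^2 + (P - P')}

% (5.11): Gaussian-interference lower bound on the capacity of the coded link
C_D \;\ge\; \tfrac12 \log_2\!\bigl(1 + \mathrm{SNR}_D\bigr)

% (5.12): bit-error rate of the legacy receiver, which cannot cancel the coded signal
p \;=\; Q\!\left(\sqrt{\frac{P - P'}{\sigma^2 + P'}}\,\right)
```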

We turn now to the second approach, in which the interference caused by the uncoded transmission to the coded part is to be reduced. Towards this end we shall employ a version of look-ahead encoding. The input information bits of the source are grouped into superblocks, where each superblock consists of a number of blocks of raw information bits and where each block comprises a fixed number of information bits. The blocks are encoded separately by the Wyner–Ziv source encoder and then re-encoded by an appropriate channel code of the corresponding length. The consecutive blocks of the coded information are grouped together and form a coded superblock. Each superblock of uncoded raw information is interleaved by an interleaver and transmitted. Superposed on this transmission is the coded version of the following superblock, which constitutes the look-ahead part of the encoding.

Each superblock of the uncoded information is thus received by the decoder when its coded counterpart is already available, as it has been superimposed on the preceding unencoded superblock. The uncoded transmission suffers the interference from the coded part, which again can be assumed Gaussian, as we specialize here to a “Gaussian” codebook [7], [15].


Therefore, the SNR in the uncoded channel is again reduced, reflecting both the signal power reduction and the noise enhancement due to the coded transmission. In contrast to the case examined in [3], no ideal cancellation of the interference due to the systematic transmission is possible, as in general the achievable bit-error rate is not negligible. Nevertheless, the channel code can partially reduce the interference by using the Wyner–Ziv source code that is associated with the interfering systematic transmission. Consider the coded, the systematic, and the Gaussian noise components in the transmission of a given block, with powers equal to the coded power, the remaining uncoded power, and the noise variance, respectively.

According to the look-ahead strategy, at the time a coded block is transmitted, the Wyner–Ziv description of the systematic block with which it interferes is already available to the digital link. Thus the discrete-time channel viewed by the coded transmission is given by

(5.14)

in which the Wyner–Ziv description of the interfering systematic block is available at the decoder.

Each coded block carries the Wyner–Ziv code for a source block. For an optimal Wyner–Ziv coder, the statistical relation between the source block and its Wyner–Ziv description corresponds to the Wyner–Ziv “test” channel; see definition (2.2) and [11]. In fact, using standard random coding arguments, and due to the interleaving, the channel in (5.14) can be rewritten in terms of single-letter quantities as

(5.15)

in which the auxiliary random variable (jointly distributed with the systematic input) that achieves the Wyner–Ziv function (2.2) is available at the decoder, and the remaining components are independent. Clearly, the channel encoder can hope to achieve the conditional capacity of this channel given the auxiliary variable.

We proceed by finding a lower bound for this capacity. For that, we assume that the code is Gaussian (i.e., a Gaussian-like codebook is employed) and white (due to the interleaver), so the analog (side-information) channel (channel A) is effectively an additive white Gaussian noise (AWGN) channel. For this case, Appendix II shows that the Wyner–Ziv function is achieved by a binary-symmetric test channel (or by a convex combination of such channels), whose additive binary noise is independent of the source and whose crossover probability is determined by the overall bit-error rate via the parameter in (5.6). Now, for convenience, replace the modulo-2 sum above by the equivalent real multiplication, assuming that the source and the binary noise are bipolar variables. Substituting in (2.2), and manipulating the conditioning by multiplying the received signal by the known auxiliary variable, while noting that its square is unity, we obtain an equivalent channel in which the auxiliary variable is available. But since the binary noise and the systematic input are symmetric and independent, the resulting product variables are statistically independent of both the coded input and the auxiliary variable, and are distributed as the original systematic and noise components, respectively. We thus conclude that the channel viewed by the coded transmission is effectively

(5.16)

Employing, as mentioned above, a Gaussian codebook for the coded transmission, we can achieve at least the Gaussian capacity (i.e., the capacity calculated as if the bipolar interference variable were Gaussian with the same variance) [15]

(5.17)

where the crossover probability (and hence the effective residual interference) is determined by the overall bit-error rate; see Appendix II and (5.6). This is the desired lower bound for the capacity of the digital link.

The distortion is now determined by the least value for which the equation

(5.18)

holds (note that both sides depend on the power split and on the Gaussian side-information channel (channel A) with its post-cancellation SNR). The degraded performance of the systematic uncoded transmission equals, in both strategies, the expression in (5.8).

The plot of distortion versus uncoded-link error probability is shown in Fig. 7 for several SNR values. It is noted that when the uncoded-link error probability must be kept close to its original value, the first approach is usually preferable, while in cases where a small distortion is the target, the second approach is mostly advantageous. In fact, when the channel SNR is less than about 3 dB, the first approach dominates the second one for all values of the power split, while for higher SNR values there is a crossover point, and for larger allocations of coded power the second approach is advantageous. For SNR larger than about 7 dB the second approach dominates. The second approach reveals an interesting anomaly in performance for small values of the coded power, where both error probabilities increase together. This occurs because the degradation suffered by the uncoded channel (as reflected by its reduced SNR) is not counterbalanced by the capacity of the digital channel (channel D) as given in (5.17). In contrast, the performance of the first approach is monotonic. Note that when enough power is allocated so that (5.7) is satisfied, fully reliable performance of the enhanced system becomes possible. By time-sharing both approaches, the lower convex envelope of the two curves in Fig. 7 reflects the tradeoff between boosting the performance of the upgraded system and degrading the uncoded link.

We emphasize that the specific overlaid communication scheme considered here is by no means optimal;4 it constitutes a particularly interesting example with reasonably good performance where a nontrivial performance tradeoff is demonstrated. Other interesting options can be treated: for example, when the Wyner–Ziv encoder is replaced by the Kaspi–Heegard–Berger [16], [17] encoder, which operates either with or without the outputs of the uncoded channel (the side-information channel). In this case, the partial cancellation

4 For example, a Gaussian-like codebook was assumed, which is not necessarily optimal in the presence of the non-Gaussian residual interference.


of the interference caused by the uncoded transmission can be done without the assistance of the outputs of the uncoded channel.

VI. CONCLUSIONS

It is wasteful to simply discard the existing analog channel in digital broadcast receivers. Through adequate signal processing and coding/decoding, this paper has established that if the source and noise are Gaussian and the analog channel is transmitted in single sideband, its bandwidth/power can be used as efficiently as if we were able to design a completely digital system from scratch. We emphasize that the burden to achieve such full bandwidth utilization falls exclusively on the digital encoder and on the decoder for the analog–digital receiver; the existing analog transmitters/receivers are unaffected.

The designer can choose how to use the “digital” channel bandwidth. It can improve the fidelity of the output signal relative to that demodulated by the analog receiver; it can add new source information bandwidth (for example, additional screen lines for a high-definition television system); or it can do both. Clearly, the more ambitious the bandwidth boosting is, the lower the capability for increased output signal-to-noise ratio. At one extreme, the designer may want to transmit an information bandwidth which exceeds the channel bandwidth; in this case, the analog transmission is strictly suboptimum. Assuming Gaussian sources and channels, if the information bandwidth to be transmitted is equal to the total channel bandwidth (analog plus digital), then we have shown that not only does the existing analog channel incur no loss of capacity provided it is single-sideband modulated, but there is no modulation method for the digital channel—analog or digital—which gives better SNR than single sideband. If the desired information bandwidth is less than the transmitted bandwidth (analog plus digital), we have shown that sophisticated encoding for the digital channel can render SSB modulation optimal for the analog channel. We have established the fundamental tradeoff between the output SNR and the transmitted information bandwidth: the product of the information bandwidth and its SNR (in decibels) is constant and equal to its value when the reproduced signal bandwidth is equal to the channel bandwidth, i.e., equal to the channel bandwidth times the uncoded SNR in decibels, where the uncoded SNR is one plus the ratio of the spectral levels of signal and noise. In particular, this formula enables the computation of the SNR achievable when the digital channel is used for SNR enhancement and no bandwidth boosting is required.

The analysis of binary sources has yielded the conclusion that systematic transmission through either binary-symmetric channels or Gaussian channels is suboptimal in terms of the resulting bit-error rate. This result is somewhat surprising in view of the almost-noiseless case, where it is well known that not only do systematic codes (either linear or not) not incur a loss in capacity, but they also enjoy the best error exponent at rates between the critical rate and capacity.

We have also introduced a general structure of the optimal (in terms of Theorem 2.1) systematic code for the BSC channel, via the implementation of the Wyner–Ziv rate-distortion function in terms of good structured binary codes.

This paper and its predecessor [3] have shown that the fundamental information-theoretic results of Slepian–Wolf [6] and Wyner–Ziv [4] lead to important conclusions in channel coding with uncoded side information.

APPENDIX I
CONDITIONAL RATE-DISTORTION FUNCTION

Fact: The conditional rate-distortion function (2.11) admitsthe following expression:

(I.1)

Proof: We can expand the optimization set in (I.1) as the set of joint distributions such that if we sum out and the resulting distribution is

(I.2)

and

(I.3)

Under condition (I.3), the last term in the following equation is zero

(I.4)

Thus the right-hand side of (I.1) is greater than or equal to , where the infimum is over the set of joint distributions that satisfy the above conditions. Dropping (I.3) can only further lower-bound the resulting expression, at which point can be eliminated from consideration since it does not affect either the penalty function or the optimization set. Thus we have established

(I.5)

To show the reverse inequality, we add the following additional condition to the feasible set of joint distributions in (I.1):

(I.6)

which implies that , (I.3) is then automatically satisfied, and both the penalty function and the feasible set become those on the right-hand side of (I.5).
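As a numerical companion to this appendix (our own sketch, not the paper's), a point on the conditional rate-distortion curve of a finite-alphabet pair can be traced by running the standard Blahut–Arimoto recursion at a common Lagrangian slope for each value of the conditioning variable and averaging rates and distortions; every function and variable name below is ours.

import numpy as np

def blahut_arimoto_slope(p_x, dist, beta, iters=500):
    # One (distortion, rate) point on R_X(D) for source p_x and distortion matrix
    # dist, at Lagrange parameter beta >= 0 (larger beta favors smaller distortion).
    q = np.full(dist.shape[1], 1.0 / dist.shape[1])      # reproduction marginal
    for _ in range(iters):
        w = q[None, :] * np.exp(-beta * dist)            # unnormalized test channel
        p_y_given_x = w / w.sum(axis=1, keepdims=True)
        q = p_x @ p_y_given_x                            # update reproduction marginal
    D = float(np.sum(p_x[:, None] * p_y_given_x * dist))
    R = float(np.sum(p_x[:, None] * p_y_given_x * np.log2(p_y_given_x / q[None, :])))
    return D, R

def conditional_rdf_point(p_u, p_x_given_u, dist, beta):
    # Average the per-u points obtained at the same slope; by convexity of each
    # per-u rate-distortion function this yields a point on the conditional curve.
    pts = [blahut_arimoto_slope(p_x_given_u[u], dist, beta) for u in range(len(p_u))]
    D = sum(p_u[u] * pts[u][0] for u in range(len(p_u)))
    R = sum(p_u[u] * pts[u][1] for u in range(len(p_u)))
    return D, R

# Example: binary source, binary side variable, Hamming distortion.
dist = np.array([[0.0, 1.0], [1.0, 0.0]])
p_u = np.array([0.5, 0.5])
p_x_given_u = np.array([[0.9, 0.1], [0.1, 0.9]])
for beta in (1.0, 3.0, 6.0):
    print(conditional_rdf_point(p_u, p_x_given_u, dist, beta))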



APPENDIX II
WYNER–ZIV RATE DISTORTION OF A BINARY SOURCE WITH A GAUSSIAN SIDE INFORMATION CHANNEL

In this appendix we calculate the Wyner–Ziv rate distortion function (2.2)

(II.1)

of a binary-symmetric source , with respect to the Hamming distortion measure, when the decoder has access to the noisy side information , where is Gaussian and where is here a scaling factor which stands for power. Our technique is inspired by the general approach for evaluating and bounding the Wyner–Ziv rate-distortion function recently introduced in [11].

Let be the minimum estimation error defined in (2.1) with the Hamming distortion. Let be a binary random variable, independent of , such that

Define the function in parametric form

(II.2)

(II.3)

where denotes modulo-2 addition. Since we are dealing with the bipolar alphabet rather than , it is convenient to adopt the convention that is equivalent to real multiplication, i.e., (“no error”), while (“error”). Note that as varies from to , the function ranges from to , while ranges from to

By symmetry it is easy to verify that the maximum a posteriori (MAP) estimator of from is

Thus is the error variable and

(II.4)

is the conditional probability of error given in MAP estimation of from . To get an explicit form of using (though still in a parametric form), we decompose the mutual information of the additive channel in (II.2) into a difference of binary entropies, noting that implies , to obtain

(II.5)

where is the binary entropy, denotes the binary convolution operator (Section IV), and the expectation is taken with respect to
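For concreteness, the two standard ingredients appearing in (II.5), the binary entropy function and the binary convolution operator, can be written as follows (a small sketch in our own notation; neither the function names nor the clipping constants come from the paper).

import numpy as np

def hb(x):
    # Binary entropy in bits, with clipping so that hb(0) = hb(1) = 0 numerically.
    x = np.clip(np.asarray(x, dtype=float), 1e-300, 1 - 1e-16)
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

def bconv(a, b):
    # Binary convolution a*(1-b) + (1-a)*b: the crossover probability of two
    # independent binary-symmetric channels in cascade.
    return a * (1 - b) + (1 - a) * b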

The MAP estimation rule for based on the two measurements in (II.3) boils down to

(II.6a)

and the corresponding distortion, i.e., error probability, is

(II.6b)
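To make the two-measurement MAP rule concrete, here is a small Monte Carlo sketch under our own parametrization (a bipolar bit X, a Gaussian observation of amplitude a, and a bipolar error variable of probability delta); the decision adds the two log-likelihood ratios and takes the sign. The names and parameters are ours, not the paper's.

import numpy as np

def map_error_mc(a, delta, n=200_000, seed=0):
    # Monte Carlo estimate of the MAP error probability for a bipolar bit X
    # observed through Y = a*X + N (N standard Gaussian) and V = X*Z, where
    # Z = -1 with probability delta and Z = +1 otherwise.
    rng = np.random.default_rng(seed)
    x = rng.choice([-1.0, 1.0], size=n)
    y = a * x + rng.standard_normal(n)
    z = np.where(rng.random(n) < delta, -1.0, 1.0)
    v = x * z
    llr = 2.0 * a * y + v * np.log((1.0 - delta) / delta)   # LLR from Y plus LLR from V
    xhat = np.where(llr >= 0.0, 1.0, -1.0)
    return float(np.mean(xhat != x))

print(map_error_mc(a=1.0, delta=0.1))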

Next define the function,

(II.7)

The main result of this appendix is summarized as follows.
Proposition:

(II.8)

Proof: We first prove the “direct” part of (II.8), i.e.,

(II.9)

From definition (II.7), for each the value is a convex combination of two points on . Thus for each we can find a binary random variable (“time-sharing variable”), such that the pair is independent of , and

(II.10)

where

(II.11)

and where the second equality in (II.10) follows from the chain rule, noticing that . Observe that since is independent of , we have the following Markov chain:

(II.12)

By (II.11) and (II.12) the pair satisfies the conditions of the minimization (II.1), so by (II.10) must upper-bound , and (II.9) follows.
We turn to show the converse part of (II.8), i.e.,

(II.13)

by means of four lemmas.
Lemma II.1: is independent of . Furthermore, is independent of .
Proof: The first part of the lemma is equivalent to

or, by the Bayes rule, to

(II.14)

for every value of and . To show (II.14), note that if then (“error”), and (“no error”). Similarly, if



then (“error”), and (“no error”). Thus

(II.15)

where follows from the independence of and ; follows from the symmetry of ; and follows like . Similarly, we have

Thus we can write

(II.16)

Now, from the symmetry of it follows that the conditional density of given is equal to the conditional density of given , so is independent of . This fact, together with the symmetry of , implies by the Bayes rule that

(II.17)

Multiplying the left/right sides of (II.16) and (II.17), we obtain (II.14), and the first part of the lemma is proved.

The second part of the lemma follows straightforwardly from the first part and the fact that

Lemma II.2: If is such that form a Markov chain, then the pair is independent of the pair .

Proof: Note that . By standard manipulations of mutual information, we obtain

(II.18)

where the last equality follows since, by Lemma II.1, . Now, if and are conditionally independent given , then the first term in (II.18) is zero, implying that the last term is also zero, which implies that are independent of .
Lemma II.3: For any

(II.19)

Proof: Suppose that the measurements are given. Then, any estimator of induces an estimator of , whose error is . Thus and (II.19) follows.
Lemma II.4: If is independent of , then

(II.20)

Proof: We have

(II.21a)

(II.21b)

(II.21c)

(II.21d)

(II.21e)

where (II.21a) follows by Lemma II.1 ( is independent of ); (II.21b) follows by manipulating the condition; (II.21c) follows by the definition of in (II.2) and (II.3); (II.21d) follows from Lemma II.3; and (II.21e) follows from Lemma II.1.

We are now in a position to show (II.13). Let be any random variable such that form a Markov chain, and . Let denote expectation over . Consider the following chain of equalities/inequalities:

(II.22a)

(II.22b)

(II.22c)

(II.22d)

(II.22e)

(II.22f)

(II.22g)

(II.22h)

(II.22i)

which are justified by

(II.22a)–(II.22c) standard manipulations and the chain rule,
(II.22d) the first term in (II.22c) is zero by Lemma II.2, and the last term in (II.22c) is zero by Lemma II.1 (note that ),
(II.22e) the definition of conditional mutual information,
(II.22f) Lemma II.4, noting that by Lemma II.1 is independent of for any value of ( here plays the role of in Lemma II.4),
(II.22g) Lemma II.3, and the fact that ,
(II.22h) Jensen's inequality, using the fact that lower-bounds and is convex,
(II.22i) is monotonically decreasing and .

The desired inequality (II.13) now follows from (II.22a) and the definition of in (II.1). Direct evaluation of in (II.5), along with some standard properties of the binary entropy function (such as ), yields the expression in (5.6).

We call attention to the fact that the function in the case where the side information is supplied by a BSC (4.2) can also be expressed in the form of (II.8) [11].
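As a hedged illustration of that form, the classical binary Wyner–Ziv example with BSC(p) side information is usually written as the lower convex envelope of the curve h(d conv p) - h(d) (with conv denoting binary convolution) together with the point (p, 0); the self-contained sketch below evaluates that envelope numerically. The code and its names are ours and are not taken from the paper or from [11].

import numpy as np

def hb(x):
    # Binary entropy in bits, with clipping so that hb(0) = hb(1) = 0 numerically.
    x = np.clip(np.asarray(x, dtype=float), 1e-300, 1 - 1e-16)
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

def wz_binary_bsc(p, num=2001):
    # Sample g(d) = h(d conv p) - h(d) on [0, p], append the point (p, 0), and
    # return the vertices of the lower convex envelope of the sampled points.
    d = np.linspace(0.0, p, num)
    g = hb(d * (1 - p) + (1 - d) * p) - hb(d)
    pts = [(float(x), float(y)) for x, y in zip(d, g)] + [(float(p), 0.0)]
    hull = []
    for x, y in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # drop the middle point if it lies on or above the chord to the new point
            if (y2 - y1) * (x - x1) >= (y - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append((x, y))
    return hull

for D, R in wz_binary_bsc(0.25)[::400]:
    print(f"D = {D:.3f}, R = {R:.3f} bits")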



ACKNOWLEDGMENT

Discussions with S. Litsyn are gratefully acknowledged.

REFERENCES

[1] P. Elias, “Coding for noisy channels,” in IRE Conv. Rec., vol. 4, Mar. 1955, pp. 37–46.

[2] E. M. Gabidulin, “Limits for the decoding error probability when linear codes are used in memoryless channels,” Probl. Pered. Inform., vol. 3, pp. 55–62, 1967.

[3] S. Shamai (Shitz) and S. Verdú, “Capacity of channels with uncoded side information,” Europ. Trans. Telecommun., vol. 6, no. 5, pp. 587–600, Sept.–Oct. 1995.

[4] A. D. Wyner and J. Ziv, “The rate-distortion function for source coding with side information at the decoder,” IEEE Trans. Inform. Theory, vol. IT-22, pp. 1–10, Jan. 1976.

[5] A. D. Wyner, “The rate-distortion function for source coding with side information at the decoder—II: General sources,” Inform. Contr., vol. 38, pp. 60–80, 1978.

[6] D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inform. Theory, vol. IT-19, pp. 471–480, July 1973.

[7] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.

[8] T. J. Goblick, “Theoretical limitations on the transmission of data from analog sources,” IEEE Trans. Inform. Theory, vol. IT-11, pp. 558–567, Oct. 1965.

[9] J. Ziv, “The behavior of analog communication systems,” IEEE Trans. Inform. Theory, vol. IT-16, pp. 587–594, Sept. 1970.

[10] H. L. Van Trees, Detection, Estimation and Modulation Theory, Pt. II: Nonlinear Modulation Theory. New York: Wiley, 1971.

[11] R. Zamir, “The rate loss in the Wyner–Ziv problem,” IEEE Trans. Inform. Theory, vol. 42, pp. 2073–2084, Nov. 1996.

[12] R. M. Gray, “A new class of lower bounds to information rates of stationary sources via conditional rate-distortion functions,” IEEE Trans. Inform. Theory, vol. IT-19, pp. 480–489, July 1973.

[13] A. D. Wyner, “Recent results in the Shannon theory,” IEEE Trans. Inform. Theory, vol. IT-20, pp. 2–10, Jan. 1974.

[14] A. J. Viterbi and J. K. Omura, Principles of Digital Communication and Coding. New York: McGraw-Hill, 1979.

[15] A. Lapidoth, “On information rates for mismatched decoders,” in Proc. 2nd Int. Winter Meet. on Coding and Information Theory (Institut für Experimentelle Mathematik, Universität GH Essen, Essen, Germany, Dec. 12–15, 1993), p. 26.

[16] A. H. Kaspi, “Rate-distortion function when side-information may be present at the decoder,” IEEE Trans. Inform. Theory, vol. 40, pp. 2031–2034, Nov. 1994.

[17] C. Heegard and T. Berger, “Rate distortion when side information may be absent,” IEEE Trans. Inform. Theory, vol. IT-31, pp. 727–734, Nov. 1985.

[18] S. Shamai (Shitz), S. Verdú, and R. Zamir, “Digital broadcasting back-compatible with analog broadcasting: Information theoretic limits,” in 5th European Space Agency Int. Workshop on Digital Signal Processing Techniques Applied to Space Communications (Sitges, Barcelona, Spain, Sept. 25–27, 1996), pp. 8.25–8.39.

[19] ——, “Information theoretic aspects of systematic coding,” in Proc. Int. Symp. on Turbo Codes and Related Topics (ENST de Bretagne, Brest, France, Sept. 1997), pp. 40–46.