
    1290 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 4, MAY 2001

The Capacity of Discrete-Time Memoryless Rayleigh-Fading Channels

Ibrahim C. Abou-Faycal, Student Member, IEEE, Mitchell D. Trott, Member, IEEE, and Shlomo Shamai (Shitz), Fellow, IEEE

Abstract: We consider transmission over a discrete-time Rayleigh fading channel, in which successive symbols face independent fading, and where neither the transmitter nor the receiver has channel state information. Subject to an average power constraint, we study the capacity-achieving distribution of this channel and prove it to be discrete with a finite number of mass points, one of them located at the origin. We numerically compute the capacity and the corresponding optimal distribution as a function of the signal-to-noise ratio (SNR). The behavior of the channel at low SNR is studied and finally a comparison is drawn with the ideal additive white Gaussian noise channel.

Index Terms: Capacity, fading channels, memoryless fading, Rayleigh fading, time-varying channels.

    I. INTRODUCTION

CHANNELS exhibiting fading and dispersion are often used for communication purposes, especially on a wireless medium. Perhaps the best known examples of such channels are radio links. The fading may be modeled as Rayleigh-distributed whenever there are a large number of independent scatterers and no line-of-sight path. One would like to understand how these time-varying channels differ from time-invariant ones, and perhaps obtain some qualitative results, for example, a characterization of the fading conditions under which capacity is close to the capacity of a Gaussian channel.

In this paper we study the memoryless discrete-time Rayleigh-fading channel

$$ y_n = a_n x_n + v_n \qquad (1) $$

where $x_n$ is the channel input, $y_n$ is the output, and $a_n$ and $v_n$ are independent complex circular Gaussian random variables with mean zero and variances $\sigma_a^2$ and $\sigma_v^2$, respectively. Equivalently, the amplitude of the fading coefficient $a_n$ is Rayleigh-distributed and its phase is uniform. The discrete time index is denoted by $n$, and we assume that $\{a_n\}$ and $\{v_n\}$ are sequences of independent and identically distributed (i.i.d.) random variables. The input is average-power limited: $E[|x_n|^2] \le P$. Neither the transmitter nor the receiver knows the value of $a_n$ or $v_n$, but both are assumed to know their statistics exactly.

Manuscript received December 29, 1997; revised April 15, 1999. This work was supported in part by the Fares Foundation, the National Science Foundation under Grant NCR-9314341, and by the fund for the promotion of research at the Technion. The material in this paper was presented at the IEEE International Symposium on Information Theory (ISIT '97), Ulm, Germany, June-July 1997.

I. C. Abou-Faycal was with the Laboratory for Information and Decision Systems, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139 USA. He is now with VANU Inc., Cambridge, MA 02140 USA (e-mail: [email protected]).

M. D. Trott was with the Massachusetts Institute of Technology (MIT), Cambridge, MA. He is now with ArrayComm Inc., San Jose, CA 95131 USA (e-mail: [email protected]).

S. Shamai is with the Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa 32000, Israel (e-mail: [email protected]).

Communicated by M. L. Honig, Associate Editor for Communications. Publisher Item Identifier S 0018-9448(01)02847-4.
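As an illustration (not part of the paper's numerical study), the following Python sketch simulates the model (1); the fading and noise variances default to one so that the SNR equals the average input power $P$, and the function and variable names are illustrative choices rather than notation from the paper.

```python
# Minimal simulation sketch of the channel model (1); illustrative only.
import numpy as np

rng = np.random.default_rng(0)


def rayleigh_channel(x, sigma_a=1.0, sigma_v=1.0):
    """One block of i.i.d. channel uses: y_n = a_n * x_n + v_n."""
    n = len(x)
    a = (rng.normal(size=n) + 1j * rng.normal(size=n)) * (sigma_a / np.sqrt(2))
    v = (rng.normal(size=n) + 1j * rng.normal(size=n)) * (sigma_v / np.sqrt(2))
    return a * x + v


# A complex Gaussian input meeting the average power constraint E|x|^2 <= P.
P = 1.0
x = (rng.normal(size=100_000) + 1j * rng.normal(size=100_000)) * np.sqrt(P / 2)
y = rayleigh_channel(x)
print(np.mean(np.abs(x) ** 2), np.mean(np.abs(y) ** 2))   # approximately P and 1 + P
```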

Perhaps surprisingly, this basic channel is not as well understood as the discrete-time additive white Gaussian noise channel. In a 1969 technical report [1], Richters conjectured that the capacity-achieving input distribution for this channel is discrete, a rather unexpected result for a continuous-alphabet channel under an average power constraint. The conjecture was motivated by analytical arguments but not rigorously proved. The main result of the present paper is a proof of Richters' conjecture. As is the case for the peak-amplitude-constrained Gaussian channels studied by Smith [2] and later by Shamai and Bar-David [4], the input distribution that achieves capacity for the fading channel is discrete with a finite number of mass points. The same result holds for Rayleigh diversity channels with any finite number of independent branches, as proved in Appendix III. Another relevant reference to this study is Telatar's work [5] on the capacity and error exponent of this channel for infinite bandwidth.

The model (1) is appropriate in a number of scenarios. For example, a memoryless fading model is reasonable when a narrow-band signal is hopped rapidly over a large set of frequencies, one symbol per hop. The communication security gained by fast hopping comes at the price of decreased capacity; the results developed here help quantify this tradeoff. The model is also appropriate for a slowly time-varying channel in which successive symbols are widely separated in time, as might arise when transmitting opportunistically during guard times in a packet-based system, or when using information-bearing pilot tones for both channel identification and communication. A final motivation for studying this model is to complement the results that apply when fading information is available to the receiver only [6], [7] or to both receiver and transmitter [8], [9].

Let us note, however, that understanding the memoryless fading channel is only a small step toward understanding fading channels in their most general form. In particular, symbol-rate sampling of a rapidly varying continuous-time fading channel does not lead to the model (1). If the time variation in a continuous-time channel is fast enough to cause independent fading from symbol to symbol, there will be significant variation within each symbol, and the output bandwidth will be much larger than the input bandwidth.


Fig. 1. The Kuhn-Tucker condition (8).

As a practical matter, these potential difficulties do not arise. Because mutual information is continuous and strictly concave in the input distribution, the optimal input distribution function changes continuously (in the weak* topology) with the SNR. Also, we have found empirically that two mass points are optimal for low SNR and the required number of mass points increases monotonically with SNR. Strong evidence for these conclusions comes from the Kuhn-Tucker condition (10), which, being necessary and sufficient for optimality, allows us to establish (up to the resolution of the numerical algorithms) that a local maximum found by a descent method is in fact global.
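For an average power constraint such a condition takes the standard Lagrangian form; restated here in the notation of Section I (without reproducing the exact normalization of (8) and (10)), an input distribution $F$ with an associated multiplier $\lambda \ge 0$ achieves capacity if and only if

$$ I(F) + \lambda\bigl(|x|^2 - P\bigr) - \int p(y \mid x)\,\ln\frac{p(y \mid x)}{p(y;F)}\,dy \;\ge\; 0 \qquad \text{for all } x, $$

with equality at the mass points of $F$, where $p(y;F)$ denotes the output density induced by $F$. A quantity of this type is what Fig. 1 plots.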

To apply the Kuhn-Tucker test to a postulated set of mass point locations and probabilities, we first compute the corresponding output density and mutual information, then plot the left-hand side of (8) as a function of the input amplitude. The resulting graph must be nonnegative and must touch zero at the atoms of the input distribution, as in Fig. 1. The curve and the mass point locations and probabilities were computed using a gradient descent method, together with Gauss-Laguerre quadrature to evaluate the necessary integrals. Projected gradients were used to keep the mass point probabilities positive. An alternative optimization method, using a quantized version of Arimoto-Blahut, was too slow to be useful.
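The following Python sketch (illustrative, not the authors' code) shows one way to carry out this computation: each divergence integral is evaluated with Gauss-Laguerre quadrature, and the mutual information is the probability-weighted average of the divergences. The fading and noise variances are normalized to one, and the helper names are ours.

```python
# Mutual information of the memoryless Rayleigh-fading channel for a discrete
# input, via Gauss-Laguerre quadrature; a sketch under unit-variance fading
# and noise, so the SNR equals the average input power P.
import numpy as np
from numpy.polynomial.laguerre import laggauss

NODES, WEIGHTS = laggauss(80)   # rule for integrals of the form  int g(r) e^{-r} dr


def output_density(r, amplitudes, probs):
    """Density of r = |y|^2 under the discrete input: a mixture of
    exponentials with means 1 + |x_i|^2."""
    s = 1.0 + np.asarray(amplitudes, dtype=float) ** 2
    return np.asarray(probs, dtype=float) @ (np.exp(-r[None, :] / s[:, None]) / s[:, None])


def divergence(amp, amplitudes, probs):
    """D( f(.|amp) || f(.) ) in nats; f(r|amp) is exponential with mean 1 + amp**2."""
    s0 = 1.0 + amp ** 2
    f_cond = np.exp(-NODES / s0) / s0
    f_out = output_density(NODES, amplitudes, probs)
    # The factor exp(NODES) cancels the e^{-r} weight implicit in the quadrature rule.
    return float(np.sum(WEIGHTS * np.exp(NODES) * f_cond * np.log(f_cond / f_out)))


def mutual_information(amplitudes, probs):
    """I(X;Y) in nats for mass points at the given amplitudes and probabilities."""
    return sum(p * divergence(a, amplitudes, probs) for a, p in zip(amplitudes, probs))


# Example: a two-point (on-off) input meeting the power constraint P = 1.
P, A = 1.0, 2.0
print(mutual_information([0.0, A], [1.0 - P / A ** 2, P / A ** 2]))
```

The same divergence evaluations supply the left-hand side of the Kuhn-Tucker test described above.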

Low SNR: For low values of the SNR the capacity-achieving input distribution has only two mass points, and, therefore, amounts to on-off keying. One point is always located at zero; the other is easily found with standard one-dimensional optimization techniques, because the power constraint determines the mass point probability as a function of its location. The Kuhn-Tucker condition verifies that the optimized two-point distribution achieves capacity. The optimal mass point locations and probabilities are plotted as a function of the SNR in Fig. 2.
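A sketch of this one-dimensional search, reusing the mutual_information helper from the previous sketch (that reuse, the operating point, and the search bracket are choices of this illustration, not taken from the paper):

```python
# On-off keying at low SNR: the power constraint fixes the on-probability at
# P / A**2, so only the amplitude A of the nonzero mass point is optimized.
import numpy as np
from scipy.optimize import minimize_scalar


def ook_rate(A, P):
    p_on = P / A ** 2
    return mutual_information([0.0, A], [1.0 - p_on, p_on])


P = 0.1                                   # a low-SNR operating point (illustrative)
res = minimize_scalar(lambda A: -ook_rate(A, P),
                      bounds=(np.sqrt(P) * 1.001, 10.0), method="bounded")
A_opt = res.x
print(A_opt, P / A_opt ** 2, ook_rate(A_opt, P))   # location, probability, rate in nats
```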

As the SNR decreases, the probability of the nonzero mass point approaches zero, while its amplitude increases, albeit quite slowly. That the amplitude increases to infinity as the SNR tends to zero is proved by Gallager [10, Theorem 8.6.1].

Higher SNR: As SNR increases, does a new mass point appear apart from the others, or does an existing mass point split in two? Fix [15] considered an analogous problem in rate distortion theory, and concluded from plots of a quantity comparable to Fig. 1 that both possibilities arise. The question of splitting for rate distortion theory was also studied (but not solved) by Rose [16], whose results can also be shown by techniques similar to those in [2].

From a numerical standpoint this question is quite challenging, as mutual information is insensitive to the location of a new mass point with near-zero probability. Some insight can be gained from the Kuhn-Tucker condition, which for our problem suggests that mass points never split and that new mass points appear initially at infinity. For our problem, when the SNR reaches a level where a new mass point is needed, the large-amplitude asymptote of the quantity plotted in Fig. 1 switches suddenly from $+\infty$ to $-\infty$. We conjecture that the analysis of this large-amplitude behavior can be modified to prove that all new mass points enter at infinity.

Fig. 3 shows the capacity (in nats per channel use) of the i.i.d. Rayleigh-fading channel as a function of the power constraint $P$. Figs. 4 and 5 show the locations and probabilities of the mass points in the capacity-achieving input distribution. The mass point with the lowest probability has the highest amplitude, the mass point with the second lowest probability has the second highest amplitude, and so on. The dashed segments in the figures are conjectured curves; numerical optimization became unstable when the lowest mass point probability dropped below a small threshold.

Interestingly, the location of a new mass point initially moves downward as the SNR increases, then moves upward. This peculiar behavior has little engineering consequence, as the probability of the mass point remains negligible until its location begins its upward trend.

The sensitivity of mutual information to the exact number and location of the mass points appears to be small. Fig. 6, for example, shows the maximum mutual information achievable when a distribution with two mass points is used in the SNR region where three or four mass points are optimal. At one such SNR there is only a 4% gap to capacity. Though not shown in the figure, moving the nonzero mass point 10% from its optimal location yields a mutual information that is only 0.5% lower.

Finally, a reference that is worth mentioning is Taricco and Elia's work [18], which provides bounds that reflect the asymptotic low- and high-SNR behavior.

    B. Comparison with the Ideal Gaussian Channel

In Fig. 7, the ratio of the capacity of the ideal additive white Gaussian noise channel to the capacity of the i.i.d. Rayleigh-fading channel is plotted as a function of the power constraint, where the channels are normalized to have the same SNR at the receiver. The graph suggests that the capacity of the fading channel approaches the capacity of the Gaussian channel as the SNR tends to zero. This is indeed the case, as is shown for a continuous-time model in [10, Theorem 8.6.1] and for a discrete-time model in [17, Example 3]. Fig. 7 illustrates, however, that the asymptote is approached quite slowly.

At low SNR, ON-OFF keying with a low duty cycle is optimal, and a capacity-achieving codebook for the fading channel resembles pulse-position modulation. Unlike a Gaussian channel, where energy is spread uniformly over all degrees of freedom, for the fading channel energy becomes more concentrated as bandwidth increases. At moderate SNR, the optimal input distribution for the fading channel resembles a uniform distribution over uniformly spaced levels. At high SNR, the loss in performance due to fading grows rapidly.

Fig. 2. Probability and location of the nonzero mass point.

Fig. 3. Capacity (nats/channel use) versus SNR.

Fig. 4. Optimal nonzero mass point locations as a function of SNR.

    Fig. 5. Corresponding mass point probabilities.

Fig. 6. Capacity for the optimal distribution versus the two-point distribution (nats).

C. Comparison with the Fading Channel with Side Information Given to the Receiver

The case when fading information is available to the receiver was studied by Ericson [6], and later by Ozarow, Shamai, and Wyner [7]. The capacity of the channel with perfect channel state information (CSI) is

$$ C_{\mathrm{CSI}} = e^{1/\mathrm{SNR}}\, E_1\!\left(\frac{1}{\mathrm{SNR}}\right) \qquad (27) $$

where

$$ E_1(x) = \int_x^{\infty} \frac{e^{-t}}{t}\, dt. \qquad (28) $$

Fig. 7. The ratio of Gaussian capacity to Rayleigh capacity with no channel state information (CSI) (solid line), and the ratio of Rayleigh capacity with receiver CSI to Rayleigh capacity with no CSI (dashed line).

The dashed line in Fig. 7 plots the ratio of the capacity of the i.i.d. Rayleigh-fading channel with perfect CSI to the capacity of the Rayleigh-fading channel with no CSI. The graph suggests that the harmful effects of i.i.d. fading arise mainly from the consequent lack of knowledge of the channel at the receiver, not from the time-varying SNR.
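As a numerical illustration (assuming (27)-(28) take the exponential-integral form given above, with unit-variance fading so that the SNR is the only parameter), the following sketch evaluates the receiver-CSI capacity and the Gaussian capacity $\ln(1 + \mathrm{SNR})$ that enters the other curve in Fig. 7.

```python
# Closed-form capacities (nats per channel use) entering the Fig. 7 comparison;
# a sketch assuming unit-variance Rayleigh fading.
import numpy as np
from scipy.special import exp1   # E_1(x), the exponential integral of (28)


def capacity_receiver_csi(snr):
    """E[ln(1 + snr * |a|^2)] = e^{1/snr} * E_1(1/snr) for |a|^2 ~ Exp(1)."""
    return np.exp(1.0 / snr) * exp1(1.0 / snr)


def capacity_awgn(snr):
    return np.log(1.0 + snr)


for snr_db in (0.0, 10.0, 20.0):
    snr = 10.0 ** (snr_db / 10.0)
    print(snr_db, capacity_awgn(snr), capacity_receiver_csi(snr))
```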

    VII. SUMMARY AND DISCUSSION

Motivated by previous work done by Smith [2] and Shamai and Bar-David [4], we have proven what Richters [1] conjectured in his original report: that the capacity-achieving distribution for the discrete-time memoryless Rayleigh-fading channel is discrete with a finite set of mass points. The main results also hold for diversity reception with any finite number of independent branches (see Appendix III).

An immediate direction for future work is the i.i.d. Ricean fading channel, modeled by giving the fading variable $a_n$ in (1) a nonzero mean. The Ricean model is appropriate when there is a line-of-sight path from transmitter to receiver, or when the receiver (and possibly the transmitter) has side information about the fading. We conjecture that the optimal input will again be discrete. Unlike the Rayleigh channel, the Ricean channel has two parameters: the SNR and the ratio of the fading standard deviation to the fading mean. The capacity of this channel when the input is restricted to be Gaussian has been studied by Zhou, Mei, Xu, and Yao [19]. A block-constant fading was assumed, and no side information was available to either the transmitter or receiver. While the result gives some indication of the effect of the direct path, the optimal input distribution might not be Gaussian.

Good, easily computed bounds on the capacity of the i.i.d. Rayleigh-fading channel at low and high SNR would be useful for engineering design, as the exact capacity is tedious to compute. Some general guidelines on signal set design would also be helpful. The sensitivity of mutual information to the input distribution function is generally small, and the fading model is likely to be only marginally accurate in many scenarios. Is there a simple rule of thumb for approximately selecting the number of mass points, their locations, and their probabilities as a function of SNR?

Complementary to the case where the receiver has CSI is the case where the CSI is available at the transmitter only. We expect difficulties here, as transmission strategies rather than input distributions should be considered, as is concluded by extrapolating the results of Shannon [20] to the continuous state space.

An important and challenging extension is to generalize the study to non-i.i.d. Rayleigh-fading channels, modeled for example by a Gauss-Markov process. A classical receiver attempts to track the channel variations when the fading coefficients are correlated in time. Should an information-theoretic receiver do the same? We expect that the answer is effectively yes at high SNR with slow fading, and no when the fading is fast. The desired results will be difficult to obtain, as the optimal input process need not be i.i.d. in general. Some work has been done in this direction by Marzetta and Hochwald [21], who considered a channel with multiple transmit and multiple receive antennas whose fading is constant over a block of symbols. The capacity of this channel in some simple cases was computed numerically, and the conjecture of a discrete optimal distribution was found to be accurate.

A most challenging problem is to combine fading memory and input memory. Underwater acoustic channels, for example, combine rapid time variation with long intersymbol interference. Little is known about the capacity of such channels or how to achieve it.

APPENDIX I
THE OPTIMIZATION PROBLEM

In this appendix, we establish the existence and uniqueness of the capacity-achieving input distribution for the average-power-limited Rayleigh-fading channel. Existence is automatic for finite-alphabet channels but not for continuous-alphabet ones. Uniqueness follows from a particular parameterization of the input space that disregards phase. The structure of the existence proof below follows Smith [2], [3], but the details are different; the fading channel has both multiplicative and additive noise, and compactness is more difficult to prove for an average power constraint than for a peak power constraint.

We establish existence using topological arguments found in optimization theory and probability theory. In optimization theory, one starts by defining the real normed linear space of all bounded continuous functions on the input space. The dual of this space includes the set of all probability measures. Optimization results are then obtained using the weak* topology on the dual [22, Sec. 5.10]. In probability theory, one starts with the set of probability measures, and defines weak convergence, which is actually weak* convergence restricted to this set. Next, a metric that metrizes weak convergence is defined (e.g., the Lévy metric [23, Sec. III.7]) and optimization is done in the metric topology.
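In the notation of Section I, the problem treated in this appendix is (restated here without the paper's equation numbering)

$$ C \;=\; \sup_{F:\ \int |x|^2\, dF(x) \le P} I(F), \qquad I(F) \;=\; \int\!\!\int p(y \mid x)\,\ln\frac{p(y \mid x)}{\int p(y \mid x')\, dF(x')}\, dy\, dF(x), $$

where $p(y \mid x)$ is the circularly symmetric Gaussian density with variance $\sigma_a^2 |x|^2 + \sigma_v^2$ determined by (1), and, by the phase symmetry noted above, the supremum may be taken over distribution functions $F$ of the input amplitude.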


[8] A. Goldsmith and P. Varaiya, "Capacity of fading channels with channel side information," IEEE Trans. Inform. Theory, vol. 43, pp. 1986-1992, Nov. 1997.

[9] H. Viswanathan, "Capacity of Markov channels with receiver CSI and delayed feedback," IEEE Trans. Inform. Theory, vol. 45, pp. 761-771, Mar. 1999.

[10] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.

[11] I. Csiszár, "On an extremum problem of information theory," Studia Scient. Math. Hung., no. 9, pp. 57-71, 1974.

[12] R. M. Gray, Source Coding Theory. Boston, MA: Kluwer, 1990.

[13] C. Lamoureux, Analyse Mathématique et Numérique (Cours de l'École Centrale). Paris, France: École Centrale, 1992-1993.

[14] H. Silverman, Complex Variables. Boston, MA: Houghton Mifflin, 1975.

[15] S. L. Fix, "Rate distortion for squared error distortion measures," in Proc. 16th Annu. Allerton Conf. Communications, Control, and Computers, Oct. 1978, pp. 704-711.

[16] K. Rose, "A mapping approach to rate-distortion computation and analysis," IEEE Trans. Inform. Theory, vol. 40, pp. 1939-1952, Nov. 1994.

[17] S. Verdú, "On channel capacity per unit cost," IEEE Trans. Inform. Theory, vol. 36, pp. 1019-1030, Sept. 1990.

[18] G. Taricco and M. Elia, "Capacity of fading channel with no side information," Electron. Lett., vol. 33, no. 16, pp. 1368-1370, July 31, 1997.

[19] S. Zhou, S. Mei, X. Xu, and Y. Yao, "Channel capacity of fast fading channels," in IEEE 47th Vehicular Technology Conf. Proc., Phoenix, AZ, May 1997, pp. 421-425.

[20] C. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J., vol. XXVII, no. 3, pp. 379-423, July 1948.

[21] T. L. Marzetta and B. M. Hochwald, "Capacity of a mobile multiple-antenna communication link in Rayleigh flat fading," IEEE Trans. Inform. Theory, vol. 45, pp. 139-157, Jan. 1999.

[22] D. G. Luenberger, Optimization by Vector Space Methods. New York: Wiley, 1969.

[23] A. N. Shiryaev, Probability, 2nd ed. Berlin, Germany: Springer-Verlag, 1996.

[24] K.-L. Chung, A Course in Probability Theory, 2nd ed. New York: Academic, 1974.

[25] M. Loève, Probability Theory. Berlin, Germany: Springer-Verlag, 1977.

[26] G. Iyengar, "Voice channel," in Proc. IEEE Int. Symp. Information Theory (ISIT '97), Ulm, Germany, June-July 1997, p. 332.

[27] R. S. Kennedy, Fading Dispersive Communication Channels. New York: Wiley, 1969.