

1624 JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 13, NO. 8, AUGUST 1995

Adaptive Crosstalk Cancellation and Laser Frequency Drift Compensation in Dense WDM Networks

M. J. Minardi, Member, IEEE, and M. A. Ingram, Member, IEEE

Abstract: Two variations of the LMS algorithm are proposed to cancel received linear crosstalk in dense WDM networks. Analysis and simulations show that with the addition of a few photodetectors, channel spacing requirements can be reduced by over 50 percent. Simulations using a demultiplexer with Gaussian bandpass characteristics show that a 2.5 dB signal-to-crosstalk-plus-noise ratio can be increased to over 35 dB. The decision-directed algorithm will work with OOK data or any intensity modulation scheme which uses the absence of light as one symbol. The decision-directed algorithm makes assumptions on the desired laser frequency so it can accommodate only limited laser drift. A second cancellation algorithm uses pilot tones added to each laser signal in order to cancel the crosstalk. It will work with analog or digital intensity-modulated data and will automatically configure itself to account for laser drift. Both algorithms are blind in that they do not require training sequences for initialization. Analysis shows that weights derived from pilot tones are nearly optimum for canceling crosstalk for the data. Simulations of both algorithms are presented. Finally, both algorithms are shown to be capable of canceling nonlinear beating terms.

I. INTRODUCTION

Close channel spacing in dense WDM networks can cause linear crosstalk [1] in the receivers due to imperfect performance of the demultiplexing elements. Most of the research on crosstalk in WDM systems has focused on its characterization, for example in [2]. In some common WDM network topologies, such as the star network, all of the channels are available to the end user. Grating based demultiplexers [3], [4] contain an array of detectors so several channels can be simultaneously received. Other types of demultiplexers share this property [5]. We propose two adaptive algorithms which use the detector currents and versions of the Least Mean Squares (LMS) algorithm [6] to cancel the linear crosstalk. Post-detection cancellation may offer cost-effective alternatives to stringent demultiplexer and laser frequency specifications, particularly in receivers that already perform some signal processing. Our analysis and simulations show that the channel density can be more than doubled with the addition of only a small number of photodetectors. The algorithms also compensate for laser frequency drift.

Other authors have treated digital receivers with multichannel linear crosstalk. Salz [7] derived an optimal (in the mean squared error sense) receiver filter for a multichannel digital receiver. The signals were assumed to be bipolar amplitude modulated. No adaptive algorithms were proposed. Hoenig et al. applied the LMS algorithm to bundles of twisted wire pairs [8]. Aisawa and Hargreaves [5], [9] demonstrated WDM crosstalk cancelers using neural networks that require training sequences.

In this paper we offer two adaptive LMS-based algorithms which do not require training sequences for initialization; that is, they can initialize themselves blindly. The first algorithm, previously reported in [10], is referred to as decision-directed LMS. It operates on any digital intensity-modulated signal which includes the absence of light (a "zero") as one of the symbols. It uses a linear constraint and is decision-directed, which means that the flow of the algorithm depends on the symbol decisions of the receiver. The algorithm updates the filter weights only when a zero is detected. The constraint has two functions: it keeps the weights from converging to zero and it ensures that the detector array has a high gain for the laser of the desired channel. The consequence of the constraint and of decision direction is that the algorithm works only with digital data and cannot tolerate significant frequency drift of the desired laser.

The second algorithm, the pilot tone LMS algorithm, uses pure sine waves added to the laser drive current, such that each channel has a sine wave (pilot tone) of unique frequency. The receiver generates a sine wave with the proper frequency and phase for use as a desired signal with an unconstrained LMS algorithm. This eliminates the need for the linear constraint and decision-directed feedback. The pilot tone LMS algorithm works with any mixture of analog or digital intensity-modulated data, and the receiver automatically configures itself to account for laser frequency drift.

II. THEORY

Fig. 1 shows the proposed WDM receiver. Assume that $n_c$ WDM channels are demultiplexed by a grating and then photodetected and amplified. The demultiplexer is imperfect, so each direct detection (DD) receiver output is a linear combination of all of the channels plus thermal noise. The $n_d$ receiver outputs can be represented by an $n_d$-length vector $Z$, such that $Z = GS + v$, where $S$ is an $n_c$-length vector of optical signal intensities for each channel without crosstalk, $v$ is an $n_d$-length vector of receiver thermal noise, and $G$ is an $n_d \times n_c$ matrix of crosstalk gains, where $g_{ij}$ is the gain of the $j$th wavelength by the $i$th DD receiver. The $i$th row of $G$ describes how each laser couples into detector $i$, and the $j$th column of $G$ describes how the light from laser $j$ is distributed across the detector array. $G_d$, the $d$th column of $G$, is especially important because it represents the light distribution of the desired signal. $G_c$ is an $n_d \times (n_c - 1)$ matrix consisting of the other columns, which correspond to the undesired channels. Similarly, $S$ can be split into the desired signal $S_d$ and the vector of the undesired signals $S_c$. Using this notation, $Z = G_d S_d + G_c S_c + v$, which clearly shows the three components of the detector currents: desired signal, undesired signals, and detector noise. The $n_d$ receiver outputs are used in the LMS algorithm as seen below.

Manuscript received April 11, 1994; revised March 7, 1995.

M. J. Minardi is with the USAF Wright Laboratory, Radar Branch, WL/AARM, Wright-Patterson AFB, OH 45433-7408 USA, and the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0250 USA.

M. A. Ingram is with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0250 USA.

IEEE Log Number 9413050.

0733-8724/95$04.00 © 1995 IEEE

MINARDI AND INGRAM: ADAPTIVE CROSSTALK CANCELLATION AND LASER FREQUENCY DRIFT COMPENSATION 1625

Fig. 1. Block diagram of LMS based WDM receiver. (Blocks shown: fiber; grating or other demultiplexing device; detector array; $n_d$-wide output $Z$; threshold device.)

The receiver outputs are weighted and summed to produce the overall system output $S_{out} = W^T Z$. The goal is to find a $W$ which will minimize the bit error rate, which is usually assumed to be the same as maximizing the signal-to-crosstalk-plus-noise ratio (SCNR), the ratio of desired signal energy to crosstalk and noise. Before proceeding, this performance criterion will be defined. Given $W$, three important quantities can be calculated. These are the expected value of the output noise power

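$$P_n = \langle (W^T v)^2 \rangle = W^T \langle v v^T \rangle W = W^T \sigma^2 I W = W^T W \sigma^2 \qquad (1)$$

the expected value of the output crosstalk power

$$P_c = \langle (W^T G_c S_c)^2 \rangle = W^T G_c \langle S_c S_c^T \rangle G_c^T W = W^T G_c R_{cc} G_c^T W \qquad (2)$$

and the expected value of the desired channel signal power

$$P_d = \langle S_d^2 \rangle W^T G_d G_d^T W \qquad (3)$$

where $\langle \cdot \rangle$ denotes expectation, $R_{cc}$ is $\langle S_c S_c^T \rangle$ and $\sigma^2$ is the noise power. SCNR is created by taking ratios of the three quantities defined above:

$$\mathrm{SCNR} = \frac{P_d}{P_c + P_n} = \frac{\langle S_d^2 \rangle W^T G_d G_d^T W}{W^T (G_c R_{cc} G_c^T + \sigma^2 I) W}. \qquad (4)$$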
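Quantities (1) through (4) can be evaluated numerically for any candidate weight vector. The sketch below is illustrative only: the Gaussian passband model follows the abstract's mention of a Gaussian demultiplexer, but the function names `gaussian_demux` and `scnr`, the passband width, the noise power, and the signal statistics are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_demux(n_det, n_ch, width):
    """Hypothetical crosstalk matrix G: rows are detectors, columns are channels.
    Detector i sees channel j through a Gaussian passband (an assumed model)."""
    det = np.linspace(0.0, n_ch - 1.0, n_det)   # detector centres, in channel spacings
    ch = np.arange(n_ch, dtype=float)           # channel centres
    return np.exp(-((det[:, None] - ch[None, :]) ** 2) / (2.0 * width ** 2))

def scnr(W, G, d, R_cc, sigma2, sd_power):
    """SCNR of weight vector W for desired channel d, following (1)-(4)."""
    G_d = G[:, d]                        # desired-channel column
    G_c = np.delete(G, d, axis=1)        # undesired-channel columns
    P_n = sigma2 * (W @ W)               # (1) output noise power
    P_c = W @ G_c @ R_cc @ G_c.T @ W     # (2) output crosstalk power
    P_d = sd_power * (W @ G_d) ** 2      # (3) desired signal power
    return P_d / (P_c + P_n)             # (4)

n, d = 5, 2
G = gaussian_demux(n, n, width=0.6)
R_cc = 0.25 * np.eye(n - 1)              # assumed zero-mean, uncorrelated crosstalk
W_single = np.zeros(n)
W_single[d] = 1.0                        # use only the detector aligned with channel d
scnr_db = 10 * np.log10(scnr(W_single, G, d, R_cc, sigma2=1e-4, sd_power=0.25))
print(f"single-detector SCNR: {scnr_db:.1f} dB")
```

Using a single detector corresponds to a conventional receiver without cancellation, which gives a baseline against which combined weights can be compared.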

An optimum weight is desired which will maximize (4). The numerator of (4) can be forced to be a constant without loss of generality. Then the question can be restated as: minimize the denominator of (4) subject to a linear constraint $G_d^T W = 1$. Geometrically the constraint means that $W$ is forced to lie in an $(n_d - 1)$-dimensional hyperplane perpendicular to the constraint vector $G_d$ and located a distance $1/\|G_d\|$ from the origin.

Using Lagrange multipliers [11] the solution is

$$W_{SCNR} = [G_d^T (G_c R_{cc} G_c^T + \sigma^2 I)^{-1} G_d]^{-1} (G_c R_{cc} G_c^T + \sigma^2 I)^{-1} G_d = k (G_c R_{cc} G_c^T + \sigma^2 I)^{-1} G_d, \qquad (5)$$

where $k$ is a scalar constant. The exact value of $k$ is unimportant because SCNR is not changed if $W$ varies by a scale factor. SCNR is difficult to work with because it is a ratio. An alternate cost function that is commonly used is the mean square error (MSE), which is defined as the variance of the difference between the filter output and the desired crosstalk- and noise-free output signal, given by

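$$\begin{aligned}
J &= \langle e^2 \rangle = \langle (S_{out} - S_d)^2 \rangle = \langle (W^T Z - S_d)^2 \rangle \\
  &= \langle (W^T (GS + v) - S_d)^2 \rangle \\
  &= W^T (G \langle S S^T \rangle G^T + \langle v v^T \rangle) W - 2 W^T (G \langle S S_d \rangle + \langle v S_d \rangle) + \langle S_d^2 \rangle \\
  &= W^T (G R_{ss} G^T + \sigma^2 I) W - 2 W^T G R_{sd} + \langle S_d^2 \rangle
\end{aligned} \qquad (6)$$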

where $R_{ss}$ is $\langle S S^T \rangle$, and $R_{sd}$ is $\langle S S_d \rangle$, the expected value of the desired signal multiplied with the signals from each channel. It is important to note that $R_{sd}$ is equal to a scalar multiple of the $d$th column of $R_{ss}$. The weights which minimize $J$ are found by solving the Wiener-Hopf equation

$$W_{MSE} = (G R_{ss} G^T + \sigma^2 I)^{-1} G R_{sd} = R_{zz}^{-1} R_{zd} \qquad (7)$$

where $R_{zz} = (G R_{ss} G^T + \sigma^2 I)$ and $R_{zd} = G R_{sd}$. If $R_{ss}$ is diagonal (which is equivalent to saying that the signals are zero mean and uncorrelated with each other) then it has been shown [12] that $W_{MSE}$ and $W_{SCNR}$ differ by only a constant and hence produce the same SCNR. This condition is true for direct detection optical receivers with a.c. coupling (this would have the effect of turning a sequence of zeroes and ones into a sequence of $\pm 1$'s).
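The claim that $W_{MSE}$ and $W_{SCNR}$ differ only by a constant when $R_{ss}$ is diagonal can be checked numerically. The following is a minimal sketch under assumed values: the crosstalk matrix is random rather than taken from any real demultiplexer, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma2 = 4, 1, 0.01
G = np.eye(n) + 0.3 * rng.random((n, n))     # hypothetical crosstalk matrix
R_ss = 0.25 * np.eye(n)                      # diagonal: zero-mean, uncorrelated signals

G_d = G[:, d]
G_c = np.delete(G, d, axis=1)
R_cc = np.delete(np.delete(R_ss, d, axis=0), d, axis=1)

# (5): W_SCNR is proportional to (G_c R_cc G_c^T + sigma^2 I)^-1 G_d
A = G_c @ R_cc @ G_c.T + sigma2 * np.eye(n)
W_scnr = np.linalg.solve(A, G_d)
W_scnr /= G_d @ W_scnr                       # scale so that G_d^T W = 1

# (7): W_MSE = (G R_ss G^T + sigma^2 I)^-1 G R_sd, with R_sd the d-th column of R_ss
R_zz = G @ R_ss @ G.T + sigma2 * np.eye(n)
W_mse = np.linalg.solve(R_zz, G @ R_ss[:, d])

# Parallel vectors have cosine similarity 1.
cosine = W_mse @ W_scnr / (np.linalg.norm(W_mse) * np.linalg.norm(W_scnr))
print(f"cosine between W_MSE and W_SCNR: {cosine:.12f}")
```

The two vectors are parallel because $R_{zz}$ differs from $G_c R_{cc} G_c^T + \sigma^2 I$ only by a rank-one term in $G_d$, so inverting it rescales the solution without changing its direction.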

The standard LMS algorithm can be used to adaptively converge to $W_{MSE}$. LMS is based on the gradient search algorithm, sometimes known as the method of steepest descent. The negative gradient of the mean squared error, $-\nabla_W \langle e^2 \rangle$, is the direction in $W$-space which gives the greatest improvement in $\langle e^2 \rangle$ (it points along a path toward $W_{MSE}$). By taking a small step in this direction at each iteration the weights can converge to $W_{MSE}$. In general, $\nabla_W \langle e^2 \rangle$ is not known; however, if $\langle e^2 \rangle$ is replaced by the instantaneous error energy $e^2 = (S_{out} - S_d)^2$ then

$$\nabla_W (e^2) = 2e \nabla_W (e) = 2e \nabla_W (W^T Z - S_d) = 2eZ. \qquad (8)$$

LMS uses (8) as a noisy estimate of the gradient. This algorithm is simply

    Form error output: $e = W_{old}^T Z - S_d$
    Update the weights: $W_{new} = W_{old} - \mu e Z$
    Repeat

where $\mu$ is the step size, a small positive real number.

The LMS algorithm uses $S_{out}$ and $Z$ to adaptively drive $W$ toward the optimum value. $S_d$ is not generally known, and producing the desired signal is typically the central problem in designing a practical adaptive algorithm. An obvious solution is a training sequence, where the transmitters periodically send known signals which are prestored in the receiver. When the training sequence is received, the signal is compared to the stored data and the difference is used to drive the LMS algorithm. This method will converge to $W_{MSE}$; however, it requires cooperation and synchronization between the transmitter and receiver, and at the very least data transmission must be interrupted to train the receiver. For this reason, algorithms are desired which minimize or eliminate the cooperation needed for the adaptive processing. In the sequel, we will refer to the standard LMS algorithm as "training-sequence LMS."

Both algorithms presented in this paper are derived from training-sequence LMS. We will use $W_{MSE}$ and the SCNR produced by the training-sequence algorithm as benchmarks for measuring the performance of the new algorithms.

III. DECISION-DIRECTED LMS ALGORITHM

The first algorithm is a decision-directed LMS algorithm designed to work with OOK data (or any modulation scheme where the absence of light is one of the symbols) in a dc coupled receiver. It is derived from training-sequence LMS by making two modifications. The first is to update the weights only when a zero is received. If the decision circuitry decides that a zero was received then it can be assumed that $S_d$ is zero and that $\langle e^2 \rangle = \langle S_{out}^2 \rangle$. If $S_d$ is zero, then $G$ becomes $G_c$ and $S$ becomes $S_c$ and

$$\begin{aligned}
\langle e^2 \rangle &= \langle S_{out}^2 \rangle = \langle (W^T Z)^2 \rangle = \langle [W^T (G_c S_c + v)]^2 \rangle \\
  &= W^T (G_c \langle S_c S_c^T \rangle G_c^T + \langle v v^T \rangle) W \\
  &= W^T (G_c R_{cc} G_c^T + \sigma^2 I) W
\end{aligned} \qquad (9)$$

where, again, $R_{cc} = \langle S_c S_c^T \rangle$ is the correlation matrix of the crosstalk vector $S_c$ and $\sigma^2$ is the thermal noise variance. We assume that each element of $S$ is an OOK signal with equal probability of zeroes and ones, that all channels have equal power, that the bits of each channel are synchronized (worst case) and that all of the signals are independent of each other. The equal power assumption is valid since the factors which account for unequal power can be absorbed into the gain matrix $G$. Since the algorithm must operate on zeroes and ones, the dc component must remain, so $R_{cc}$ is nondiagonal. $R_{cc}$ has 1/2 on the diagonal and 1/4 elsewhere. An example of a $4 \times 4$ $R_{cc}$ is

$$R_{cc} = \begin{bmatrix} 1/2 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/2 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/2 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/2 \end{bmatrix}.$$

The second modification is the addition of a linear constraint $L^T W = f$ where $L$ is the constraint vector and $f$ is a scalar constant. A constraint is necessary for two reasons. First, the algorithm only updates when $S_d = 0$, so $R_{sd} = 0$. This implies that without a constraint the optimum weights are 0 [see (7) with $R_{sd} = 0$]. Second, the constraint will guarantee that $S_{out}$ contains the desired signal $S_d$.

Again using Lagrange multipliers [11] to minimize (9) subject to this constraint, the solution is

$$W_{dd} = [L^T (G_c R_{cc} G_c^T + \sigma^2 I)^{-1} L]^{-1} (G_c R_{cc} G_c^T + \sigma^2 I)^{-1} L = k_{dd} (G_c R_{cc} G_c^T + \sigma^2 I)^{-1} L. \qquad (11)$$

Comparing (11) to (5) suggests that the best constraint would be $L = G_d$, which would make $W_{dd}$ equal to $W_{SCNR}$ except for a scale factor, and the SCNR would be optimum. We have assumed that such precise knowledge of the light distribution across the detector array is not available because if it were, the optimum weights could be precomputed and the adaptive algorithm would not be needed. For this reason, we use the simple linear constraint $e_d^T W = 1$, where $e_d$ is a vector of all zeroes with a one in the $d$th position. The constraint assumes little about the desired channel light distribution except which detector receives the peak of the desired laser signal.

The SCNR penalty for using $W_{dd}$ defined in (11) instead of $W_{MSE}$ is less than one dB for many relevant scenarios, as long as the laser corresponding to the desired channel does not drift far from its assumed location. Our simulations will show that in spite of the SCNR penalty, the SCNR improvements achieved by the decision-directed algorithm are quite significant.

The decision-directed LMS algorithm will adaptively converge to the weights in (11). The algorithm is simply

    Form new output: $S_{out} = W_{old}^T Z$
    Decide if $S_{out}$ is a one or a zero
    If a one then coast: $W_{new} = W_{old}$
    If a zero then update: $W_{new} = W_{old} - \mu S_{out} Z$
        and then set $W_{new} = W_{new} / W_{new}(d)$
    Repeat

$W(d)$ is the weight of the $d$th detector. The last step implements the linear constraint $e_d^T W = 1$.
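The two update rules described above, training-sequence LMS and decision-directed LMS, can be sketched side by side. This is a minimal simulation under stated assumptions, not the authors' simulation code: the Gaussian demultiplexer model, the step size, the noise level, the 0.5 decision threshold, and all names are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, mu, sigma = 5, 2, 0.02, 0.01           # illustrative values only
idx = np.arange(n, dtype=float)
G = np.exp(-((idx[:, None] - idx[None, :]) ** 2) / (2 * 0.6 ** 2))  # assumed Gaussian demux

def scnr_db(W):
    """Analytic SCNR from (1)-(4) for synchronized, equal-power OOK channels."""
    G_d, G_c = G[:, d], np.delete(G, d, axis=1)
    R_cc = 0.25 * np.eye(n - 1) + 0.25       # 1/2 on the diagonal, 1/4 elsewhere
    P_d = 0.5 * (W @ G_d) ** 2
    P_c = W @ G_c @ R_cc @ G_c.T @ W
    P_n = sigma ** 2 * (W @ W)
    return 10 * np.log10(P_d / (P_c + P_n))

def detector_sample():
    """One bit interval: OOK bits on every channel, Z = G S + v."""
    S = rng.integers(0, 2, size=n).astype(float)
    return S, G @ S + sigma * rng.standard_normal(n)

e_d = np.zeros(n)
e_d[d] = 1.0
W_train, W_dd = e_d.copy(), e_d.copy()       # both start from "use detector d only"

for _ in range(20000):
    S, Z = detector_sample()

    # Training-sequence LMS: the true S_d is assumed known at the receiver.
    e = W_train @ Z - S[d]
    W_train -= mu * e * Z

    # Decision-directed LMS: update only when the threshold device says "zero".
    S_out = W_dd @ Z
    if S_out < 0.5:                          # the 0.5 threshold is an assumption
        W_dd -= mu * S_out * Z
        W_dd /= W_dd[d]                      # re-impose the constraint e_d^T W = 1

print(f"initial {scnr_db(e_d):.1f} dB, training {scnr_db(W_train):.1f} dB, "
      f"decision-directed {scnr_db(W_dd):.1f} dB")
```

Starting both filters at $e_d$ mirrors the constraint's assumption that detector $d$ receives the peak of the desired laser; the decision-directed branch never uses the transmitted bits, only its own threshold decisions.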


Fig. 2. Spectrum of detector current with added pilot tone. (Horizontal axis: frequency. Labeled features: dc bias current, data spectrum, pilot tone.)

The decision-directed LMS algorithm has some drawbacks. It requires that the output of each detector be sampled within a single bit interval because the algorithm only runs when a zero is received. Since the bit rate may be in the GHz range, the sampling electronics could be expensive to implement. The nature of the constrained LMS algorithm will not permit the algorithm to find the desired channel if the laser drifts too far from the assumptions built into the linear constraint. Finally, the algorithm will work only with a desired channel transmitting an OOK signal (or at least an alphabet with one symbol represented by the absence of light). Its main advantage is that it is one of the so-called "blind" adaptive algorithms, which determine the proper weights solely by examining the output of the channel. This relieves the network of the overhead due to training sequences.

IV. PILOT TONE LMS ALGORITHM

Several authors, for example [13], have proposed that each transmitter in a WDM network add a sine wave of unique frequency to the laser drive current for each channel. The sine wave frequencies are higher than the bandwidth of the information so they can be isolated and used for channel identification, signal routing and other uses. We propose to use these pilot tones to adaptively cancel the linear crosstalk and compensate for laser frequency drift. Two methods are proposed: an adaptive unconstrained LMS algorithm and a method that uses simple filters to estimate the crosstalk gain matrix $G$.

The pilot tone algorithm works as follows. Each transmitting laser in the network adds a small sinusoidal oscillation to the laser drive current. The drive current is usually the sum of a bias current and a current containing the information; now a third current term is added which is a sine wave of constant frequency. The frequency must be higher than any spectral components of the information current. The resulting spectrum would look like Fig. 2. Note that the pilot tone is separated from the data spectrum and can be isolated with a bandpass filter.

Fig. 3 shows the spectrum when data is contaminated with crosstalk from other channels, each with their own pilot tones of unique frequency. The pilot tones will be subject to essentially the same crosstalk gains (G) as the information. Therefore if weights are found to cancel the crosstalk among the pilot tones the same weights will cancel the crosstalk in the data. For these signals the vector of detector currents is Z = G(S + P) + v where P is a vector of pilot tones.

Fig. 3. Spectrum of detector current of desired laser signal with crosstalk from other channels. All lasers have added pilot tones with unique frequencies. (Horizontal axis: frequency. Labeled features: dc bias current, crosstalk-contaminated data spectrum, desired pilot tone, pilot tones from interfering channels.)

Fig. 4 shows a schematic diagram of the pilot tone LMS receiver. A bandpass filter isolates the entire field of pilot tones and a lowpass filter isolates the data signals. The pilot tones can then be beat down with an optional local oscillator. An unconstrained LMS algorithm is then run on the field of pilot tones. Unconstrained LMS needs a desired signal, which in this case is a sine wave at the frequency of the pilot tone of the desired channel. The resulting weights are then applied to the data signals to eliminate the crosstalk.

Fig. 4. Block diagram of the pilot tone LMS receiver. (Blocks shown: detector array, bandpass filter, lowpass filter, local oscillator, pilot reference, LMS.)

Using pilot tones to cancel crosstalk offers several advantages.

1) No need for high speed electronics to perform the LMS algorithm. Theoretically the sine waves may be arbitrarily close in frequency so the adaptive filter, whether implemented digitally or in analog form, can run as slowly as desired. In other words the network designer is free to trade off convergence speed and cost.

2) Unconstrained LMS algorithm. Because the signals are now simple sine waves it is easy to synthesize a desired signal; it need only be a sinusoid of fixed amplitude at the proper frequency and close to the correct phase. This allows the simpler unconstrained LMS algorithm to be used. The desired signal can be generated using a variety of methods, most of them using the input signal. Only rough knowledge of the desired pilot tone frequency is needed; the pilot tone can then be electronically filtered out and used to lock the desired signal to the proper phase and frequency.

3) Tracking of laser drift and/or changes in the demultiplexer. An added benefit of the unconstrained LMS is that the weights are automatically configured to follow a drifting laser or to account for changes in the demultiplexer performance (e.g., due to temperature effects). This is because the pilot tones are part of the intensity modulation and therefore are independent of the laser center frequency. In fact the receiver needs absolutely no knowledge of the actual laser frequencies; all it needs to know are the pilot tone frequencies assigned to each channel. The constrained LMS does not have this property because some assumptions on the location of the desired-laser energy on the detector array are built into the constraint. Note that if a drifting laser coincides with another laser center frequency, no linear combining technique can separate the channels.

4) Pilot tone cancellation will improve the signal to crosstalk ratio no matter what types of data are being transmitted on the lasers. It will work with any mixture of analog or digital intensity-modulated data.
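The pilot tone receiver described above can be sketched as an unconstrained LMS loop running on the bandpass-filtered pilot field. The example below is illustrative only: it assumes ideal filtering (only pilot tones and noise reach the adaptive filter), a locally generated reference that is already locked to the desired pilot's phase, and hypothetical pilot frequencies, step size, and Gaussian demultiplexer model.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, mu, sigma = 5, 2, 0.02, 0.01                 # illustrative values only
idx = np.arange(n, dtype=float)
G = np.exp(-((idx[:, None] - idx[None, :]) ** 2) / (2 * 0.6 ** 2))  # assumed Gaussian demux
f = 0.05 + 0.03 * idx                               # hypothetical pilot frequencies, cycles/sample
phase = rng.uniform(0, 2 * np.pi, size=n)           # unknown but fixed pilot phases

W = np.zeros(n)                                     # unconstrained LMS may start at zero
for k in range(20000):
    P = np.sin(2 * np.pi * f * k + phase)           # pilot tones after the bandpass filter
    Z_p = G @ P + sigma * rng.standard_normal(n)    # pilot part of Z = G(S + P) + v
    desired = P[d]                                  # local reference, assumed phase-locked
    e = W @ Z_p - desired
    W -= mu * e * Z_p

# Gain that each channel's data would see when the same weights are applied to it.
gains = W @ G
print(np.round(gains, 3))
```

Because the loop only needs to resolve the pilot tones from one another, the sample rate in this sketch is set by the pilot frequency spacing rather than by the data rate, which is the trade-off noted in the first advantage.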

A. Optimum Weights Using Pilot Tone LMS

We will begin by comparing the optimum weights for both algorithms to see how well the optimum pilot tone weights cancel crosstalk when they are applied to the actual data. The equation for the optimal pilot tone weights is the same as the optimal MSE weights in (7) with the signal vector S replaced by the pilot tone vector P

$$W_p = (G \langle P P^T \rangle G^T + \sigma^2 I)^{-1} G \langle P P_d \rangle = (G R_{pp} G^T + \sigma^2 I)^{-1} G R_{pd}. \qquad (12)$$

$R_{pp}$ is a diagonal matrix and $R_{pd}$ is a multiple of the $d$th column of $R_{pp}$, a vector with one nonzero entry in the $d$th position. $R_{ss}$ is also diagonal (assuming the dc bias current has been filtered out), so if we assume that the ratio between the power of the pilot tone and the data signal is the same in each channel then $R_{pp}$ and $R_{pd}$ differ from $R_{ss}$ and $R_{sd}$ by only a scale factor. Therefore the only difference between the weights $W_p$ and $W_{MSE}$ would be due to differences in the SNR between the two cases. As long as the SNR is high the differences should be negligible.

$R_{pp}$ and $R_{ss}$ may differ by more than a scale factor. The pilot tone to data power ratios may not be constant for all channels, or the filters which separate the data and pilot tones may not be matched and could alter the power ratios. Since we are forming the weights using $R_{pp}$ and applying them to the data, we are interested in the question of how the weights vary as a function of $R_{ss}$. For this analysis we consider three types of receivers: 1) the same number of detectors as channels (uniquely determined), 2) more detectors than channels (overdetermined), and 3) more channels than detectors (underdetermined). For the first two cases we show that when receiver noise is neglected the pilot tone and training sequence weights for the $d$th channel are equal to the $d$th column of the matrix $(G^T)^{\#}$, where $\#$ denotes the matrix pseudo-inverse (if the matrix is invertible the pseudo-inverse is equal to the inverse). When receiver noise cannot be neglected, the excess MSE produced by using pilot tone weights instead of training sequence weights is proportional to the square of the noise power (the minimum MSE produced by using training sequence weights is proportional to the noise power).

Uniquely Determined: The simplest case is a receiver with the same number of channels and detectors, for example seven channels and seven detectors. This means that $G$ is square. If we neglect thermal noise, then $R_{zz} = G R_{ss} G^T$ and $R_{zz}$ has full rank and is invertible. The solution to the Wiener-Hopf equation is then

$$W_{MSE} = (G R_{ss} G^T)^{-1} G R_{sd}. \qquad (13)$$

Recall that $R_{sd}$ is simply a multiple of the $d$th column of $R_{ss}$. We may solve for the optimum weights for all channels simultaneously; then $R_{sd}$ becomes a square matrix equal to $R_{ss} A$, where $A$ is a diagonal matrix representing the fact that each column of $R_{sd}$ is a scalar multiple of the corresponding column of $R_{ss}$. The solution to the Wiener-Hopf equation can now be expressed as

$$W_{MSE} = (G R_{ss} G^T)^{-1} G R_{ss} G^T (G^T)^{-1} A = (G^T)^{-1} A \qquad (14)$$

where the term $G^T (G^T)^{-1}$ can be inserted because it is the identity matrix. Therefore each column of $(G^T)^{-1}$ is the optimal weight vector for selecting the corresponding channel. The important thing to observe is that the optimal weights are independent of $R_{ss}$ as long as it is nonsingular, which is the case for practical situations where each data channel is uncorrelated. Therefore weights derived from pilot tones will be optimal for all types of WDM networks, whether analog, digital or hybrid networks containing any combination of data types. In the rest of the paper the diagonal matrix $A$ will be dropped because it only changes $W_{MSE}$ by a scale factor.
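The independence of the noiseless weights from $R_{ss}$ in (14) can be confirmed numerically. This is a small sketch with assumed inputs: the crosstalk matrix and both correlation matrices are arbitrary test values, and `noiseless_wiener` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
G = np.eye(n) + 0.3 * rng.random((n, n))          # hypothetical square crosstalk matrix

def noiseless_wiener(G, R_ss):
    """All-channel solution of (13) with R_sd taken as R_ss (scale matrix A dropped)."""
    return np.linalg.solve(G @ R_ss @ G.T, G @ R_ss)

R_diag = np.diag([0.25, 0.5, 1.0, 2.0])           # unequal-power, uncorrelated channels
B = rng.standard_normal((n, n))
R_full = B @ B.T + 0.1 * np.eye(n)                # arbitrary nonsingular correlation matrix

W_ref = np.linalg.inv(G.T)                        # (14) without A
same_for_diag = np.allclose(noiseless_wiener(G, R_diag), W_ref)
same_for_full = np.allclose(noiseless_wiener(G, R_full), W_ref)
print(same_for_diag, same_for_full)
```

Both comparisons agree with $(G^T)^{-1}$ to numerical precision, which is why weights trained on the pilot statistics can be reused on data with different statistics in this case.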

The case just described is the simplest because all matrices are square and invertible. Two other cases are also considered: the overdetermined case, where the number of detectors exceeds the number of WDM channels, and the underdetermined case, where the number of detectors is less than the number of channels. The overdetermined case may occur if laser drift is perceived to be a problem. The detectors should be placed more closely in frequency than the WDM channels to make sure that the laser light is received even if the channels drift from their assigned frequencies. For example, the detectors may be spaced at one nm intervals while the channels have a nominal two nm spacing. For this case, the crosstalk matrix $G$ would be "tall," having more rows than columns. The underdetermined case might occur if the WDM network has many channels. It is assumed that the crosstalk problem is caused by the channels nearest in frequency and that the crosstalk from channels far away in frequency can be neglected. In this case, the weights are applied to only the several adjacent channels and the matrix $G$ will be "flat," having more columns than rows. The derivations for the following results are given in the appendix.

Overdetermined: For this case, we get

$$W_{MSE} = (G^T)^{\#}. \qquad (15)$$

This is the same result as in the invertible case except that the pseudo-inverse replaces the actual inverse. Again note that the optimum weights do not depend on $R_{ss}$, so the pilot tone weights would be optimum for the data also.
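The overdetermined result can be illustrated with a tall $G$. This sketch assumes four channels spaced 2 HWHM apart observed by nine detectors spaced 1 HWHM apart (values chosen only for illustration), and it compares the pseudo-inverse Wiener-Hopf solution with $(G^T)^{\#}$ computed directly. The explicit `rcond` cutoff is an assumption made so that the numerically zero singular values of $G R_{ss} G^T$ are discarded reliably.

```python
import numpy as np

def crosstalk_matrix(det_pos, ch_pos):
    """Gain matrix G (rows: detectors, columns: channels) from the Gaussian profile."""
    offsets = np.subtract.outer(np.asarray(det_pos, float), np.asarray(ch_pos, float))
    return np.exp(-(0.5887 * offsets) ** 2) + 1e-5

ch_pos = 2.0 * np.arange(4)              # 4 channels at 0, 2, 4, 6 HWHM (assumed)
det_pos = 1.0 * np.arange(-1, 8)         # 9 detectors at -1 ... 7 HWHM (assumed)
G = crosstalk_matrix(det_pos, ch_pos)    # 9 x 4, "tall"

rng = np.random.default_rng(1)
B = rng.normal(size=(4, 4))
R_ss = B @ B.T + np.eye(4)               # arbitrary nonsingular signal covariance

R = G @ R_ss @ G.T                       # 9 x 9 but only rank 4, so singular
W_wh = np.linalg.pinv(R, rcond=1e-10) @ G @ R_ss   # pseudo-inverse Wiener-Hopf solution
W_pinv = np.linalg.pinv(G.T)             # (G^T)^#, independent of R_ss

recovered = W_pinv.T @ G                 # response of each weight vector to each channel
print(np.abs(W_wh - W_pinv).max(), np.abs(recovered - np.eye(4)).max())
```

Each column of `W_pinv` passes its own channel with unit gain and nulls the other three, even though nine detector currents are being combined.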

Underdetermined: For this case, $W_{MSE}$ cannot be made independent of $R_{ss}$. All that can be said is

$$W_{MSE} P_G = (G^T)^{\#} \qquad (16)$$

where $P_G$ is the projection operator onto the space spanned by the rows of $G$. So the optimum weights do vary with $R_{ss}$; however, the projection of the weights onto the space spanned by the rows of $G$ is independent of $R_{ss}$ and is equal to $(G^T)^{\#}$. Therefore, if $R_{ssp} \neq R_{sst}$ (the subscripts $p$ and $t$ denoting pilot tones and training sequence, respectively), $W_p$ for the underdetermined case may be suboptimum for the data. However, if the extra channels are so far away in frequency that their crosstalk can be considered negligible, then their effect on the optimum weight vector should be very small. Calculation of optimum weights for simulated networks using randomly generated $R_{ss}$ supports this idea.
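A small numerical experiment in the spirit of the randomly generated $R_{ss}$ mentioned above is sketched here. It assumes seven channels at 2 HWHM spacing and only three detectors, centered on channels 3 to 5 (an illustrative configuration, not one taken from the paper), and diagonal $R_{ss}$ with random channel powers.

```python
import numpy as np

def crosstalk_matrix(det_pos, ch_pos):
    """Gain matrix G (rows: detectors, columns: channels) from the Gaussian profile."""
    offsets = np.subtract.outer(np.asarray(det_pos, float), np.asarray(ch_pos, float))
    return np.exp(-(0.5887 * offsets) ** 2) + 1e-5

ch_pos = 2.0 * np.arange(7)              # 7 channels, 2 HWHM apart (assumed)
det_pos = ch_pos[2:5]                    # only 3 detectors, so G is "flat"
G = crosstalk_matrix(det_pos, ch_pos)    # 3 x 7
GT_pinv = np.linalg.pinv(G.T)            # (G^T)^#, 3 x 7
P_G = G.T @ GT_pinv                      # projector onto the row space of G, 7 x 7

def wiener_weights(R_ss):
    """Noise-free Wiener-Hopf solution (G R_ss G^T)^{-1} G R_ss."""
    return np.linalg.solve(G @ R_ss @ G.T, G @ R_ss)

rng = np.random.default_rng(4)
R1 = np.diag(rng.uniform(0.1, 1.0, size=7))   # two unrelated sets of channel powers
R2 = np.diag(rng.uniform(0.1, 1.0, size=7))
W1, W2 = wiener_weights(R1), wiener_weights(R2)

weights_differ = np.abs(W1 - W2).max()
projection_gap = max(np.abs(W1 @ P_G - GT_pinv).max(),
                     np.abs(W2 @ P_G - GT_pinv).max())
print(f"weights differ by {weights_differ:.3e}, projections differ by {projection_gap:.3e}")
```

The weights themselves change with the channel powers, while their projections agree with $(G^T)^{\#}$ to rounding error, as (16) states.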

In summary, for the uniquely determined and overdetermined cases, $(G^T)^{\#}$ yields the lowest MSE regardless of the data statistics $R_{ss}$. For the underdetermined receiver, the pilot tones generate weights that are nearly optimal.

When noise is reintroduced, the optimum weights become a function of $R_{ss}$. Noting that $R_{xx} = G R_{ss} G^T + \sigma^2 I$, the optimum weights are

$$W_{MSE} = (G^T)^{-1} - \sigma^2 R_{xx}^{-1} (G^T)^{-1}. \qquad (17)$$

For the overdetermined case, we get

$$W_{MSE} = (G^T)^{\#} - \sigma^2 R_{xx}^{-1} (G^T)^{\#}. \qquad (18)$$

For the underdetermined case, we can only say

$$W_{MSE} P_G = (G^T)^{\#} - \sigma^2 R_{xx}^{-1} (G^T)^{\#}. \qquad (19)$$

For each case, the matrix of optimum weights, or its projection, consists of an invariant term which does not depend on $R_{ss}$ plus a term which does depend on $R_{ss}$ and is scaled by the noise power.
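The split into an invariant term and a noise-scaled term can be verified directly. The sketch below writes $R_{xx}$ for the detector-current covariance $G R_{ss} G^T + \sigma^2 I$ and compares the Wiener-Hopf weights $R_{xx}^{-1} G R_{ss}$ with the right-hand sides of (17) and (18). The noise power, channel powers, and detector layouts are assumed values.

```python
import numpy as np

def crosstalk_matrix(det_pos, ch_pos):
    """Gain matrix G (rows: detectors, columns: channels) from the Gaussian profile."""
    offsets = np.subtract.outer(np.asarray(det_pos, float), np.asarray(ch_pos, float))
    return np.exp(-(0.5887 * offsets) ** 2) + 1e-5

rng = np.random.default_rng(5)
sigma2 = 1e-2                                   # receiver noise power (assumed)
ch_pos = 2.0 * np.arange(4)                     # 4 channels, 2 HWHM apart (assumed)
R_ss = np.diag(rng.uniform(0.2, 1.0, size=4))   # uncorrelated channels, random powers

def identity_gap(det_pos):
    """Largest difference between R_xx^{-1} G R_ss and the split form of (17)/(18)."""
    G = crosstalk_matrix(det_pos, ch_pos)
    R_xx = G @ R_ss @ G.T + sigma2 * np.eye(len(det_pos))
    W_direct = np.linalg.solve(R_xx, G @ R_ss)
    GT_pinv = np.linalg.pinv(G.T)               # equals inv(G.T) when G is square
    W_split = GT_pinv - sigma2 * np.linalg.solve(R_xx, GT_pinv)
    return np.abs(W_direct - W_split).max()

gap_square = identity_gap(ch_pos)                    # uniquely determined, (17)
gap_tall = identity_gap(1.0 * np.arange(-1, 8))      # overdetermined, (18)
print(gap_square, gap_tall)
```

Both gaps are at rounding level, so for these two cases the expressions are exact identities rather than approximations; only the second term carries the dependence on $R_{ss}$.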

The final step is to examine $\Delta J$, the excess MSE produced by using $W_p$ instead of $W_{MSE}$. If we expand $\Delta J$ into a power series and take only the first term we get

$$\Delta J \approx \sigma_t^4 \, \mathrm{Tr}\left\{ (GG^T)^{-1} R_{sst}^{-1} \left[ I - (\sigma_p/\sigma_t)^2 R_{sst} R_{ssp}^{-1} \right]^2 (GG^T)^{-1} \right\} \qquad (20)$$

where the subscripts $t$ and $p$ signify training sequence and pilot tones, respectively, and $\mathrm{Tr}(\cdot)$ is the trace operator (the sum of the elements of the main diagonal of the matrix). Note that the excess MSE is proportional to the square of the noise power. Contrast this with $J_{min}$, which is the MSE due to the optimum weights. If the expansion is taken to two terms (the same accuracy used above), then the minimum error, using the optimum $W_{MSE} = R_{xxt}^{-1} R_{xdt}$, is

$$J_{min} \approx \sigma_t^2 \, \mathrm{Tr}\left[ (GG^T)^{-1} \right]. \qquad (21)$$

We see that $J_{min}$ is proportional to the noise power, so for reasonably high SNR, say greater than 20 dB, the excess MSE will be negligible compared to the minimum MSE. The above analysis is not valid for an underdetermined receiver ($G$ is "flat"); however, the weights should not be very different if the uncompensated channels contribute very little crosstalk.
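The two scaling laws can be probed numerically without relying on the series expansion. This sketch computes the exact minimum MSE and the exact excess MSE $\mathrm{Tr}(\Delta W^T R_{xxt} \Delta W)$ for a uniquely determined seven-channel receiver. The data and pilot-tone covariances ($0.25I$ and $0.5I$), the equal noise power in both cases, and the choice of all seven channels as desired signals are assumptions made for illustration.

```python
import numpy as np

def crosstalk_matrix(det_pos, ch_pos):
    """Gain matrix G (rows: detectors, columns: channels) from the Gaussian profile."""
    offsets = np.subtract.outer(np.asarray(det_pos, float), np.asarray(ch_pos, float))
    return np.exp(-(0.5887 * offsets) ** 2) + 1e-5

n = 7
pos = 2.0 * np.arange(n)                 # 7 channels and detectors, 2 HWHM apart
G = crosstalk_matrix(pos, pos)
R_sst = 0.25 * np.eye(n)                 # data statistics (assumed)
R_ssp = 0.50 * np.eye(n)                 # pilot-tone statistics (assumed)

def weights(R_ss, sigma2):
    """Wiener-Hopf weights R_xx^{-1} G R_ss, returned together with R_xx."""
    R_xx = G @ R_ss @ G.T + sigma2 * np.eye(n)
    return np.linalg.solve(R_xx, G @ R_ss), R_xx

def j_min(sigma2):
    """Exact minimum MSE summed over all channels, using the data statistics."""
    W, _ = weights(R_sst, sigma2)
    return np.trace(R_sst - W.T @ G @ R_sst)

def excess_mse(sigma2):
    """Exact excess MSE when pilot-derived weights are applied to the data."""
    W_t, R_xxt = weights(R_sst, sigma2)
    W_p, _ = weights(R_ssp, sigma2)
    dW = W_p - W_t
    return np.trace(dW.T @ R_xxt @ dW)

sigma2 = 1e-4
ratio_21 = j_min(sigma2) / (sigma2 * np.trace(np.linalg.inv(G @ G.T)))
scaling = excess_mse(sigma2) / excess_mse(sigma2 / 10)
print(f"J_min / (21) = {ratio_21:.4f}, excess MSE ratio for 10x noise = {scaling:.1f}")
```

Under these assumptions the exact minimum MSE agrees with (21) to well within one percent, and a tenfold change in noise power changes the excess MSE by a factor close to 100, consistent with its dependence on the square of the noise power.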

B. Direct Estimation of the Gain Matrix G

The analysis in the preceding section shows that $(G^T)^{\#}$ makes an excellent approximation of the optimum weights in all cases.¹ This suggests that if $G$ could be directly measured or estimated, then the optimum weights could be computed directly. The pilot tone concept allows direct estimation of $G$, the elements of which can be measured with simple filters. Consider the shape of the spectrum of pilot tones in Fig. 3. The area of each spectral line is proportional to the amount of light coupled into that detector from a channel. In other words, the areas of the various pilot tones in one detector's spectrum compose a row of the gain matrix $G$. Passing the pilot tone spectra from each detector through a filter bank and measuring the outputs will determine the entire matrix $G$. The filter outputs could be monitored and the weights recomputed whenever $G$ changes. The measurements could be done with either analog or digital filters.
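One way such a digital filter bank could look is sketched below. It is an illustration under stated assumptions rather than the receiver described in the paper: each channel is assumed to carry a unit-amplitude sinusoidal pilot at its own FFT bin (the bin numbers are hypothetical), the "filter" is a correlation of each detector current with each reference tone over a block of `N` samples, and the detector noise level is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 5, 4096                                   # channels/detectors and block length (assumed)
pos = 2.0 * np.arange(n)                         # 2 HWHM spacing (assumed)
offsets = np.subtract.outer(pos, pos)
G_true = np.exp(-(0.5887 * offsets) ** 2) + 1e-5 # gain matrix to be estimated

bins = np.array([100, 110, 120, 130, 140])       # hypothetical pilot frequencies (FFT bins)
k = np.arange(N)
pilots = np.sin(2 * np.pi * np.outer(bins, k) / N)       # n x N reference tones
x = G_true @ pilots + 1e-3 * rng.normal(size=(n, N))     # detector currents plus noise

# Correlating detector i with tone j measures how much of channel j reaches detector i.
# The factor 2/N normalizes the sum of sin^2 over a whole number of cycles.
G_est = (2.0 / N) * x @ pilots.T

W = np.linalg.pinv(G_est.T)                      # weights computed directly from the estimate
residual = W.T @ G_true - np.eye(n)              # leftover crosstalk after weighting
print(np.abs(G_est - G_true).max(), np.abs(residual).max())
```

Because tones at distinct bins are orthogonal over the block, each correlation isolates a single element of $G$, and recomputing `W` whenever `G_est` changes follows the monitoring scheme described above.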

V. NONLINEAR CROSSTALK

Nonlinear crosstalk can come from two sources. One source is nonlinear material effects like carrier induced phase modulation (CIP), stimulated Raman scattering (SRS), stimulated Brillouin scattering (SBS), and four photon mixing (FPM). These four effects occur at high power levels [15] which are not reached in many applications and therefore are not treated in this paper. The simulations and analysis herein assume no nonlinearities due to material effects. The other source is the square-law nature of the photodiodes, which produces terms representing the product of the different channel light amplitudes; this source is discussed further in the sequel.

A. Square Law Nonlinearities

Square-law nonlinearities will occur at all power levels. The terms arise when the different channels beat against each other inside the photodiode. The nonlinear terms are modulated by frequencies equal to the difference between the optical frequencies of the two channels ($\Delta\omega$). If $\Delta\omega$ is large, the cross-channel beating terms will not be passed by the electronics in the optical receiver. If we assume that the highest practical bandwidth for electronics is 10 GHz, then, if $\Delta\omega$ is greater than 20 GHz, the nonlinear terms will be suppressed by the electronic components. At $\lambda$ = 1.55 μm, 20 GHz corresponds to 0.15 nm. This is a very close spacing, and many WDM applications require spacing far in excess of this due to the imperfect performance of the optical demultiplexer; in other words, at 0.15 nm the linear crosstalk alone dominates.
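The frequency-to-wavelength conversion quoted above follows from $\Delta\lambda \approx \lambda^2 \Delta f / c$, which holds when the spacing is small compared with the wavelength. A short check (the function name is hypothetical):

```python
C = 299_792_458.0   # speed of light in vacuum, m/s

def wavelength_spacing(delta_f_hz, wavelength_m):
    """Wavelength spacing equivalent to a small optical frequency spacing."""
    return wavelength_m ** 2 * delta_f_hz / C

spacing_nm = wavelength_spacing(20e9, 1.55e-6) * 1e9
print(f"20 GHz at 1.55 um is about {spacing_nm:.2f} nm")
```

The result is about 0.16 nm, in line with the roughly 0.15 nm figure used in the text.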

Square law nonlinearities should not be a problem in a properly designed WDM network; however, a malfunctioning laser may drift in frequency toward an adjacent channel. If the drifting laser collides with the desired channel there is nothing that can be done, because the linear crosstalk power from the drifting laser will be too large to cancel. However, the drifting channel may collide with another undesired channel. In that case the two channels may beat together to create significant nonlinear crosstalk energy.

¹The only time they would fail to do so would be if $R_{ss}$ does not have full rank, which should never occur. Even in this unlikely case $(G^T)^{\#}$ would still do an excellent job of eliminating the crosstalk; it would just mean that another weight vector could do a slightly better job.

B. Derivation of Square Law Terms

Assume that a “rogue” laser drifts in frequency until it nearly coincides in frequency with an adjacent laser so that a beating term is passed by the electronics. In order to analyze how this beating term impacts the algorithms we need to determine what the beating term looks like as a function of time and how the beating term is distributed across the detector array. The beating term, $S_b$, is added to the signal vector $S$, and a vector representing the distribution of the beating term, $G_b$, becomes a corresponding additional column of the crosstalk matrix $G$.

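The light amplitude for the $n$th laser channel is

$$a_n(t) = S_n(t) \cos[\omega_n t + \varphi_n(t)] \qquad (22)$$

where $S_n(t)$ is the modulating signal of the $n$th channel, $\omega_n$ is the laser frequency, and $\varphi_n(t)$ is a laser phase noise term which is uncorrelated from channel to channel. The rogue channel amplitude ($a_r$) and the “victim” channel amplitude ($a_v$) are now very close in frequency, so $\omega_r = \omega_v + \Delta\omega$, where $\Delta\omega$ is small enough to be passed by the receiver electronics. The beating term arises when $a_r$ and $a_v$ are multiplied together as follows:

$$a_r(t)\, a_v(t) = \tfrac{1}{2} S_r(t) S_v(t) \left\{ \cos[\Delta\omega t + \Delta\varphi(t)] + \cos[(\omega_r + \omega_v) t + \varphi_r(t) + \varphi_v(t)] \right\} \qquad (23)$$

where $\Delta\varphi(t) = \varphi_r(t) - \varphi_v(t)$. The term with the sum of the frequencies will be eliminated by the electronics since $\omega_r$ is on the order of $10^{15}$ rad/s, so

$$S_b = \tfrac{1}{2} S_r(t) S_v(t) \cos[\Delta\omega t + \Delta\varphi(t)] \qquad (24)$$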

is appended to the end of vector $S$.

The elements of the crosstalk gain matrix $G$ describe how the channel intensities are distributed across the detector array. Therefore the amplitude distribution can be described with terms equal to the square root of the elements of $G$. The beating signal $S_b$ is distributed across the detector array by

$$G_b(i) = \sqrt{G_r(i)\, G_v(i)}. \qquad (25)$$

In other words, each element of $G_b$ is the geometric mean of the corresponding elements of $G_r$ and $G_v$. The vectors $G_r$ and $G_v$ will be nearly identical because their laser wavelengths are nearly identical, differing by only a few hundredths of a nm. Consequently $G_b$ will also be nearly the same. For example, a demultiplexer with a 1 nm HWHM and an array of five detectors spaced at intervals of 2 nm, with $\lambda_r$ and $\lambda_v$ separated by 0.04 nm ($\Delta\omega$ = 5.3 GHz), produces the following detector distributions:

$$G_r = (\ldots,\ 0.9995,\ \ldots)^T, \qquad G_b = (\ldots,\ 0.9997,\ \ldots)^T, \qquad G_v = (\ldots,\ 1.0000,\ \ldots)^T$$

(only these peak entries and two outer entries, 0.0035 and 0.0037, are legible in the source).

Notice that they are nearly identical so that a single degree of freedom in weight space can do a good job of canceling all three signals. This is fortunate in the case of the pilot tone algorithm because the nonlinear signal may not be present in the pilot tone region of the spectrum so that the LMS algorithm cannot directly cancel it.
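The three distributions in this example can be regenerated from the Gaussian profile of (26) and the geometric-mean rule for $G_b$. The sketch assumes that the victim laser sits exactly on the middle detector and that the rogue laser is offset by +0.04 nm; neither placement is stated explicitly in the text.

```python
import math

def intensity(x):
    """Normalized optical intensity profile of (26); x in units of HWHM (1 nm here)."""
    return math.exp(-(0.5887 * x) ** 2) + 1e-5

det_pos = [-4.0, -2.0, 0.0, 2.0, 4.0]    # five detectors, 2 nm apart (nm)
lam_v, lam_r = 0.0, 0.04                 # victim and rogue positions (assumed, nm)

G_v = [intensity(d - lam_v) for d in det_pos]
G_r = [intensity(d - lam_r) for d in det_pos]
G_b = [math.sqrt(r * v) for r, v in zip(G_r, G_v)]   # geometric mean, element by element

for name, vec in (("G_r", G_r), ("G_b", G_b), ("G_v", G_v)):
    print(name, [round(g, 4) for g in vec])
```

Under these assumptions the middle entries round to 0.9995, 0.9997, and 1.0000, and the first entries of `G_r` and `G_b` round to 0.0035 and 0.0037, so the three vectors differ only in the third or fourth decimal place.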

The worst case occurs when the laser for a channel two channels away drifts toward the desired channel and collides with an adjacent channel. For example, assume that channel 4 is the desired channel and that channel 2 drifts in frequency and collides with channel 3, thus creating a nonlinear beating signal. This would couple the largest amount of beating signal into the detector for the desired channel.

VI. SIMULATION

Both algorithms were tested with a computer simulation. First, a simple model of a grating demultiplexer was created. The demultiplexer has a Gaussian “impulse response,” which means that monochromatic light emerging from the fiber forms a spot with a Gaussian intensity profile across the detector array. Changing the laser frequency causes the spot to move across the detector array; however, the shape of the profile remains constant. The intensity profile does not decay below -50 dB in order to simulate unintentional diffraction from the grating. The profile is normalized so that the peak is one and the half-width-half-maximum (HWHM) is equal to one. HWHM is the distance between the peak of the profile and the -3-dB point on the skirt of the electrical intensity profile, which should be roughly equivalent to the resolution of a grating demultiplexer. Channel spacing and detector array spacing in the simulation are specified in units of HWHM. In this way any grating demultiplexer can be simulated if one has knowledge of its measured or specified performance; e.g., a given crosstalk level of -12 dB between channels is simulated by specifying detector and channel spacings of two HWHM, because this places a detector at the 0.25 point on the Gaussian curve. The formula for the normalized optical intensity profile is

$$i(x) = \exp\left[-(0.5887x)^2\right] + 10^{-5}. \qquad (26)$$
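The profile is easy to evaluate directly. The sketch below assumes that electrical levels are obtained as $20\log_{10}$ of the optical intensity, which is consistent with the -3 dB and -12 dB figures quoted above; the function names are hypothetical.

```python
import math

FLOOR = 1e-5   # constant floor of the profile in (26)

def intensity(x):
    """Normalized optical intensity at a distance x (in HWHM) from the spot center."""
    return math.exp(-(0.5887 * x) ** 2) + FLOOR

def electrical_db(x):
    """Electrical level in dB, assuming photocurrent proportional to optical intensity."""
    return 20.0 * math.log10(intensity(x))

for x in (0.0, 1.0, 1.6, 2.0, 4.0):
    print(f"x = {x:3.1f} HWHM: intensity {intensity(x):.4f}, electrical {electrical_db(x):6.1f} dB")
```

At one HWHM the intensity is about 0.707 (-3 dB electrical), and at two HWHM it is about 0.25 (-12 dB electrical), matching the crosstalk example in the text.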

The laser bandwidth and modulation rate are considered to be small relative to the demultiplexer passband, so the channel spectra are not broadened relative to the impulse profile.

In all but one simulation only linear crosstalk is considered; this is reasonable if the channel spacing is much greater than the electronic bandwidths of each IM/DD receiver and the optical powers are not high enough to cause nonlinear effects like stimulated Raman scattering (see previous section). One simulation included square law nonlinearities in order to test their effect on the algorithms. Digital signals are assumed to be OOK and are used in all simulations except for a pilot tone simulation where some of the channels were intensity-modulated analog signals (to test the ability of the algorithm to operate in mixed format networks). Intersymbol interference is not modeled. All signals, including pilot tones, have unit amplitude. The dominant noise source is assumed to be receiver noise, which is modeled as a vector of additive zero mean white Gaussian noise. The noise power is equal in all detectors.

The bandpass and lowpass filters shown in Fig. 4 were assumed to be ideal, so the pilot tone signals and the data signals are perfectly separated. Finally, the LMS algorithm used an adaptive multidimensional step size and a discrete cosine transformation of the input data to speed convergence [16], [17].

Simulated decision-directed LMS filter output is shown in Fig. 5 for the case of 9 cancellation weights and 1.6 HWHM channel and detector separation. The network has 19 channels with channel 10 the desired channel. The LMS algorithm was initiated at iteration 200. Receiver noise power was set at -50 dB to ensure that the network was crosstalk-limited rather than noise-limited. Fig. 5(a) is the filter output and Fig. 5(b) is the output SCNR. With no cancellation, SCNR is 1.5 dB. After convergence it is about 31.5 dB. The SCNR using optimum weights from (11) is 32 dB.

Fig. 6 shows the required channel spacing as a function of SCNR. Again, a 19-channel system is modeled with the center channel, channel 10, as the desired channel. The curves represent the SCNR performance achieved using optimal MSE weights. The curves are plotted for no cancellation, 3 weights, 5 weights, and 9 weights. Note the large improvement that can be gained by just using two additional detectors to cancel the adjacent channels. Fig. 7 recasts the same data by showing how many channels could be fit in the 200-nm wide 1550-nm low loss fiber window for a given SCNR requirement. For this figure, a 1 nm HWHM is assumed for the demultiplexer. The symbols in Figs. 6 and 7 are simulation results for both the pilot tone and decision-directed algorithms. Note that both the pilot tone and decision-directed LMS converge to within a fraction of a dB of the optimum for all but very close channel spacings.

These figures demonstrate the potential of postdetection crosstalk cancellation to greatly increase the capacity of WDM networks. The required channel spacing to achieve a desired SCNR is reduced by about 30%, 45%, and 60% for 3, 5, and 9 weights, respectively.

Fig. 8 demonstrates the ability of the pilot tone algorithm to work with a mix of digital and analog formats. It shows the result of a simulation of the pilot tone LMS with a seven channel network. Channel 4 carries OOK digital data and channel 5 a simple analog signal. Fig. 8(a) and (b) show the output of the filter (marked "B" in Fig. 4) tuned to channels 4 and 5, respectively. Detector and channel spacings were 2 HWHM.

Fig. 6. Channel separation required to achieve a given SCNR level for different numbers of cancellation weights. Curves are lower bounds achieved by optimum weighting, symbols are results achieved by simulation (19 total channels, Gaussian passband shape, -50 dB minimum crosstalk level).

Fig. 7. Number of channels which can fit into the 200 nm-wide 1500-nm transmission window as a function of SCNR for different numbers of cancellation weights. Curves are upper bounds achieved by optimum weighting, symbols are results achieved by simulation (19 total channels, Gaussian passband shape, -50 dB minimum crosstalk level, 1 nm grating resolution).

Over time the laser frequencies may drift due to temperature effects and component aging. In addition, the frequency response of a grating-based demultiplexer may vary with temperature. Because the pilot tone LMS algorithm runs unconstrained, no assumptions are needed about which detectors are, or are not, receiving the desired signal energy. Generation of a sine wave at the proper frequency will cause the weights to configure themselves to pick up the desired channel no matter where it is. The decision-directed algorithm can accommodate limited laser drift. However, if the desired laser moves so far that another channel is a better match to the constraint vector, the algorithm will lock on to that channel instead of the desired signal. Fig. 9 shows the response of each algorithm to a drifting laser; again, a seven channel, two HWHM network was simulated. The desired channel (channel 4) was moved to different center frequencies and each algorithm was tested. Note that the pilot tone algorithm converges to the optimum weights in all cases, while the decision-directed algorithm only succeeds while the desired laser remains between the two neighboring channels. When the offset gets close to 2 HWHM, the decision-directed algorithm fails and converges to channel 3; it also converges to channel 3 when the desired channel moves halfway between channels 5 and 6. The pilot tone algorithm converges successfully in these cases.

Fig. 8. Pilot tone filter convergence examples for a simulated network with seven channels and seven detectors. (a) Filter converging to digital OOK data on channel 4, (b) converging to analog data on channel 3. Filter initiates at iteration 100.

Fig. 9. Signal-to-crosstalk-and-noise ratio (SCNR) of output data for a laser drifting in frequency, plotted against the amount of drift of the desired channel (HWHM). The solid line is the SCNR due to optimum weights and the symbols are the SCNR produced by simulations of the algorithms. The network has 7 channels and 7 detectors with a nominal spacing of 2 HWHM. Channel 4 is the desired channel. Note that the pilot tone algorithm can accommodate all drift amounts while the decision-directed algorithm fails when channel 4 drifts near or past channel 5, located at 2 HWHM.
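The self-configuring behavior of the pilot tone approach can be illustrated with a plain LMS loop. This is a deliberately simplified sketch, not the simulation reported here: it uses a fixed scalar step size instead of the adaptive multidimensional step size and DCT preprocessing, omits receiver noise, assumes sinusoidal pilots at hypothetical normalized frequencies, and measures only the signal-to-crosstalk ratio of the converged weights. The desired channel is drifted by 0.5 HWHM.

```python
import numpy as np

def crosstalk_matrix(det_pos, ch_pos):
    """Gain matrix G (rows: detectors, columns: channels) from the Gaussian profile."""
    offsets = np.subtract.outer(np.asarray(det_pos, float), np.asarray(ch_pos, float))
    return np.exp(-(0.5887 * offsets) ** 2) + 1e-5

def scr_db(w, G, desired):
    """Signal-to-crosstalk ratio (dB) of the channel responses w^T G."""
    r = w @ G
    interference = np.sum(np.delete(r, desired) ** 2)
    return 10.0 * np.log10(r[desired] ** 2 / interference)

n, desired, drift = 7, 3, 0.5            # seven channels; index 3 is "channel 4"
det_pos = 2.0 * np.arange(n)             # detectors at the nominal 2 HWHM grid
ch_pos = det_pos.copy()
ch_pos[desired] += drift                 # desired laser drifts toward its neighbor
G = crosstalk_matrix(det_pos, ch_pos)

freqs = 0.05 + 0.01 * np.arange(n)       # pilot frequencies, cycles/sample (hypothetical)
mu, n_iter = 0.05, 20000                 # fixed LMS step size and run length (assumed)
w = np.zeros(n)
for k in range(n_iter):
    s = np.sin(2 * np.pi * freqs * k)    # pilot tones of all channels at sample k
    x = G @ s                            # pilot-band detector currents
    d = s[desired]                       # locally generated reference sine wave
    e = d - w @ x                        # error between reference and filter output
    w += mu * e * x                      # LMS weight update

w_single = np.zeros(n)
w_single[desired] = 1.0                  # no cancellation: nominal detector only
scr_before = scr_db(w_single, G, desired)
scr_after = scr_db(w, G, desired)
print(f"SCR without cancellation {scr_before:.1f} dB, after LMS {scr_after:.1f} dB")
```

The loop is never told where the desired laser is; the reference sine wave alone steers the weights toward the drifted channel, which is the property the pilot tone algorithm relies on.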

Simulations of the decision-directed and pilot tone LMS algorithms were run for the case of a nonlinear beating term. A seven channel network is simulated with channel 4 as the desired channel. Channel 2 drifts in frequency and collides with channel 3, thus creating a nonlinear beating signal. For the pilot tone algorithm, the worst case was assumed: there was no nonlinear energy present in the pilot tone spectrum, so the algorithm is "blind" to the existence of the beating term. Fig. 10 shows the linear and nonlinear crosstalk power present in the filter output during filter convergence. Fig. 10(a) and (b) show the output for the decision-directed and pilot tone algorithms, respectively. Note that both algorithms cancel both types of crosstalk.

Fig. 10. Algorithm performance with a nonlinear beating term between channels 2 and 3. Note that both algorithms cancel both the linear and nonlinear interference. Seven channel system separated by 2 HWHM. Receiver noise is at -47 dB.

VII. CONCLUSION

Grating-based demultiplexers make available an array of channel signals. In this paper we marry these signals with a mature technology, adaptive filters using the LMS algorithm. Our simulations show that this inexpensive signal processing technique in the receiver can more than double the usable capacity of a WDM optical link. Two algorithms were analyzed. The decision-directed algorithm cancels crosstalk between OOK signals without any cooperation of the transmitter. The pilot-tone algorithm requires modification of the transmit signals but is more flexible. It works with any type of intensity modulated data, does not require knowledge of the laser wavelengths, and has a greater ability to follow a drifting laser.

APPENDIX

This appendix has three parts. First we examine $W_{MSE}$ as a function of $R_{ss}$ for a noiseless overdetermined receiver and underdetermined receiver. Second we reintroduce receiver noise and show that the MSE weights now contain a perturbation which depends on $R_{ss}$. Finally we develop an approximation for the excess MSE which is generated when weights developed using pilot tones are applied to the data signals.

Overdetermined: With the receiver noise neglected, the covariance matrix, $R_{xx}$, becomes $G R_{ss} G^T$. It is singular and therefore its inverse does not exist, so the Wiener-Hopf equation must be solved using the pseudo-inverse [18].

Consider the general linear relation $Ax = b$, where the objective is to solve for $x$. If $A$ is singular then there may be no solution. However, there always exists an $x$ that is best in the least squares sense: the $x$ such that $Ax$ is as close to $b$ as possible, i.e., the $x$ that minimizes $\|Ax - b\|$. Although this $x$ may not be unique (there may be an infinite number), the added restriction that the norm of $x$ must be as small as possible will make $x$ unique. The pseudo-inverse, denoted as $A^{\#}$, solves for this $x$:

$$x = A^{\#} b. \qquad (A1)$$

Note that the pseudo-inverse exists for nonsquare matrices. In the case of an $n \times m$ matrix the pseudo-inverse will be $m \times n$.

The pseudo-inverse is described in many linear algebra texts; some fundamental properties that we need are listed below.

1) $A^{\#} = A^T (AA^T)^{-1}$, if $AA^T$ is nonsingular.
2) $A^{\#}A = P_A$, where $P_A$ is the projection operator onto the space spanned by the rows of $A$.
3) $AA^{\#} = P_{A^T}$, where $P_{A^T}$ is the projection onto the space spanned by the columns of $A$.
4) $(AB)^{\#} = B^{\#}A^{\#}$, if and only if the space spanned by the rows of $A$ is the same as the space spanned by the columns of $B$.

If either the row or column space has full rank then the corresponding projection operator is the identity matrix $I$. If $A$ is flat and has full row rank then $AA^{\#} = I$; if $A$ is tall and has full column rank then $A^{\#}A = I$.
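These properties can be spot-checked with NumPy's `numpy.linalg.pinv`. The sketch uses a random $3 \times 6$ matrix as the flat example and its transpose as the tall one. Property 1 is checked in the form $A^{\#} = A^T(AA^T)^{-1}$, and the minimum-norm least-squares property is checked against `numpy.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(3)
A_flat = rng.normal(size=(3, 6))         # full row rank with probability one
A_tall = A_flat.T                        # full column rank
pinv = np.linalg.pinv

# Property 1: closed form when A A^T is nonsingular.
prop1 = np.allclose(pinv(A_flat), A_flat.T @ np.linalg.inv(A_flat @ A_flat.T))

# Property 2: A# A is a symmetric, idempotent projector that leaves the rows of A unchanged.
P_rows = pinv(A_flat) @ A_flat
prop2 = (np.allclose(P_rows @ P_rows, P_rows) and np.allclose(P_rows, P_rows.T)
         and np.allclose(A_flat @ P_rows, A_flat))

# Property 3: A A# is a projector that leaves the columns of A unchanged.
P_cols = A_tall @ pinv(A_tall)
prop3 = np.allclose(P_cols @ P_cols, P_cols) and np.allclose(P_cols @ A_tall, A_tall)

# Full-rank special cases: the projector collapses to the identity.
flat_identity = np.allclose(A_flat @ pinv(A_flat), np.eye(3))
tall_identity = np.allclose(pinv(A_tall) @ A_tall, np.eye(3))

# x = A# b is the minimum-norm least-squares solution of A x = b.
b = rng.normal(size=3)
x_pinv = pinv(A_flat) @ b
x_lstsq = np.linalg.lstsq(A_flat, b, rcond=None)[0]
min_norm = np.allclose(x_pinv, x_lstsq)

print(prop1, prop2, prop3, flat_identity, tall_identity, min_norm)
```

For the flat matrix the $6 \times 6$ projector `P_rows` is not the identity, which is the situation that arises for $P_G$ in the underdetermined receiver.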

Using pseudo-inverses, the solution to the Wiener-Hopf equation for the overdetermined noise-free case becomes

$$W_{MSE} = (G R_{ss} G^T)^{\#} G R_{ss}. \qquad (A2)$$

If $R_{ss}$ is invertible and $G$ has full column rank (true if all lasers are transmitting at different frequencies) then the conditions for Property 4 hold and $(G R_{ss} G^T)^{\#} = (G^T)^{\#} R_{ss}^{-1} G^{\#}$, so

$$W_{MSE} = (G^T)^{\#} R_{ss}^{-1} G^{\#} G R_{ss}. \qquad (A3)$$

When $G$ is tall, the quantity $G^{\#}G = I$, so (A3) is rewritten as

$$W_{MSE} = (G^T)^{\#}. \qquad (A4)$$

This is the same result as in the invertible case except that the pseudo-inverse replaces the actual inverse. Again note that the optimum weights do not depend on $R_{ss}$, so the pilot tone weights would be optimum for the data also.

Underdetermined: In this case, $G R_{ss} G^T$ is invertible, so the Wiener-Hopf solution becomes

$$W_{MSE} = (G R_{ss} G^T)^{-1} G R_{ss}. \qquad (A5)$$

Property 4 does not hold in this case, but we can still postmultiply both sides of (A5) by the term $G^T(G^T)^{\#}$. However, when $G$ is flat this is not $I$ but $P_G$, the projection operator onto the space spanned by the rows of $G$. The result is

$$W_{MSE} P_G = (G^T)^{\#}. \qquad (A6)$$

So the optimum weights do vary with $R_{ss}$; however, the projection of the weights onto the space spanned by the rows of $G$ is independent of $R_{ss}$ and is equal to $(G^T)^{\#}$.

We will now reintroduce receiver noise. The first step is to find an expression for $R_{xx}^{-1}$, the inverse of the covariance matrix of the detector currents. With receiver thermal noise added, the covariance matrix has the form $R_{xx} = R + \sigma^2 I$, where $R = G R_{ss} G^T$. $R$ is symmetric ($R = R^T$) and positive semi-definite ($W^T R W \geq 0$ for all $W$ not equal to 0), so all of its eigenvalues are greater than or equal to zero and all of the eigenvectors are mutually orthogonal. Additionally, for the underdetermined and uniquely determined networks, $R$ is nonsingular and has a complete set of strictly positive eigenvalues. $R$ can then be diagonalized by an orthonormal matrix $Q$ (all columns are mutually orthogonal with unit magnitude and $Q^{-1} = Q^T$):

$$R = Q \Lambda Q^T. \qquad (A7)$$

The columns of $Q$ are the eigenvectors of $R$ and $\Lambda$ is a diagonal matrix with the diagonal elements equal to the corresponding eigenvalues. $R_{xx}$ is now

$$R_{xx} = Q(\Lambda + \sigma^2 I)Q^T. \qquad (A8)$$

We now see that $R_{xx}$ has the same eigenvectors as $R$ and its eigenvalues have increased by the amount $\sigma^2$. So receiver noise does not change the eigenvectors, only the eigenvalues. The inverse of $R_{xx}$ is

$$R_{xx}^{-1} = Q(\Lambda + \sigma^2 I)^{-1} Q^T \qquad (A9)$$

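where the $i$th diagonal element of $(\Lambda + \sigma^2 I)^{-1}$ is $1/(\lambda_i + \sigma^2)$. Now if all of the eigenvalues of $R$ are greater than $\sigma^2$, the diagonal terms can be expanded into a power series

$$\frac{1}{\lambda_i + \sigma^2} = \frac{1}{\lambda_i} - \frac{\sigma^2}{\lambda_i^2} + \frac{\sigma^4}{\lambda_i^3} - \cdots. \qquad (A10)$$

Using these expansions, $R_{xx}^{-1}$ can also be expanded as

$$R_{xx}^{-1} = R^{-1} - \sigma^2 R^{-2} + \sigma^4 R^{-3} - \sigma^6 R^{-4} + \cdots = \sum_{n=0}^{\infty} (-1)^n \sigma^{2n} R^{-(n+1)} \qquad (A11a)$$

$$= (I - \sigma^2 R_{xx}^{-1}) R^{-1}. \qquad (A11b)$$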
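The closed form (A11b) does not actually require the series to converge. A short derivation, writing $R_{xx}$ for the detector-current covariance and using only $R_{xx} = R + \sigma^2 I$ with $R$ nonsingular, is:

$$
\begin{aligned}
R_{xx}^{-1} R &= R_{xx}^{-1}\,(R_{xx} - \sigma^2 I) = I - \sigma^2 R_{xx}^{-1}, \\
R_{xx}^{-1} &= (I - \sigma^2 R_{xx}^{-1})\, R^{-1}.
\end{aligned}
$$

Substituting the right-hand side into itself repeatedly regenerates the alternating series term by term: one substitution gives $R^{-1} - \sigma^2 R_{xx}^{-1} R^{-2}$, a second gives $R^{-1} - \sigma^2 R^{-2} + \sigma^4 R_{xx}^{-1} R^{-3}$, and so on, where the fact that $R_{xx}^{-1}$ and $R^{-1}$ commute (they share eigenvectors) has been used.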

For the overdetermined case some of the eigenvalues of $R$ are zero. The space spanned by $R_{xx}$ can be broken into two subspaces. The first subspace is spanned by the columns of $R$ and is called the signal subspace; it is the same space that is spanned by the columns of $G$. The second subspace is the null space of $R$ and is called the noise subspace. The signal subspace is spanned by the eigenvectors associated with the nonzero eigenvalues of $R$; they are also eigenvectors of $R_{xx}$ with eigenvalues $\lambda_i + \sigma^2$. The noise subspace is spanned by the remaining eigenvectors of $R_{xx}$, which are associated with the eigenvalues $1/\sigma^2$ of $R_{xx}^{-1}$. Equation (A9) can be rewritten as

$$R_{xx}^{-1} = Q_n Q_n^T/\sigma^2 + Q_s \Lambda_s^{-1} Q_s^T \qquad (A12)$$

where the columns of $Q_s$ are the eigenvectors in the signal subspace, and the columns of $Q_n$ are the eigenvectors in the noise subspace. The matrix $\Lambda_s$ is square and diagonal and the $i$th diagonal element is the eigenvalue $\lambda_i + \sigma^2$.

The second term of (A12) can be expanded using the same power series as in the previous analysis, producing

$$R_{xx}^{-1} = Q_n Q_n^T/\sigma^2 + \sum_{n=0}^{\infty} (-1)^n \sigma^{2n} (R^{\#})^{n+1} = Q_n Q_n^T/\sigma^2 + R^{\#} - \sigma^2 (R^{\#})^2 + \sigma^4 (R^{\#})^3 - \sigma^6 (R^{\#})^4 + \cdots \qquad (A13)$$

which is similar to the result in (A11a) except for an additional term due to the noise subspace.

With $R_{xx}^{-1}$ determined, we are now prepared to return to the Wiener-Hopf equation and look at the optimum weights with noise added. For the uniquely determined receiver we get

$$W_{MSE} = R_{xx}^{-1} G R_{ss} = R_{xx}^{-1} G R_{ss} G^T (G^T)^{-1} = (I - \sigma^2 R_{xx}^{-1}) R^{-1} R\, (G^T)^{-1} = (G^T)^{-1} - \sigma^2 R_{xx}^{-1} (G^T)^{-1} \qquad (A14)$$

and for the overdetermined case we get

$$W_{MSE} = (Q_n Q_n^T) G R_{ss}/\sigma^2 + (G^T)^{\#} - \sigma^2 R_{xx}^{-1} (G^T)^{\#} = (G^T)^{\#} - \sigma^2 R_{xx}^{-1} (G^T)^{\#}. \qquad (A15)$$

The first term in (A15) is zero because $Q_n Q_n^T$ is the projection operator into the noise subspace which, by definition, is orthogonal to all of the columns of $G$, so $(Q_n Q_n^T)G = 0$. For the underdetermined case, we can postmultiply $W_{MSE}$ by $G^T(G^T)^{\#} = P_G$ to give

$$W_{MSE} P_G = R_{xx}^{-1} G R_{ss} G^T (G^T)^{\#} = (G^T)^{\#} - \sigma^2 R_{xx}^{-1} (G^T)^{\#}. \qquad (A16)$$

For each case, the matrix of optimum weights, or its projection, consists of an invariant term which does not depend on $R_{ss}$ plus a term which does depend on $R_{ss}$ and is scaled by the noise power.

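The final step is to examine the excess MSE produced by using $W_p$ instead of $W_{MSE}$. To do so we need $\Delta W$, the difference between $W_p$ and $W_{MSE}$:

$$\Delta W = W_p - W_{MSE} = (\sigma_t^2 R_{xxt}^{-1} - \sigma_p^2 R_{xxp}^{-1})(G^T)^{\#} \qquad (A17)$$

where the subscripts $t$ and $p$ signify training sequence and pilot tones, respectively. The excess mean square error, $\Delta J$, is the quantity $\mathrm{Tr}(\Delta W^T R_{xxt} \Delta W)$ [14], where $\mathrm{Tr}(\cdot)$ is the trace operator (the sum of the elements of the main diagonal of the matrix):

$$\Delta J = \mathrm{Tr}(\Delta W^T R_{xxt} \Delta W) = \mathrm{Tr}\left\{ G^{\#} \left[ \sigma_t^4 R_{xxt}^{-1} + \sigma_p^4 (R_{xxp} R_{xxt}^{-1} R_{xxp})^{-1} - 2\sigma_t^2 \sigma_p^2 R_{xxp}^{-1} \right] (G^T)^{\#} \right\}. \qquad (A18)$$

If we expand $R_{xxt}^{-1}$ and $R_{xxp}^{-1}$ using equation (A11a) and take only the first term we get

$$\Delta J \approx \sigma_t^4 \, \mathrm{Tr}\left\{ (GG^T)^{-1} R_{sst}^{-1} \left[ I - (\sigma_p/\sigma_t)^2 R_{sst} R_{ssp}^{-1} \right]^2 (GG^T)^{-1} \right\}. \qquad (A19)$$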


REFERENCES

[1] A. M. Hill and D. B. Payne, “Linear crosstalk in wavelength-division-multiplexed optical fiber transmission systems,” J. Lightwave Technol., vol. 3, no. 3, pp. 643-651, June 1985.

[2] P. A. Humblet and W. M. Hamby, “Crosstalk analysis and filter optimization of single- and double-cavity Fabry-Perot filters,” IEEE J. Select. Areas Commun., vol. 8, no. 6, pp. 1095-1107, Aug. 1990.

[3] J. B. D. Soole, H. P. Scherer et al., “Spectrometer on chip: A monolithic WDM component,” in Opt. Fiber Commun. Conf. Tech. Dig., Feb. 1992, p. 123.

[4] C. Cremer et al., “Grating spectrograph in InGaAsP/InP for dense wavelength division multiplexing,” Appl. Phys. Lett., vol. 59, no. 6, pp. 627-629, Aug. 5, 1991.

[5] S. Aisawa et al., “Neural-processing-type optical WDM demultiplexer,” J. Lightwave Technol., vol. 11, no. 12, pp. 2130-2139, Dec. 1993.

[6] B. Widrow et al., “Stationary and nonstationary learning characteristics of the LMS adaptive filter,” Proc. IEEE, vol. 64, pp. 1151-1162, Aug. 1976.

[7] J. Salz, “Digital transmission over cross-coupled linear channels,” AT&T Tech. J., vol. 64, no. 6, pp. 1147-1159, July-Aug. 1985.

[8] M. L. Hoenig, K. Steiglitz, and B. Gonipath, “Multichannel signal processing for data communications in the presence of crosstalk,” IEEE Trans. Commun., vol. 38, no. 4, pp. 551-558, Apr. 1990.

[9] D. Hargreaves, P. E. Jessop, and S. Haykin, “Neural network assisted wavelength demultiplexer for fiber optic communications,” in Proc. World Congress on Neural Networks, vol. 1, July 1993.

[10] M. J. Minardi and M. A. Ingram, “Adaptive crosstalk cancellation in dense wavelength division optical networks,” Electron. Lett., vol. 28, no. 17, pp. 1621-1622, Aug. 13, 1992.

[11] O. L. Frost, “An algorithm for linearly constrained adaptive array processing,” Proc. IEEE, vol. 60, no. 8, pp. 926-935, Aug. 1972.

[12] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980, p. 103.

[13] D. A. Way et al., “Self routing WDM high-capacity SONET ring network,” in Opt. Fiber Commun. Conf. Tech. Dig., Feb. 1992, p. 86.

[14] J. R. Treichler, C. R. Johnson, Jr., and M. G. Larimore, Theory and Design of Adaptive Filters. New York: Wiley, 1987, p. 65.

[15] A. R. Chraplyvy, “Limitations on lightwave communications imposed by optical-fiber nonlinearities,” J. Lightwave Technol., vol. 8, no. 10, pp. 1548-1557, Oct. 1990.

[16] M. J. Minardi, M. A. Ingram, and J. S. Goldstein, “Transform domain techniques for adaptive crosstalk cancellation in dense wavelength division multiplexing optical networks,” in Proc. 1993 Military Commun. Conf. (MILCOM ’93), Oct. 11-14, 1993, vol. 1, pp. 101-105.

[17] J. S. Goldstein, E. J. Holder, and M. A. Ingram, “Reduced complexity adaptive structures for jam-resistant satellite communications,” in Proc. 1993 Military Commun. Conf. (MILCOM ’93), Oct. 11-14, 1993, vol. 3, pp. 1033-1037.

[18] L. L. Scharf, Statistical Signal Processing. Reading, MA: Addison-Wesley, 1991, p. 49.

M. J. Minardi (S’89-M’92-S’92-M’93) was born in Los Angeles in 1957 and grew up in Dayton, OH. He received the B.E.E. and the B.S. in mathematics from the University of Dayton in 1979 and the M.S.E.E. from Georgia Tech in 1981.

He worked at Georgia Tech as a Research Engineer in the area of electronic countermeasures from 1979 to 1983 and 1989 to 1993. From 1983 to 1988 he worked for GTE Government Systems Division as a Radar Systems Engineer at the ALTAIR deep space tracking radar located at the Kwajalein Missile Range in the Marshall Islands. In 1993 he went to work for the Radar Branch of the USAF Wright Laboratories, at which time he entered the Ph.D. program full time at Georgia Tech under the USAF “Palace Knight” program.

M. A. Ingram (S’84-M’85-S’86-M’89) was born in Jacksonville, FL, on October 29, 1959. She received the B.E.E. and Ph.D. degrees in electrical engineering from the Georgia Institute of Technology in 1983 and 1989, respectively.

She joined the faculty in the school of Electrical and Computer Engineering at Georgia Institute of Technology in 1989. Her research there has been focused in the areas of statistical signal modeling and communication theory. Application areas include optical transmission systems and array signal processing. The array processing projects include low complexity adaptive algorithms for array processors and adaptive cancelation of terrain-scattered interference in radar arrays. Optical transmission research includes analysis of amplifier noise in nonlinear optical devices. From 1983 to 1986 she was a Research Engineer with the Georgia Tech Research Institute, analyzing and designing electronic countermeasures for radar systems.