Upload
muralidhar
View
213
Download
0
Embed Size (px)
Citation preview
Sensor Integration by Joint
PDF Construction using the
Exponential Family
STEVEN KAY, Fellow, IEEE
QUAN DING, Member, IEEE
University of Rhode Island
MURALIDHAR RANGASWAMY, Fellow, IEEE
Air Force Research Laboratory
We investigate the problem of sensor integration to combine
all the available information in a multi-sensor setting from a
statistical standpoint. Specifically, we propose a novel method
of constructing the joint probability density function (pdf) of
the measurements from all the sensors based on the exponential
family and small signal assumption. The constructed pdf only
requires knowledge of the joint pdf under a reference hypothesis
and, hence, is useful in many practical cases. Examples and
simulation results show that our method requires less information
compared with existing methods but attains comparable
detection/classification performance.
Manuscript received January 13, 2011; revised July 10, 2011 and
March 19, 2012; released for publication July 11, 2012.
IEEE Log No. T-AES/49/1/944371.
Refereeing of this contribution was handled by P. Willet.
This work was supported by the Sensors Directorate of the
Air Force Research Laboratory (AFRL/RYRT) under Contract
FA8650-08-D-1303 to Dynetics, Inc.
Authors’ addresses: S. Kay and Q. Ding, Department of Electrical,
Computer, and Biomedical Engineering, University of Rhode
Island, 4 East Alumni Ave., Kingston, RI 02881, E-mail:
([email protected]); M. Rangaswamy, Sensors Directorate, Air Force
Research Laboratory (AFRL/RYRT), WPAFB, OH 45433.
0018-9251/13/$26.00 c° 2013 IEEE
I. INTRODUCTION
Distributed systems and information fusion have
been widely studied and used in engineering, finance,
and scientific research. Such applications are to
radar, sonar, biomedical analysis, stock prediction,
weather forecasting, and chemical, biological,
radiological, and nuclear (CBRN) detection, to
name a few. If the joint probability density function
(pdf) under each candidate hypothesis is known,
we would easily obtain the optimal performance
by the Neyman-Pearson rule (using frequentist
inference) for detection (binary hypothesis testing)
or by the maximum a posteriori probability (MAP)
rule (using Bayesian inference) for both detection
and classification (multiple hypothesis testing)
[1]. However, in practice, this information may
not be available. This usually happens when the
dimensionality of the sample space is high and when
we do not have enough training samples to have an
accurate estimate of the joint pdf. The problem is
exacerbated by onerous environmental and systems
constraints in radar and sonar applications. This is
also recognized as the “curse of dimensionality” in
pattern recognition and machine learning. Hence, it
is important to efficiently approximate the unknown
joint pdf using limited training data. One common
approach is to assume that the measurements from
different sensors are conditionally independent
[2, 3]. This approach has been widely used due to its
simplicity, since the joint pdf is then the product of
the marginal pdfs. This is also known as the “product
rule” in combining classifiers [4]. In spite of its
popularity, the independence assumption may not be a
good one if the measurements are actually correlated.
Furthermore, as stated in [4], the product rule is
severe because “it is sufficient for a single recognition
engine to inhibit a particular interpretation by
outputting a close-to-zero probability for it.” Hence,
researchers have studied other methods that consider
the correlation among the measurements. However,
the problem does not have a unique solution when
the data is non-Gaussian. A copula-based framework
is proposed in [5], [6] to construct the joint pdf. The
exponentially embedded families (EEFs) are used in
[7] to estimate the joint pdf that is asymptotically
closest to the true one in Kullback-Leibler (KL)
divergence.
Note that the above methods all require the
knowledge of marginal pdfs. In this paper we consider
the case when the marginal pdfs are not available or
accurate, which can happen due to a high-dimensional
sample space and insufficient training data. We
present a new way of constructing the joint pdf
without the knowledge of marginal pdfs but only a
reference pdf. In our method this reference pdf is the
pdf under the null hypothesis H0, and we assumethat it is completely known. The constructed joint
580 IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. 49, NO. 1 JANUARY 2013
pdf takes the form of the exponential family and
incorporates all the available information. Based on
moment matching the parameters in the constructed
joint pdf are equivalent to the maximum likelihood
estimator (MLE) [8] of the unknown parameters of the
exponential family. Hence, they can be easily found
via convex optimization based on the properties of
the exponential family. Since there is no Gaussian
distribution assumption on the reference pdf, this
method can be very useful when the underlying
distributions are non-Gaussian. We start with the
detection problem, and then we extend our method
to the classification problem. For detection it is shown
that under some conditions, our detection statistics
are the same as the clairvoyant generalized likelihood
ratio test (GLRT). For classification our classifier
also has the same performance as the estimated
MAP classifier. Both the clairvoyant GLRT and the
estimated MAP classifier assume that the true pdfs
under each candidate hypothesis are known except for
the usual unknown parameters.
The paper is organized as follows. In Section II
we introduce a distributed detection/classification
problem. In Section III we construct the joint pdf by
an exponential family and apply it to the problem in
Section II. The KL divergence between the true pdf
and the constructed pdf is examined in Section IV,
and the result shows that within the exponential
family, the constructed pdf is asymptotically the
closest one to the true pdf. Examples for distributed
detection are given in Section V, and examples for
distributed classification are given in Section VI.
Simulation results to compare the performance of
our method with existing methods are shown in
Section VII. In Section VIII we draw conclusions.
II. PROBLEM STATEMENT
Consider the distributed detection/classification
problem when we observe the outputs of two sensors
T1(x) and T2(x), which are transformations of the
underlying samples x. The latter are unobservable at
the central processor as shown in Fig. 1. We choose
two sensors for simplicity. All the results in this
paper are valid for multiple sensors. For detection we
want to distinguish between two hypotheses H0 andH1 based on the outputs of the two sensors, and forclassification, we have M candidate hypotheses Hi fori= 1,2, : : : ,M.
Assume that we have sufficient training data
T(n)1 (x)s and T(n)2 (x)s under H0, i.e., when there is no
signal present. Hence, we have a good estimate of
the joint pdf of T1 and T2 under H0 [9], and thus,we assume pT1,T2 (t1,t2;H0) is completely known.Under H1 for detection or Hi for i= 1,2, : : : ,Mfor classification when a signal is present, we may
not even have enough training data to estimate the
marginal pdfs at each sensor, let alone the joint pdf.
Fig. 1. Distributed detection/classification system with two
sensors.
This is especially the case in the radar scenario, where
the target is present for only a small portion of the
time. So our goal is to use the available information
to construct an appropriate pT1,T2 (t1,t2;H1) under H1for detection or pT1,T2 (t1,t2;Hi) under each Hi forclassification. A simple illustration is shown in Fig. 1.
III. JOINT PDF CONSTRUCTION BY EXPONENTIALFAMILY AND ITS APPLICATION IN DISTRIBUTEDSYSTEMS
First, we consider the detection problem, where we
wish to construct pT1,T2 (t1,t2;H1). The result is thenextended to the classification problem.
To simplify the notation let
T=
·T1
T2
¸so that we can write the joint pdf pT1,T2 (t1,t2;Hi) aspT(t;Hi) for i= 0,1.Since pT(t;H1) cannot be uniquely specified based
on pT(t;H0), we assume that 1) pT(t;H1) is closeto pT(t;H0) (small signal assumption), and 2) theexpected value of T under H1 or E1(T) is known.The reason that we assume small signal is because,
in practice, we are really interested in the small
signal case since for large signals, even a nonoptimal
detector would have acceptable performance. Now, we
want to find a pdf pT(t) such that the KL divergence
D(pT(t)kpT(t;H0)) =ZpT(t) ln
pT(t)
pT(t;H0)dt (1)
KAY, ET AL.: SENSOR INTEGRATION BY JOINT PDF CONSTRUCTION USING THE EXPONENTIAL FAMILY 581
is minimized with the constraint that
Ep(T) = E1(T): (2)
Then the pdf pT(t) is used as our constructed pdf of Tunder H1. Note that the constraint in (2) is consideredas moment matching since the constructed pdf has the
same moment as the true pdf. Here, we consider the
KL divergence because by Stein’s lemma [10], the KL
divergence determines the asymptotic performance
for detection. An extended result to classification
has been recently presented in [11]. Therefore, this
is a worst case approach obtained by minimizing
D(pT(t)kpT(t;H0)).The Kullback theorem in [12] shows that the
solution of the above problem is
pT(t) = exp[μTt¡K(μ) + lnpT(t;H0)] (3)
where
K(μ) = lnE0[exp(μTT)] = ln
Zt
exp(μTT)pT(t;H0)dt
(4)
is the cumulant generating function of pT(t;H0), and itnormalizes the pdf to integrate to one. Here, μ are notfree parameters, and it has to satisfy the constraint in
(2). It can be easily shown that
@K(μ)
@μ= Ep(T) (5)
where
@K(μ)
@μ=
"@K(μ)
@μ1
@K(μ)
@μ2: : :@K(μ)
@μp
#Twith p being the length of μ. Therefore, the constraintin (2) is equivalent to
@K(μ)
@μ= E1(T): (6)
Note that in practice, we may not know E1(T), andtherefore, we use t as an estimate of E1(T). Finally,the constructed pdf under H1 is given by (3), where μsatisfies
@K(μ)
@μ= t: (7)
Now suppose we have an exponential family
parameterized by μ as
pT(t;μ) = exp[μTt¡K(μ) + lnpT(t;H0)] (8)
which is the same as in (3) except that μ are freeparameters. It is also shown in [13] that families of
pdfs with small statistical curvature enjoy, nearly,
the good statistical properties of the EEF in (8).
Since K(μ) is convex by Holder’s inequality [14], theMLE of μ can be obtained by taking the derivativeof μTt¡K(μ) with respect to μ and setting it to zero.This results in
t¡ @K(μ)@μ
= 0 (9)
which is identical to (7). Assume that K(μ) is strictlyconvex and that the solution is unique. Therefore,
this shows that the MLE produces the same μ asmoment matching does in (7). Hence, we can write
the constructed pdf pT(t) in (3) as
pT(t) = pT(t; μ) = exp[μTt¡K(μ) + lnpT(t;H0)]
(10)
where μ is the MLE of μ. Also, note that sinceK(μ) is convex, finding the MLE becomes a convexoptimization problem, and many existing methods can
be readily utilized [15, 16]. If μ = 0, then the pdf in(8) becomes pT(t;H0) or
pT(t;0) = pT(t;H0): (11)
This shows that pT(t;H0) also belongs to theexponential family in (8).
This constructed pdf in (10) looks similar to the
Edgeworth expansion (see [17, eqs. (2.1) and (2.17)]).
The Edgeworth expansion is an approximation of the
cumulative distribution function (CDF) of the sum of
independent and identically distributed (IID) samples
starting from the CDF of the standard Gaussian
distribution. Note that the pdf in (8) belongs to the
exponential family. Since T is a sufficient statistic for
the exponential pdf in (8), this pdf incorporates all
the information from the two sensors. Since pT(t;H0)is required to construct pT(t; μ), it is assumed thatpT(t;H0) is available or that it can be estimated withreasonable accuracy. Also, note that if T1, T2 arestatistically dependent under H0, they will also bedependent for the constructed pdf under H1.With the small signal assumption, pT(t;H1) is
close to pT(t;H0). Under this constraint it has beenshown in [18] that if t is the score function, i.e.,
t=@ lnpT(t;μ)
@μ
¯μ=0
(12)
and E0(T) = 0, by using a first-order Taylor expansionon the log-likelihood function lnpT(t;μ) aboutμ = 0, we obtain the same pdf as in (8). This smallsignal analysis is similar to the locally optimum
detector (LOD) [19], but there are some fundamental
differences between our method and the LOD.
1) Our method produces a pdf, not only a test
statistic.
2) t need not be a score function for theconstructed pdf to be a valid pdf. If it were, the
constructed pdf in (8) could be interpreted as a
first-order Taylor expansion and normalized to be a
pdf.
Finally, since (10) is the constructed pdf under H1,we decide H1 if
lnpT(t; μ)
pT(t;H0)= μTt¡K(μ)> ¿ (13)
582 IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. 49, NO. 1 JANUARY 2013
where ¿ is a threshold. Note that Kullback also had
similar ideas (see [12, ch. 5]) where μTt¡K(μ) can beconsidered as the estimated KL divergence between
pT(t;μ) and pT(t;H0)).This method can be extended to the classification
problem. Similar to (10), as shown in [20], we can
construct the pdf of T under Hi as
pT(t;μi) = pT(t; μi) = exp[μTi t¡K(μi) + lnpT(t;H0)]
(14)where
K(μi) = lnE0[exp(μTi T)] (15)
is the cumulant generating function of pT(t;H0) thatnormalizes the constructed pdf. pT(t; μi) is considered
as our estimate of pT(t j Hi), where μi is the MLE ofμi. Hence, similar to the MAP rule [1], we decide Hi,for which the following is maximum over i
p(Hi j t)pT(t) = pT(t j Hi)p(Hi) = pT(t; μi)p(Hi):(16)
When we assume that the prior probabilities of each
candidate hypothesis are equal, i.e., p(H1) = ¢ ¢ ¢=p(HM) = 1=M, p(Hi) cancels, and we can equivalentlydecide Hi, for which the following is maximum over i
lnpT(t; μi)
pT(t;H0)= μTi t¡K(μi): (17)
IV. KL DIVERGENCE BETWEEN THE TRUE PDF ANDTHE CONSTRUCTED PDF
The KL divergence is a nonsymmetric measure of
difference between two pdfs. For two pdfs p1 and p0,
it is defined as
D(p1kp0) =Zp1(x) ln
p1(x)
p0(x)dx:
It is well known that D(p1kp0)¸ 0 with equalityif and only if p1 = p0 almost everywhere [12]. As
we mentioned in Section III, the KL divergence
determines the asymptotic performance for both
detection and classification. Next we show that under
H0, pT(t; μ) = pT(t;H0) asymptotically, and similarlyunder H1, within the family of pdfs in (8), pT(t; μ) isasymptotically the closest one to the true pdf in KL
divergence. Similar results and arguments have been
shown in [7], [21].
Assume that we observe IID T(n)s with
T(n) =
"T(n)1
T(n)2
#for n= 1,2, : : : ,L. Shortening the notation we write
pT(1),T(2),:::,T(L) (t(1),t(2), : : : , t(L); μ) as p(t(1), t(2), : : : ,t(L); μ).
The constructed pdf can be easily extended to
(see (10))
p(t(1), t(2), : : : , t(L); μ)
= exp
"μT
LXn=1
t(n)¡LK(μ) + lnp(t(1), t(2), : : : , t(L);H0)
#(18)
where the MLE μ is obtained by solving
1
L
LXn=1
t(n) =@K(μ)
@μ: (19)
Now we consider two cases. First, for the true pdf
under H0, by the law of large numbers, it follows that
1
L
LXn=1
t(n)!PE0(T)
as L!1. Note that@K(μ)
@μ
¯μ=0
= E0(T):
Since the solution of (19) is unique, asymptotically we
have
μ = 0
and hence, p(t(1),t(2), : : : , t(L); μ) =p(t(1),t(2), : : : , t(L);H0).Second, for the true pdf under H1, by the law of
large numbers, it follows that
1
L
LXn=1
t(n)!PE1(T)
as L!1. Therefore, the MLE μ asymptoticallymaximizes
μTE1(T)¡K(μ): (20)
We denote the underlying true pdf under H1as p(t(1),t(2), : : : ,t(L);H1) and the pdf within theexponential family in (8) as p(t(1),t(2), : : : ,t(L);μ).Since, from (18)
lnp(t(1), t(2), : : : , t(L);H1)
p(t(1), t(2), : : : , t(L);μ)
=¡ÃμT
LXn=1
t(n)¡LK(μ)!+ ln
p(t(1), t(2), : : : , t(L);H1)
p(t(1), t(2), : : : , t(L);H0)
the KL divergence between the true pdf and the one in
the exponential family is
D(p(t(1), t(2), : : : , t(L);H1)kp(t(1), t(2), : : : , t(L);μ))
= E1
"¡ÃμT
LXn=1
t(n)¡LK(μ)!+ ln
p(t(1), t(2), : : : , t(L);H1)
p(t(1), t(2), : : : , t(L);H0)
#=¡L[μTE1(T)¡K(μ)]
+D(p(t(1), t(2), : : : , t(L);H1)kp(t(1), t(2), : : : , t(L);H0)): (21)
KAY, ET AL.: SENSOR INTEGRATION BY JOINT PDF CONSTRUCTION USING THE EXPONENTIAL FAMILY 583
Since D(p(t(1),t(2), : : : ,t(L);H1)kp(t(1),t(2), : : : , t(L);H0))is fixed, D(p(t(1), t(2), : : : ,t(L);H1)kp(t(1),t(2), : : : ,t(L);μ))is minimized by maximizing (20). This shows that
given the exponential family in (8), p(t(1),t(2), : : : ,
t(L); μ) is asymptotically the closest to p(t(1), t(2), : : : ,t(L);H1) in KL divergence.V. EXAMPLES-DISTRIBUTED DETECTION
In this section we compare our method with the
clairvoyant GLRT for a specific detection problem.
Since the true pdf does not necessarily belong to the
exponential family in (8), for the clairvoyant GLRT,
assume that we know that the true pdf of T under H1belongs to a family of pdfs parameterized by some
unknown parameters ®. Note that the μ parametersin our method are in the constructed exponential pdf,
while ® are parameters in the true pdf. Therefore, theclairvoyant GLRT provides an upper bound on GLRT
performance, and it decides H1 if
lnpT(t;®)
pT(t;H0)> ¿: (22)
A. Partially Observed Linear Model with GaussianNoise
Suppose we have the linear model with
x=H®+w (23)with
H0 : ®= 0
H1 : ® 6= 0where x is an N £ 1 vector of the underlyingunobservable samples, H is an N £p observationmatrix with full column rank, ® is a p£ 1 vector ofthe unknown signal amplitudes, and w is an N £ 1vector of white Gaussian noise samples with known
variance ¾2. We observe two sensor outputs
T1(x) =HT1x
T2(x) =HT2x
(24)
where H1 is N £ q1 and H2 is N £ q2. Note that[H1 H2] does not have to be H. This model is calledthe partially observed linear model.
Let G= [H1 H2]. We assume that G has full
column rank so that there is no perfectly redundant
measurements of the sensors. Then we have
T=
·T1(x)
T2(x)
¸=
·HT1x
HT2x
¸=GTx: (25)
Thus, T is also Gaussian, and
T»N (0,¾2GTG) under H0:Let q= q1 + q2, and we can see that T is q£1. As aresult we construct the pdf as in (8), with
K(μ) = lnE0[exp(μTT)] = 1
2¾2μTGTGμ: (26)
Hence, the constructed pdf is
pT(t;μ) = exp[μTt¡K(μ) + lnpT(t;H0)]
=1
(2¼¾2)q=2 det1=2(GTG)exp
μ¡ t
T(GTG)¡1t2¾2
¶¢ exp[μTt¡ 1
2¾2μTGTGμ] (27)
which can be simplified to
T»N (¾2GTGμ,¾2GTG) under H1: (28)
Note that μ is the vector of the unknown parametersin the constructed pdf and that it is different from the
truly unknown parameters ®. From (7) and (26) the
MLE of μ satisfies
t=@K(μ)
@μ= ¾2GTGμ:
So
μ =1
¾2(GTG)¡1t
and the test statistic becomes
μTt¡K(μ) = 1
2¾2tT(GTG)¡1t: (29)
Next we consider the clairvoyant GLRT. It is
defined as the GLRT when the true pdf of T under
H1 is known except for the underlying unknownparameters ®. It is considered as the suboptimaltest by plugging the MLE of ® into the true pdfparameterized by ®. Since the constructed pdf maynot be the true pdf, the clairvoyant GLRT requires
more information than our method. From (25) we
know that the true pdf is
T»N (GTH®,¾2GTG) under H1: (30)
Note that (28) is the constructed pdf, while (30) is the
true pdf. For small signal they are equivalent since
their means are close to zero and since they have the
same covariance matrix. This verifies that our method
works for the small signal case. We need to estimate
μ in (28) or ® in (30) to implement the pdf. We writethe true pdf under H1 as pT(t;®). The MLE of ® isfound by maximizing the true pdf given by (30)
lnpT(t;®)
pT(t;0)=¡ 1
2¾2(t¡GTH®)T(GTG)¡1(t¡GTH®)
+1
2¾2tT(GTG)¡1t:
If q· p, i.e., the length of t is less than or equal tothe length of ®, then the MLE ® may not be unique.However, since (t¡GTH®)T(GTG)¡1(t¡GTH®)¸ 0,we can always find ® such that t=GTH®, and hence,(t¡GTH®)T(GTG)¡1(t¡GTH®) = 0. Hence, theclairvoyant GLRT statistic becomes
lnpT(t;®)
pT(t;H0)=
1
2¾2tT(GTG)¡1t (31)
584 IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. 49, NO. 1 JANUARY 2013
which is the same as our test statistic (see (29)) when
q· p.If q > p it can be shown that
®= (HTG(GTG)¡1GTH)¡1HTG(GTG)¡1t
and the clairvoyant GLRT statistic becomes
lnpT(t;®)
pT(t;H0)
=tT(GTG)¡1GTH(HTG(GTG)¡1GTH)¡1HTG(GTG)¡1t
2¾2
(32)
while the constructed GLRT statistic is shown in (29).
B. Partially Observed Linear Model with GaussianMixture Noise
The partially observed linear model remains the
same as in the previous subsection except, instead of
assuming that w is white Gaussian noise, we assume
that w has a Gaussian mixture distribution with two
components, i.e.
w» ¼N (0,¾21I)+ (1¡¼)N (0,¾22I) (33)
where ¼, ¾21, and ¾22 are known (0< ¼ < 1). The
following derivation can be easily extended when
w»PLi=1¼iN (0,¾2i I).
Since w has a Gaussian mixture distribution,
T=GTx is also Gaussian mixture distributed, and
T» ¼N (0,¾21GTG) + (1¡¼)N (0,¾22GTG) under H0:So we have
K(μ) = lnE0[exp(μTt)]
= ln(¼e(1=2)¾21μTGTGμ+(1¡¼)e(1=2)¾22μTGTGμ):
(34)Hence the constructed pdf is
pT(t;μ) = exp[μTt¡K(μ) + lnpT(t;H0)]
=
·¼
(2¼¾21)q=2 det1=2(GTG)
exp
μ¡ t
T(GTG)¡1t2¾21
¶+
1¡¼(2¼¾22)
q=2 det1=2(GTG)exp
μ¡ t
T(GTG)¡1t2¾22
¶¸¢ exp(μTt)=(¼e(1=2)¾21μTGTGμ+(1¡¼)e(1=2)¾22μTGTGμ):
(35)
Although this constructed pdf cannot be further
simplified, we can still find the MLE by solving
t=@K(μ)
@μ
=¼e(1=2)¾
21μTGTGμ ¢¾21GTGμ+(1¡¼)e(1=2)¾
22μTGTGμ ¢¾22GTGμ
¼e(1=2)¾21μTGTGμ+(1¡¼)e(1=2)¾22μTGTGμ
:
(36)
Our test statistic is just
μTt¡K(μ)
= μTt¡ ln(¼e(1=2)¾21 μTGTGμ+(1¡¼)e(1=2)¾22 μTGTGμ)(37)
where μ satisfies (36). Although no analytical solution
of the MLE of μ exists, it can be found using convex
optimization techniques [15, 16]. Moreover, an
analytical solution exists when kμk! 0. To see this
we show that
limkμk!0
@K(μ)
@μ:=(¼¾21G
TGμ+(1¡¼)¾22GTGμ) = 1
(38)
where @K(μ)=@μ is shown in (36) and := means
element-by-element division.
To prove (38) we have
limkμk!0
(¼e(1=2)¾21μTGTGμ+(1¡¼)e(1=2)¾22μTGTGμ) = 1
(39)and
limkμk!0
(¼e(1=2)¾21μTGTGμ ¢¾21GTGμ
+(1¡¼)e(1=2)¾22μTGTGμ ¢¾22GTGμ):=¢ (¼¾21GTGμ+(1¡¼)¾22GTGμ)
= 1 (40)
by L’Hospital’s rule. Dividing (40) by (39) and from
(36), (38) is proved. As a result of (36) and (38), the
MLE of μ satisfies
t= ¼¾21GTGμ+(1¡¼)¾22GTGμ
as kμk! 0 and μ can be easily found as
μ =1
¼¾21 + (1¡¼)¾22(GTG)¡1t: (41)
Since
limkμk!0
K(μ)=( 12¼¾21μ
TGTGμ+ 12(1¡¼)¾22μTGTGμ)
= 1
by using L’Hospital’s rule twice, as kμk! 0, our test
statistic becomes (see (37))
μTt¡ ( 12¼¾21 μ
TGTGμ+ 12(1¡¼)¾22 μTGTGμ)
=1
2(¼¾21 + (1¡¼)¾22)tT(GTG)¡1t: (42)
KAY, ET AL.: SENSOR INTEGRATION BY JOINT PDF CONSTRUCTION USING THE EXPONENTIAL FAMILY 585
TABLE I
Comparison of our Test Statistic and the Clairvoyant GLRT
Our Method Clairvoyant GLRT (q· p) Clairvoyant GLRT (q > p)
Gaussian
Noise
tT(GTG)¡1t tT(GTG)¡1t tT(GTG)¡1GTH(HTG(GTG)¡1GTH)¡1HTG(GTG)¡1t
Uncorrelated
Non-Gaussian
Noise
maxμ[μTt¡ ln(¼e(1=2)¾
21μTGTGμ
+(1¡¼)e(1=2)¾22μTGTGμ
)]
tT(GTG)¡1t ¼
(¾21)q=2
exp
·¡ 12(t¡GTH®)T (G
TG)¡1
¾21
(t¡GTH®)¸
+1¡¼(¾22)q=2
exp
·¡ 12(t¡GTH®)T (G
TG)¡1
¾22
(t¡GTH®)¸
Correlated
Non-Gaussian
Noise
maxμ[μTt¡ ln(¼e(1=2)μTGTC1Gμ+(1¡¼)e(1=2)μTGTC2Gμ)]
¡ lnμ
¼
det1=2(C1)exp
h¡ 12tT(GTC
1G)¡1t
i+
1¡¼det1=2(C
2)exp
h¡ 12tT(GTC
2G)¡1t
i¶ max®
·¼
det1=2(GTC1G)exp
h¡ 12(t¡GTH®)T(GTC
1G)¡1(t¡GTH®)
i+
1¡¼det1=2(GTC
2G)exp
h¡ 12(t¡GTH®)T(GTC
2G)¡1(t¡GTH®)
i¸
To find the clairvoyant GLRT statistic, we know
that under H1 the true pdf ispT(t;®) =
¼
(2¼)q=2 det1=2(¾21GTG)
¢ exp·¡12(t¡GTH®)T (G
TG)¡1
¾21(t¡GTH®)
¸+
1¡¼(2¼)q=2 det1=2(¾22G
TG)
¢ exp·¡12(t¡GTH®)T (G
TG)¡1
¾22(t¡GTH®)
¸:
(43)
Note the difference between (35) and (43) since (35)
is the constructed pdf and (43) is the true pdf. The
MLE of ® is found by maximizing (43) over ®.When q· p the MLE of ® may not be unique but
satisfies t=GTH®. As a result pT(t;®) is a constant,and the clairvoyant GLRT statistic becomes
¡ lnpT(t;0):Note that since pT(t;0) is decreasing as t
T(GTG)¡1tincreases, the clairvoyant GLRT statistic becomes
tT(GTG)¡1t (44)
which is the same as our test statistic (with only a
positive scale factor) as kμk! 0 (see (29)). However,
for large signal, our test statistic in (37) is not
equivalent to the clairvoyant GLRT statistic in (44).
This example shows that our method may not offer
the suboptimal performance as the clairvoyant GLRT
does for large signal.
When q > p it can be shown that
®= (HTG(GTG)¡1GTH)¡1HTG(GTG)¡1t
and the clairvoyant GLRT statistic becomes
¼
(¾21)q=2exp
·¡12(t¡GTH®)T (G
TG)¡1
¾21(t¡GTH®)
¸+1¡¼(¾22)
q=2exp
·¡12(t¡GTH®)T (G
TG)¡1
¾22(t¡GTH®)
¸:
(45)
Note that the noise in (33) is uncorrelated but not
independent. We next consider a general case when
the noise can be correlated with a Gaussian mixture
w» ¼N (0,C1)+ (1¡¼)N (0,C2): (46)
It can be shown that similar to (37), our test statistic is
μTt¡ ln(¼e(1=2)μTGTC1Gμ+(1¡¼)e(1=2)μTGTC2Gμ)(47)
and the clairvoyant GLRT statistic is
¡ lnÃ
¼
det1=2(C1)exp
·¡12tT(GTC1G)
¡1t¸
+1¡¼
det1=2(C2)exp
·¡12tT(GTC2G)
¡1t¸!
(48)when q· p.When q > p the MLE of ® is not in closed form,
and hence we write the clairvoyant GLRT statistic as
max®
"¼
det1=2(GTC1G)exp
·¡12(t¡GTH®)T(GTC1G)¡1(t¡GTH®)
¸
+1¡¼
det1=2(GTC2G)exp
·¡12(t¡GTH®)T(GTC2G)¡1(t¡GTH®)
¸#: (49)
586 IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. 49, NO. 1 JANUARY 2013
C. Summary
We have considered the partially observed linear
model with both Gaussian and non-Gaussian noise.
Table I compares our test statistic with the clairvoyant
GLRT.
1) In Gaussian noise w»N (0,¾2I). The teststatistics are exactly the same for q· p.2) In uncorrelated non-Gaussian noise, w»
¼N (0,¾21I) + (1¡¼)N (0,¾22I). The test statistics arethe same as μ! 0 for q· p.3) In correlated non-Gaussian noise, w» ¼N (0,C1)
+(1¡¼)N (0,C2). Although we cannot show theequivalence between these two test statistics, we see
in Section VII that their performances appear to be the
same.
VI. EXAMPLES-DISTRIBUTED CLASSIFICATION
In this section we compare our method with the
estimated MAP classifier for some classification
problems. The estimated MAP classifier assumes
that the pdf of T under Hi is known except for someunknown underlying parameters ®i. We assumeequal prior probability of the candidate hypothesis,
i.e., p(H1) = ¢ ¢ ¢= p(HM) = 1=M. So, the estimatedMAP classifier reduces to the estimated maximum
likelihood classifier [1], which finds the MLE of ®iand chooses Hi for which the following is maximumover i
pT(t;®i) (50)
where ®i is the MLE of ®i.
A. Linear Model with Known Variance
Consider the following classification model:
Hi : x= Aisi+w (51)
where si is an N £ 1 known signal vector with thesame length as x and depends upon the class, Ai is theunknown signal amplitude, and w is white Gaussian
noise with known variance ¾2. Assume that instead of
observing x, we can only observe the measurementsof two sensors
T1 =HT1x
T2 =HT2x
(52)
where H1 is N £ q1 and H2 is N £ q2. Here q1 and q2are the length for vectors T1 and T2, respectively. We
can write (52) asT=GTx (53)
by letting
T=
·T1
T2
¸and
G= [H1 H2]
where G is N £ (q1 + q2) with q1 + q2 ·N. Weassume that G has full column rank so that there are
no perfectly redundant measurements of the sensors.
Note that G can be any matrix with full column rank.
Let H0 be the reference hypothesis when there isnoise only, i.e.,
H0 : x=w: (54)
Since x is Gaussian under H0, according to (53), T isalso Gaussian, and
T»N (0,¾2GTG)under H0. We construct the pdf under Hi as in (8)with
K(μi) = lnE0[exp(μTi T)] =
12¾2μTi G
TGμi: (55)
Hence the constructed pdf is
pT(t;μi) = exp[μTi t¡K(μi) + lnpT(t;H0)]
=1
(2¼¾2)(q1+q2)=2 det1=2(GTG)
¢ expμ¡ t
T(GTG)¡1t2¾2
¶¢ exp[μTi t¡ 1
2¾2μTi G
TGμi] (56)
which can be simplified as
T»N (¾2GTGμi,¾2GTG) under Hi: (57)
The next step is to find the MLE of μi. Note thatthe MLE of μi is found by maximizing μ
Ti t¡K(μi)
over μi. If this optimization procedure is carried out
without any constraints, then μi would be the samefor all i. Hence we need some implicit constraints in
finding the MLE. Since μi represents the signal underHi, we should have
μi = AiGTsi = EHi (T) (58)
which is the mean of T under Hi. As a result (57) canbe written as
T»N (¾2AiGTGGTsi,¾2GTG) under Hi: (59)
Thus, instead of finding the MLE of μi by maximizing
μTi t¡K(μi) = μTi t¡ 12¾2μTi G
TGμi (60)
with the constraint in (58), we can find the MLE of Aiin (59) (since si is assumed known) and then plug itinto (58). It can be shown that
Ai =sTi Gt
¾2sTi GGTGGTsi
(61)
and
μi =GTsis
Ti Gt
¾2sTi GGTGGTsi
: (62)
Hence by removing the constant factors, the test
statistic of our classifier for Hi is(sTi Gt)
2
(GTsi)TGTG(GTsi)
(63)
according to (17).
KAY, ET AL.: SENSOR INTEGRATION BY JOINT PDF CONSTRUCTION USING THE EXPONENTIAL FAMILY 587
TABLE II
Comparison of our Test Statistic and the Estimated MAP Classifier
Our Method Estimated MAP
Known ¾2 (sTiGt)2
(GTsi)TGTG(GTsi)
(sTiG(GTG)¡1t)2
(GTsi)T(GTG)¡1(GTsi)
Unknown ¾2 tTC¡1hi(hTiC¡1hi)
¡1hTiC¡1t
tT[C¡1¡C¡1hi(hTi C¡1h)¡1i hTi C¡1]ttTC¡1gi(g
TiC¡1gi)
¡1hTC¡1ttT[C¡1¡C¡1gi(gTi C¡1gi)¡1gTi C¡1]t
Where hi =GTGGTsi, gi =G
Tsi and C=GTG.
Next we consider the estimate MAP classifier. In
this case we assume that we know the true pdf except
for Ai
T»N (AiGTsi,¾2GTG) under Hi: (64)
Note that (64) is the true pdf of T under Hi and that(59) is the constructed pdf. It can be shown that the
MLE of Ai in the true pdf under Hi is
Ai =sTi G(G
TG)¡1tsTi G(G
TG)¡1GTsi: (65)
By removing the constant terms, the test statistic of
the estimated MAP classifier for Hi is(sTi G(G
TG)¡1t)2
(GTsi)T(GTG)¡1(GTsi)
(66)
according to (50). Note that (61) and (65) are different
because (61) is the MLE of Ai under the constructed
pdf and because (65) is the MLE of Ai under the true
pdf. Also note that if GTG is a scaled identity matrix,
the test statistics in (63) and (66) are equivalent, and
hence, our method coincides with the estimated MAP
classifier.
B. Linear Model with Unknown Variance
To extend the above example, we consider the
above linear model with unknown noise variance ¾2.
As we show in (59), the constructed pdf is still
T»N (¾2AiGTGGTsi,¾2GTG) under Hi (67)
except that ¾2 is unknown. Letting Bi = ¾2Ai, we have
T»N (BiGTGGTsi,¾2GTG) under Hi: (68)
Instead of finding the MLEs of Ai and ¾2, we can
equivalently find the MLEs of Bi and ¾2. Let hi =
GTGGTsi, and let C=GTG. It can be shown that
Bi = (hTi C
¡1hi)¡1hTi C
¡1t (69)
and
¾2 =1
p1 +p2(t¡hiBi)TC¡1(t¡hiBi): (70)
By removing the constant factors, it can also be
shown that the test statistic is equivalent to
tTC¡1hi(hTi C
¡1hi)¡1hTi C
¡1ttT[C¡1¡C¡1hi(hTi C¡1h)¡1i hTi C¡1]t
: (71)
Next we consider the estimated MAP classifier.
The true pdf is still
T»N (AiGTsi,¾2GTG) under Hi (72)
but with unknown Ai and ¾2. Let gi =G
Tsi and let
C=GTG. Similar to (69), (70), and (71), it can be
shown that
Ai = (gTi C
¡1gi)¡1gTi C
¡1t (73)
¾2 =1
p1 +p2(t¡ giAi)TC¡1(t¡ giAi) (74)
and that the test statistic of the estimated MAP
classifier is
tTC¡1gi(gTi C
¡1gi)¡1hTC¡1t
tT[C¡1¡C¡1gi(gTi C¡1gi)¡1gTi C¡1]t: (75)
Note that if GTG is a scaled identity matrix, since
hi =GTGgi, the test statistics in (71) and (75) are
equivalent. Hence our method is exactly the same
as the estimated MAP classifier if GTG is a scaled
identity matrix.
C. Summary
We have considered a linear model with both
known and unknown noise variance. Table II
compares our test statistic with the estimated MAP
classifier. If GTG is a scaled identity matrix, then our
method and the estimated MAP classifier are identical.
Note that this is the case when all the columns in
G are orthogonal and have the same power, such as
the demodulation of the M-ary orthogonal signals in
communication theory.
VII. SIMULATIONS
A. Distributed Detection
Since our test statistic coincides with the
clairvoyant GLRT under Gaussian noise for q· pas shown in Section V-A, we only compare the
performances under non-Gaussian noise (both
uncorrelated noise as in (33) and correlated noise as
in (46)). Consider the model where
x[n] = A1 +A2rn+A3 cos(2¼f0n+Á) +w[n]
(76)
588 IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. 49, NO. 1 JANUARY 2013
Fig. 2. ROC curves for our method and clairvoyant GLRT with uncorrelated Gaussian mixture noise. (a) False alarm rate between 0
and 1. (b) False alarm rate between 0.0001 and 0.01.
for n= 0,1, : : : ,N ¡ 1, with known damping factorr 2 (0,1) and frequency f0 but unknown amplitudesA1, A2, A3, and phase Á. This is a linear model as in
(23), where
H=
2666641 1 1 0
1 r cos(2¼f0) sin(2¼f0)
......
......
1 rN¡1 cos(2¼f0(N ¡ 1)) sin(2¼f0(N ¡ 1))
377775and ®= [A1 A2 A3 cosÁ ¡A3 sinÁ]T.Let w have an uncorrelated Gaussian mixture
distribution as in (33). For the partially observed
linear model, we observe two sensor outputs as
in (24). We compare the GLRT in (37) with the
clairvoyant GLRT in (44). Note that the MLE
of μ in (37) is found numerically and not by the
asymptotic approximation in (41). In the simulation
we use N = 20, A1 = 2, A2 = 3, A3 = 4, Á= ¼=4,
r = 0:95, f0 = 0:34, ¼ = 0:9, ¾21 = 50, ¾
22 = 500,
and H1 and H2 are the first and third columns in
H, respectively, i.e., H1 = [1 1 : : : 1]T and H2 =
[1 cos(2¼f0) : : :cos(2¼f0(N ¡ 1))]T. Hence, only thedc level is sensed by one sensor, and the in-phase
component of the sinusoid is sensed by the other
sensor. As shown in Fig. 2, the performances are
almost the same (even for very small false alarm rate),
which justifies their equivalence when kμk! 0, as
shown in Section V.
Next, for the same model as in (76), let w have a
correlated Gaussian mixture distribution as in (46).
We compare performances of the GLRT using the
constructed pdf as in (47) and the clairvoyant GLRT
as in (48). We use N = 20, A1 = 3, A2 = 4, A3 = 3,
Á= ¼=7, r = 0:9, f0 = 0:46, ¼ = 0:7, H1 = [1,1, : : : ,1]T,
and H2 = [1,cos(2¼f0), : : : ,cos(2¼f0(N ¡ 1))]T. The
covariance matrices C1, C2 are generated using
C1 =RT1R1, C2 =R
T2R2, where R1, R2 are full rank
N £N matrices. As shown in Fig. 3, the performances
are still very similar, even for small false alarm rate.
B. Distributed Classification
For the model in (51)
Hi : x= Aisi+w
we first consider a case when GTG is approximately
a scaled identity matrix. Let A1 = 0:4, A2 = 1:2,
A3 = 0:9, and
s1(n) = cos(2¼f1n)
s2(n) = cos(2¼f2n)
s3(n) = cos(2¼f3n)
where n= 0,1, : : : ,N ¡ 1 with N = 25, and f1 = 0:14,f2 = 0:34, and f3 = 0:41. Let p(H1) = p(H2) = p(H3) =1=3. Assume that there are two sensors, each with an
observation matrix as follows, respectively
H1 =
"1 cos(2¼f1) ¢ ¢ ¢ cos(2¼f1(N ¡ 1))1 cos(2¼f2) ¢ ¢ ¢ cos(2¼f2(N ¡ 1))
#TH2 = [1 cos(2¼f3) ¢ ¢ ¢ cos(2¼f3(N ¡ 1))]T:
We use (63) and (66) as our test statistics for the
two methods, respectively, when ¾2 is known.
Test statistics in (71) and (75) are used when ¾2 is
unknown. The probabilities of correct classification
are plotted versus ln(1=¾2) in Fig. 4. We see that
our method has the same performance with the
estimated MAP classifier with known or unknown ¾2,
and probabilities of correct classification go to 1 as
¾2! 0.
KAY, ET AL.: SENSOR INTEGRATION BY JOINT PDF CONSTRUCTION USING THE EXPONENTIAL FAMILY 589
Fig. 3. ROC curves for our method and clairvoyant GLRT with correlated Gaussian mixture noise. (a) False alarm rate between 0
and 1. (b) False alarm rate between 0.0001 and 0.01.
Fig. 4. Probability of correct classification for both methods.
Fig. 5. Probability of correct classification for both methods.
Next we consider a case when GTG is not a scaled
identity matrix. Let A1 = 0:5, A2 = 1, A3 = 1, and
s1(n) = cos(2¼f1n) +1
s2(n) = cos(2¼f2n) +0:5
s3(n) = cos(2¼f3n)
where n= 0,1, : : : ,N ¡ 1 with N = 20, and f1 = 0:17,f2 = 0:28, and f3 = 0:45. Let p(H1) = p(H2) = p(H3) =1=3. Assume that there are three sensors (this is an
extension of the two-sensor assumption), each with an
observation matrix as follows, respectively
H1 = [1 1 ¢ ¢ ¢ 1]T
H2 =
·1 cos(2¼f1) ¢ ¢ ¢ cos(2¼f1(N ¡ 1))1 cos(2¼f2) ¢ ¢ ¢ cos(2¼f2(N ¡ 1))
¸TH3 = [1 cos(2¼(f3 +0:02)) ¢ ¢ ¢
cos(2¼(f3 +0:02)(N ¡1))]T:Note that in H3, we set the frequency to f3 +0:02.
This models the case when the knowledge of the
frequency is not accurate. We also see in Fig. 5 that
the performances of both methods are the same with
known or unknown ¾2, and probabilities of correct
classification go to 1 as ¾2! 0.
VIII. CONCLUSIONS
A novel method of constructing the joint pdf of
the measurements from a distributed multiple sensor
systems has been proposed. Only a reference pdf is
needed in the construction. It is shown that moment
matching and MLE are equivalent. The performance
of our method has shown to be as good as the
clairvoyant GLRT and estimated MAP classifier,
respectively, for detection and classification, while less
information is needed for our method.
590 IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. 49, NO. 1 JANUARY 2013
REFERENCES
[1] Kay, S.
Fundamentals of Statistical Signal Processing: Detection
Theory.
Upper Saddle River, NJ: Prentice-Hall, 1998.
[2] Thomopoulos, S., Viswanathan, R., and Bougoulias, D.
Optimal distributed decision fusion.
IEEE Transactions on Aerospace and Electronic Systems,
25 (Sept. 1989), 761—765.
[3] Chair, Z. and Varshney, P.
Optimal data fusion in multiple sensor detection systems.
IEEE Transactions on Aerospace and Electronic Systems,
22 (Jan. 1986), 98—101.
[4] Kittler, J., et al.
On combining classifiers.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 20, 3 (Mar. 1998), 226—239.
[5] Sundaresan, A., Varshney, P., and Rao, N.
Distributed detection of a nuclear radioactive source using
fusion of correlated decisions.
Proceedings of the 2007 10th International Conference on
Information Fusion, Quebec, Canada, July 9—12, 2007, pp.
1—7.
[6] Iyengar, S., Varshney, P., and Damarla, T.
A parametric copula based framework for multimodal
signal processing.
Proceedings of the IEEE International Conference on
Acoustics, Speech, and Signal Processing (ICASSP 2009),
Taipei, Taiwan, Apr. 19—24, 2009, pp. 1893—1896.
[7] Kay, S. and Ding, Q.
Exponentially embedded families for multimodal sensor
processing.
Proceedings of the IEEE International Conference on
Acoustics, Speech, and Signal Processing, (ICASSP 2010),
Dallas, TX, Mar. 14—19, 2010, pp. 3770—3773.
[8] Kay, S.
Fundamentals of Statistical Signal Processing: Estimation
Theory.
Upper Saddle River, NJ: Prentice-Hall, 1993.
[9] Kay, S., Nuttall, A., and Baggenstoss, P.
Multidimensional probability density function
approximations for detection, classification, and model
order selection.
IEEE Transactions on Signal Processing, 49 (Oct. 2001),
2240—2252.
[10] Cover, T. and Thomas, J.
Elements of Information Theory (2nd ed.).
Hoboken, NJ: Wiley, 2006.
[11] Westover, M.
Asymptotic geometry of multiple hypothesis testing.
IEEE Transactions on Information Theory, 54, 7 (July
2008), 3327—3329.
[12] Kullback, S.
Information Theory and Statistics (2nd ed.).
New York: Courier Dover Publications, 1997.
[13] Efron, B.
Defining the curvature of a statistical problem (with
applications to second order efficiency).
The Annals of Statistics, 3, 6 (1975), 1189—1242.
[14] Brown, L.
Fundamentals of Statistical Exponential Families.
Beachwood, OH: Institute of Mathematical Statistics,
1986.
[15] Boyd, S. and Vandenberghe, L.
Convex Optimization.
Cambridge, UK: Cambridge University Press, 2004.
[16] Luenberger, D.
Linear and Nonlinear Programming (2nd ed.).
New York: Springer, 2003.
[17] Hall, P.
The Bootstrap and Edgeworth Expansion.
New York: Springer, 1997.
[18] Kay, S., Ding, Q., and Emge, D.
Joint PDF construction for sensor fusion and distributed
detection.
Proceedings of the 2010 13th International Conference
on Information Fusion (FUSION), Edinburgh, UK, July
26—29, 2010, pp. 1—6.
[19] Kassam, S. A.
Signal Detection in Non-Gaussian Noise.
New York: Springer, 1988.
[20] Kay, S., Ding, Q., and Rangaswamy, M.
Sensor integration for classification.
Proceedings of the 2010 Conference Record of the
Forty-Fourth Asilomar Conference on Signals, Systems and
Computers (ASILOMAR), Pacific Grove, CA, Nov. 7—10,
2010, pp. 1658—1661.
[21] Kay, S.
Exponentially embedded families–New approaches to
model order estimation.
IEEE Transactions on Aerospace and Electronic Systems,
41, 1 (Jan. 2005), 333—345.
KAY, ET AL.: SENSOR INTEGRATION BY JOINT PDF CONSTRUCTION USING THE EXPONENTIAL FAMILY 591
Steven Kay (M’XX–F’XX) was born in Newark, NJ on April 5, 1951. Hereceived his B.E. degree from Stevens Institute of Technology, Hoboken, NJ in
1972, his M.S. degree from Columbia University, NY in 1973, and his Ph.D.
degree from the Georgia Institute of Technology, Atlanta, GA in 1980, all in
electrical engineering.
From 1972 to 1975 he was with Bell Laboratories, Holmdel, NJ, where
he was involved with transmission planning for speech communications and
simulation and subjective testing of speech processing algorithms. From 1975 to
1977 he attended the Georgia Institute of Technology to study communication
theory and digital signal processing. From 1977 to 1980 he was with the
Submarine Signal Division, Portsmouth, RI, where he engaged in research
on autoregressive spectral estimation and the design of sonar systems. He is
presently a Professor of Electrical Engineering at the University of Rhode Island,
Kingston and a consultant to numerous industrial concerns, the Air Force, the
Army, and the Navy. As a leading expert in statistical signal processing, he has
been invited to teach short courses to scientists and engineers at government
laboratories, including NASA and the CIA. His current interests are spectrum
analysis, detection and estimation theory, and statistical signal processing.
Dr. Kay has written numerous journal and conference papers and is a
contributor to several edited books. He is the author of the textbooks Modern
Spectral Estimation (Prentice-Hall, 1988), Fundamentals of Statistical Signal
Processing, Vol. I: Estimation Theory (Prentice-Hall, 1993), Fundamentals of
Statistical Signal Processing, Vol. II: Detection Theory (Prentice-Hall, 1998), and
Intuitive Probability and Random Processes using MATLAB (Springer, 2005). He is
a member of Tau Beta Pi and Sigma Xi. He has been a distinguished lecturer for
the IEEE Signal Processing Society. He has been an associate editor for IEEE
Signal Processing Letters and IEEE Transactions on Signal Processing. He has
received the IEEE Signal Processing Society Education Award “for outstanding
contributions in education and in writing scholarly books and texts: : :” He has
recently been included on a list of the 250 most cited researchers in the world in
engineering.
Quan Ding received his B.S. degree from Shanghai Jiao Tong University,
Shanghai in 2006 and the M.S. and Ph.D. degrees from the University of Rhode
Island, Kingston, RI in 2008 and 2011, respectively, all in electrical engineering.
Since 2011 he has been a postdoctoral fellow in electrical engineering at the
University of Rhode Island. His primary research interests include statistical
signal processing, detection and estimation, sensor fusion, spectral analysis, and
biomedical signal processing.
592 IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. 49, NO. 1 JANUARY 2013
Muralidhar Rangaswamy (S’89–M’93–SM’98–F’06) received his B.E.degree in electronics engineering from Bangalore University, Bangalore, India
in 1985 and M.S. and Ph.D. degrees in electrical engineering from Syracuse
University, Syracuse, NY in 1992.
He is presently employed as the technical advisor for the RF Exploitation
Technology Branch within the Sensors Directorate of the Air Force Research
Laboratory (AFRL). Prior to this he held industrial and academic appointments.
His research interests include radar signal processing, spectrum estimation,
modeling non-Gaussian interference phenomena, and statistical communication
theory.
He has coauthored more than 150 refereed journal and conference record
papers in the areas of his research interests. Additionally, he is a contributor to 5
books and is a coinventor on 3 U.S. patents. He is the technical editor (Associate
Editor-in-Chief) for Radar Systems in the IEEE Transactions on Aerospace and
Electronic Systems (IEEE-TAES). He served as the coeditor-in-chief for the
Digital Signal Processing Journal between 2005 and 2011. He serves on the
senior editorial board of the IEEE Journal of Selected Topics in Signal Processing
(January 2012—December 2014). He was a 2-term elected member of the Sensor
Array and Multichannel Processing Technical Committee (SAM-TC) of the
IEEE Signal Processing Society between January 2005 and December 2010 and
serves as a Member of the Radar Systems Panel (RSP) in the IEEE—AES Society.
He was the general chairman for the 4th IEEE Workshop on Sensor Array and
Multichannel Processing (SAM-2006), Waltham, MA, July 2006. He has served
on the Technical Committee of the IEEE Radar Conference series in a myriad
of roles (track chair, session chair, special session organizer and chair, paper
selection committee member, and tutorial lecturer). He served as the publicity
chair for the First IEEE International Conference on Waveform Diversity and
Design, Edinburgh, UK, November 2004. He presently serves on the conference
subcommittee of the RSP.
Dr. Rangaswamy received the 2004 (Fred Nathanson Memorial) Outstanding
Young Radar Engineer Award from the IEEE AES Society, the 2006
Distinguished Member Award from the IEEE Boston Section, the 2007 IEEE
Region 1 Award, and the 2005 Charles Ryan Basic Research Award from the
Sensors Directorate of AFRL, in addition to more than 40 AFRL scientific
achievement awards.
KAY, ET AL.: SENSOR INTEGRATION BY JOINT PDF CONSTRUCTION USING THE EXPONENTIAL FAMILY 593