
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 53, NO. 1, JANUARY 2005

Optimizing the Multiwavelet Shrinkage Denoising

Tai-Chiu Hsung, Member, IEEE, Daniel Pak-Kong Lun, Member, IEEE, and K. C. Ho, Senior Member, IEEE

Abstract—Denoising methods based on wavelet domain thresholding or shrinkage have been found to be effective. Recent studies reveal that multivariate shrinkage on multiwavelet transform coefficients further improves the traditional wavelet methods. This is because the multiwavelet transform, with appropriate initialization, provides a better representation of signals so that their difference from noise can be clearly identified. In this paper, we consider multiwavelet denoising using a multivariate shrinkage function. We first suggest a simple second-order orthogonal prefilter design method for applying multiwavelets of higher multiplicities. We then study the corresponding threshold selection using Stein's unbiased risk estimator (SURE) for each resolution level, provided that we know the noise structure. Simulation results show that higher multiplicity wavelets usually give better denoising results and that the proposed threshold estimator gives a good indication of the optimal thresholds.

Index Terms—Multiwavelet, parameter estimation, prefilter, smoothing methods, wavelet transforms, white noise.

I. INTRODUCTION

SUPPOSE we are going to estimate a signal $x$ from the noisy observation $y$,

$y_i = x_i + n_i$, $i = 1, \ldots, N$  (1)

where $n_i \sim \mathcal{N}(0, \sigma^2)$ is independent and identically distributed (iid) noise. The goal of denoising is to find an estimate $\hat{x}$ that minimizes the mean square error (MSE)

$\mathrm{MSE} = \frac{1}{N}\|\hat{x} - x\|^2$  (2)

subject to the condition that $\hat{x}$ is at least as smooth as $x$. Wavelet thresholding methods [1]–[6] have been proven to be effective in estimating $x$ from $y$ since the wavelet transform represents a signal using basis functions that are localized in time and scale simultaneously. Signal energy tends to cluster into a small number of wavelet coefficients with large amplitudes. This is different from the wavelet coefficients of noise, which tend to scatter in the time–scale space with small amplitudes.

Manuscript received July 30, 2003; revised December 17, 2003. This work was supported by a grant from the Research Grant Council of the Hong Kong Special Administrative Region, China, under Project B-Q706 and a grant from the Hong Kong Polytechnic University under Project A418. The associate editor coordinating the review of this paper and approving it for publication was Prof. Zixiang Xiong.

T.-C. Hsung and D. P.-K. Lun are with the Centre for Multimedia Signal Processing, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Kowloon, Hong Kong (e-mail: [email protected], [email protected]).

K. C. Ho is with the Department of Electrical and Computer Engineering, University of Missouri, Columbia, MO 65201 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TSP.2004.838927

The multiwavelet [7], [8] extends the idea of the wavelet by representing a signal with more than one scaling function. It is found that these scaling functions can be designed to be simultaneously symmetric, orthogonal, and of short support, which cannot be achieved at the same time for wavelet systems using only one scaling function. In [9]–[12], it is found that multiwavelet denoising with multivariate shrinkage gives consistently better results than wavelet shrinkage. The improvement is contributed by the multiwavelet transform as well as the multivariate shrinkage operator, which effectively exploits the statistical information of the transformed coefficient vectors of noise.

In applying the discrete multiwavelet transform to a scalar signal, it is necessary to perform proper initialization to obtain the finest-scale scaling coefficient vectors from the sampled scalar signal. This is usually achieved by using a prefilter [13], [14]. Although an alternative is sometimes adopted by using a "balanced" multiwavelet basis [15], filters with longer length result. Many methods have been suggested for designing prefilters [14], [16]–[20]. They enable the resulting filterbank to possess the desired approximation power and properties such as orthogonality. However, there is no simple method to obtain the prefilter for higher multiplicities. This limits the application of multiwavelets of higher multiplicity, which potentially give better performance compared with the traditional ones due to their better characterization of signals. In this paper, we first suggest a simple method for the design of second-order approximation preserving orthogonal prefilters for any multiplicity, which enables higher multiplicity wavelet applications, such as denoising. Experimental results show that denoising using higher multiplicity wavelets usually gives better performance.

Good selection of the parameter set is critical to the success of multiwavelet shrinkage denoising. Since the multivariate shrinkage function is different from that used in wavelet shrinkage, we cannot directly borrow the risk estimators suggested for scalar wavelet shrinkage for the selection of the thresholding parameters. Furthermore, the components of a transform coefficient vector are mutually dependent, which makes the problem of finding the optimal parameter set more difficult. Recently, risk estimators for multiwavelet denoising have been suggested [21]. However, they can only be used to derive a single threshold for all resolution levels and may not be optimal in the case of nonwhite noise. In this paper, we suggest a risk estimator that allows us to optimally select the parameter set for multivariate shrinkage based on the principle of Stein's unbiased risk estimator (SURE) [22]. The resulting parameters are level-dependent such that they can adapt to the characteristics of the signal in different resolutions. The proposed risk estimator closely resembles the mean square error (MSE), as shown in (2), for



any multiplicity, and hence, the thresholds obtained approach the optimum.

The organization of this paper is as follows. In Section II, we first fix the notation and present a brief review of multiwavelet denoising. In Section III, we present a simple method for the design of second-order approximation preserving orthogonal prefilters. In Section IV, we study the effect of multivariate shrinkage and derive a risk estimator based on SURE. In Section V, we verify our proposed methods by applying them to the denoising of several popular test signals with various settings. Their performances are then analyzed and discussed.

II. MULTIWAVELET DENOISING

A. Multivariate Shrinkage

In [4] and [5], Donoho suggested the method of soft-thresholding (shrinkage). It is defined as follows; a short code sketch of this scalar procedure is given after the three steps.

1) Compute the wavelet transform of the observed signal.

2) Apply the shrinkage function $\eta_t(w) = \operatorname{sgn}(w)\,(|w| - t)_{+}$ coordinatewise to the empirical wavelet transform coefficients with a threshold $t$.

3) Obtain the denoised signal by the inverse wavelet transform.
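As a concrete illustration of this scalar procedure (not the multiwavelet method of this paper), the following minimal sketch uses the PyWavelets package as a stand-in DWT and the universal threshold $\sigma\sqrt{2\log N}$; the wavelet name, decomposition depth, and test signal are arbitrary choices made for the example.

```python
import numpy as np
import pywt  # PyWavelets, used here only as a stand-in scalar DWT implementation

def soft_threshold_denoise(y, wavelet="db4", sigma=1.0, level=4):
    """Scalar wavelet shrinkage: transform, soft-threshold, inverse transform."""
    # 1) Wavelet transform of the observed signal.
    coeffs = pywt.wavedec(y, wavelet, level=level)
    # Universal threshold t = sigma * sqrt(2 log N) (Donoho).
    t = sigma * np.sqrt(2.0 * np.log(len(y)))
    # 2) Soft-shrink the detail coefficients coordinatewise; keep the approximation.
    shrunk = [coeffs[0]] + [pywt.threshold(c, t, mode="soft") for c in coeffs[1:]]
    # 3) Inverse transform gives the denoised signal.
    return pywt.waverec(shrunk, wavelet)

rng = np.random.default_rng(0)
x = np.repeat([0.0, 4.0, -2.0, 3.0], 256)          # piecewise-constant test signal
y = x + rng.normal(0.0, 1.0, size=x.size)          # iid Gaussian noise, sigma = 1
x_hat = soft_threshold_denoise(y, sigma=1.0)[: x.size]
print("MSE:", np.mean((x_hat - x) ** 2))           # mean square error, cf. (2)
```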

For the case of multiwavelet denoising, the procedure is similar to that of wavelet denoising, but the shrinkage function is modified to multivariate shrinkage [11]. The multivariate shrinkage operator for a transform coefficient vector $\mathbf{w}$ is defined as

$\hat{\mathbf{w}} = \dfrac{r - \lambda_j}{r}\,\mathbf{w}$, for $r > \lambda_j$  (3)

$\hat{\mathbf{w}} = \mathbf{0}$, for $r \le \lambda_j$  (4)

where $r = \sqrt{\mathbf{w}^{T} V_j^{-1}\mathbf{w}}$, and $\{\lambda_j\}$ denotes the shrinkage parameters. $\lambda_j$ denotes the threshold for the shrinkage operation performed at a particular resolution level $j$, $V_j$ is the covariance matrix of the noise, and $\mathbf{w}$ is a coefficient vector. Geometrically, all vectors are shrunk toward the origin according to the shape of the noise distribution rather than toward the component axes (as shown in Fig. 1). Orthogonal multiwavelet denoising by multivariate shrinkage leads to better performance over traditional shrinkage for two reasons. First, a higher multiplicity orthogonal wavelet allows a signal to be represented by a linear combination of several mother wavelets with various translations and scales. This enables the application of multivariate statistics [11] to the transform coefficient vectors and provides a very clear picture of the differences between noise and signal at various scales. It is noticed that noise becomes multivariate normally distributed in the transform domain, whereas signals generally do not and are of stronger magnitude. This is a more reliable tool for differentiating signal from noise as compared with using only one mother wavelet. Second, the traditional shrinkage function, which shrinks the wavelet coefficients with respect to a single basis function, cannot achieve the best result in the multivariate case because it does not take into account the mutual behavior of the signal with respect to the multiple basis functions given by the multiwavelets. It can be seen from Fig. 1 that a component of the coefficient vector would be shrunk to zero by traditional shrinkage, but it remains if we use multivariate shrinkage, which is more reasonable because such a vector is unlikely to behave as noise.
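A minimal NumPy sketch of the shrinkage operator in (3) and (4) as reconstructed above, assuming the coefficient vectors of one resolution level are stacked as rows of `W` and that `V` is the (known or estimated) noise covariance at that level.

```python
import numpy as np

def multivariate_shrink(W, V, lam):
    """Shrink each coefficient vector (row of W) toward the origin according to
    its Mahalanobis length r = sqrt(w^T V^{-1} w) w.r.t. the noise covariance V."""
    Vinv = np.linalg.inv(V)
    r = np.sqrt(np.einsum("ij,jk,ik->i", W, Vinv, W))
    # Vectors with r <= lam are set to zero; the rest are scaled by (r - lam)/r.
    scale = np.where(r > lam, (r - lam) / np.maximum(r, 1e-12), 0.0)
    return W * scale[:, None]

# Toy example: multiplicity 2 with a correlated noise covariance.
rng = np.random.default_rng(1)
V = np.array([[1.0, 0.4],
              [0.4, 0.8]])
W = rng.multivariate_normal(np.zeros(2), V, size=8)
print(multivariate_shrink(W, V, lam=1.5))
```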

Selection of the thresholds is critical to the success of wavelet denoising. Donoho suggested the universal threshold $t = \sigma\sqrt{2\log N}$, where $N$ is the signal length, and proved that soft thresholding with this threshold has a minimax optimality property. Similarly, Downie and Silverman derived a multivariate universal threshold [11] for multiwavelet denoising. However, the universal threshold may not be optimal in the sense of (2), because the thresholding method also removes the wavelet coefficients of the signal, in addition to the noise, whenever the magnitude of the observation is smaller than the threshold. This motivates much research on the choice of the thresholding parameter, which balances between removing noise and preserving signal components. In [4] and [6], the principle of Stein's unbiased risk estimator (SURE) is used to find the optimal threshold for the unknown signal. However, there is no SURE estimator for each level of the multiwavelet transform.

B. Covariance Matrices

Let us define the vector filterbank for the $L$-level discrete multiwavelet transform of multiplicity $r$ as $W$ and the corresponding prefilter as $Q$. Then, we can write the discrete multiwavelet transform of a scalar signal $x$ as $WQx$, and the discrete multiwavelet transform of the noisy signal $y$ becomes $WQy = WQx + WQn$, where $n$ is the noise. The matrix $W$ is designed such that the $L$ levels of output transform coefficient vectors are arranged into a single column vector, where each transform coefficient vector at scale $j$ contains $r$ elements. For each scale $j$, we have a different covariance matrix $V_j$ of the transformed noise, where $\sigma^2$ is the noise power, as shown in (1). If we are given the covariance structure of the noise $n$, say $\Sigma$, we can compute the distribution parameters of the transformed noise for scale $j$, where $W_j$ is the equivalent multiwavelet transform matrix at level $j$. One can also estimate $V_j$ by using robust statistics, as suggested in [11].

We know that a multivariate normal distribution is converted into another multivariate normal distribution by a linear matrix transform. If we are given the covariance structure of the multivariate normally distributed noise and the matrix transform, we can compute the distribution parameters of the transformed noise. Since we can treat the multiwavelet transform as a linear transform and express it in matrix form, the equivalent transform matrix $W_j$ for an $L$-level decomposition is given by

(5)

The highpass multiwavelet transform matrix is constructed from its matrix filter coefficients as a block band matrix, and the prefilter matrix is likewise a band diagonal matrix built from the prefilter coefficients, where the prefilter has length $M$ and the decimation factor of the prefiltering stage determines the block size.

The remaining matrices are obtained in a similar way so that the dimensions of the matrix multiplications in (5) are matched. Then, the distribution parameters of the noise coefficient vectors for level $j$ are given by

$\boldsymbol{\mu}_j = W_j\,\boldsymbol{\mu}$  (6)

$V_j = W_j\,\Sigma\,W_j^{T}$  (7)

where $\boldsymbol{\mu}$ and $\Sigma$ are the mean and covariance of the input noise.

In [11] and [12], it is suggested to adopt robust statistics methods to estimate the covariance structure of the transformed noise coefficient vectors. For the simulations, we assume that $\Sigma$ is known so that we can use (7) to compute the covariance matrix $V_j$.
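For illustration, the sketch below propagates a known input noise covariance through one level of the transform written as an explicit matrix, as in (7); the dense matrix `W_j` is a hypothetical stand-in for the equivalent prefilter-plus-filterbank transform at level $j$.

```python
import numpy as np

def transformed_noise_covariance(W_j, Sigma):
    """Covariance of the transformed noise at level j: V_j = W_j Sigma W_j^T, cf. (7)."""
    return W_j @ Sigma @ W_j.T

sigma2 = 0.5
Sigma = sigma2 * np.eye(8)                 # white input noise of power sigma^2
rng = np.random.default_rng(2)
W_j = rng.normal(size=(4, 8))              # stand-in for the level-j transform matrix
V_j = transformed_noise_covariance(W_j, Sigma)

# Monte-Carlo check: the empirical covariance of W_j n should match V_j.
n = rng.normal(0.0, np.sqrt(sigma2), size=(8, 20000))
print(np.abs(np.cov(W_j @ n) - V_j).max())   # small sampling error expected
```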

III. HIGHER MULTIPLICITY PREFILTERING

A. Strang–Fix Conditions

For the design of higher multiplicity prefilters, we follow the approach suggested in [14] and [16]. Let us recall in this section the conditions for a multiscaling function to provide a given order of approximation. Consider the Hilbert space $L^2(\mathbb{R})$ of all square-integrable functions on $\mathbb{R}$. The Fourier transform of a function $f$ is given by $\hat{f}(\omega) = \int_{\mathbb{R}} f(t)\,e^{-i\omega t}\,dt$, and the Fourier transform of a vector of functions is taken elementwise. The multiwavelet can be constructed by selecting matrix filters $H(k)$ and $G(k)$ that satisfy the multiresolution refinement equations

$\boldsymbol{\phi}(t) = \sqrt{2}\,\sum_{k} H(k)\,\boldsymbol{\phi}(2t - k)$  (8)

$\boldsymbol{\psi}(t) = \sqrt{2}\,\sum_{k} G(k)\,\boldsymbol{\phi}(2t - k)$  (9)

where $\boldsymbol{\phi}(t) = [\phi_1(t), \ldots, \phi_r(t)]^{T}$ and $\boldsymbol{\psi}(t) = [\psi_1(t), \ldots, \psi_r(t)]^{T}$ are the multiscaling and multiwavelet functions, respectively. Equations (8) and (9) have the equivalent form in the Fourier transform domain $\hat{\boldsymbol{\phi}}(\omega) = H(\omega/2)\,\hat{\boldsymbol{\phi}}(\omega/2)$ and $\hat{\boldsymbol{\psi}}(\omega) = G(\omega/2)\,\hat{\boldsymbol{\phi}}(\omega/2)$, where $H(\omega)$ and $G(\omega)$ denote the corresponding matrix frequency responses.

From [14] and [16], we consider the expansion of scalar signals using all the multiscaling functions. To relate the expansion coefficients and the samples of the scalar signal, we first construct a superfunction $f(t)$ by a linear combination of the multiscaling functions as follows:

$f(t) = \sum_{k} \mathbf{q}(k)^{T}\,\boldsymbol{\phi}(t - k)$  (10)

where $\{\mathbf{q}(k)\}$ is a finitely supported sequence of vectors. It is desirable that the superfunction have a lowpass property since we can then expand bandlimited signals by its translates, and the coefficients can be approximated from the sampled scalar signal at a sufficiently fine scale. Furthermore, it is shown that the superfunction satisfies the Strang–Fix conditions of order $p$ if there exists $\{\mathbf{q}(k)\}$ such that

$\hat{f}(0) \neq 0$  (11)

$\hat{f}^{(m)}(2\pi k) = 0$, for $k \in \mathbb{Z}\setminus\{0\}$ and $m = 0, \ldots, p-1$  (12)

This implies that the function $f$ has an approximation power of order up to $p$ such that any polynomial of degree less than $p$ is reproduced exactly by a linear combination of its integer translates. To ensure that the superfunction has higher approximation power, we require the prefilters to fulfill (11) and (12) for all such values of $m$. Besides, the combined filters (see Fig. 3) should have both lowpass and highpass properties, with the combined highpass response to the reproduced polynomials equal to the $r$-by-one zero vector. To design

a prefilter wherein the associated superfunction satisfies the Strang–Fix conditions, it should first be noted that if there is a superfunction satisfying the Strang–Fix conditions, i.e., (11) and (12), it is equivalent that there exist a finite vector sequence (which generates the superfunction) and auxiliary vectors such that

(13)

where (13) involves the $m$th derivatives of the prefilter symbol w.r.t. $\omega$, and the auxiliary vectors satisfy the eigen-equations

(14)

(15)

The resulting superfunction given by (10), with the prefilter coefficients given by (13), has an approximation power up to order $p$. This rewrites the Strang–Fix conditions (11)–(15). Hence, to design a prefilter whose associated superfunction satisfies the Strang–Fix conditions, we can first compute the solution of the eigen-equations (14) and (15). Then, we look for a finite vector sequence that satisfies (13).

B. Second-Order Approximation Preserving Orthogonal Prefilter

In [14] and [18], an orthogonal prefilter is suggested for the prefiltering scheme illustrated in Fig. 2, which is in the form [2] (see also the combined filters in Fig. 3)

(16)

where the factors are built from an identity matrix and unit vectors that must satisfy an orthonormality constraint. To make the resulting prefilter second-order approximation preserving, it needs to satisfy the Strang–Fix conditions (13)–(15) for $m = 0$, 1. From (16), differentiating w.r.t. $\omega$, we have

Fig. 2. Prefiltering and postfiltering schemes.

Fig. 3. Combined filters.

Therefore, for $m = 0$ and 1,

(17)

Then, the procedure of finding the prefilter is equivalent to finding the parameters that satisfy (17). In order to make (17) solvable, we need to select a sufficient number of taps such that the associated matrix is nonsingular. Consider the case where it is nonsingular; (17) can then be rewritten in the following form:

(18)

such that

(19)

where (19) is expressed in terms of the null space of that matrix and a parameter vector. Then, we can construct an orthogonal matrix

(20)

from this null space. Applying it to the left- and right-hand sides of (18), we get

(21)

Let us consider the terms on the left-hand side of (21) one by one. Since the constructed matrix is orthogonal, the first term keeps its form. In addition, the remaining matrices are transformed into another set but remain in their original form. For the last term on the right-hand side of (21), recall its construction from (20); it reduces to a diagonal matrix, and we arrive at the following formulation:

(22)


TABLE I. Thresholds and the corresponding square errors derived from the square error function, bSURE, and LSURE when applied to the denoising of "QuadChirps," as shown in Fig. 4, at RNR 5 using the DGHO5 setting.

Suppose that we select a symmetric matrix; then, the transformed matrix is also symmetric, and hence (22) reduces to a diagonal relation

(23)

where the resulting quantities are denoted accordingly. Since the prefilter parameters are unit vectors, the diagonal entries must satisfy a corresponding normalization constraint. Since we have a sufficient number of free parameters, we may choose them conveniently. Then, we can find the value of the matrix and, hence, the prefilter. Let us summarize the procedure for finding a second-order approximation-preserving orthogonal prefilter.

1) Find the vectors that satisfy (14) and (15).
2) Select a symmetric matrix from them, together with the corresponding null space.
3) Compute the matrix using (23), and set all of its diagonal elements to 1.
4) Compute the eigenvalue decomposition of the resulting matrix.
5) Compute the unit vectors from this decomposition.
6) The prefilter parameters are then given by these unit vectors.

The prefilter is then given by (16) with the computed parameters. Let us show a simple example. Consider the wavelet with multiplicity three given by [24, Ex. 3.1]. The vectors satisfying (14) and (15) are found first. We can see that all of their elements are nonzero, and we may simply select a diagonal symmetric matrix. Then, we obtain the orthogonal matrix from (20) and the matrix given by (23), and the solution is obtained by completing steps 4 to 6. For the case where some elements are zero, we need to select another form of symmetric matrix that satisfies (19). We further apply the same method to obtain second-order approximation preserving orthogonal prefilters for multiwavelets of higher multiplicities, i.e., 4 and 5, as shown in Appendixes A and B.

IV. EFFECT OF MULTIVARIATE SHRINKAGE

In this section, we study the effect of applying the multivariate shrinkage function on the observation. This will give us the essential relations needed to estimate the mean square


TABLE II. Square error with optimal thresholds when applied to the denoising of "Doppler." X and R denote the signal length ($2^X$) and the RNR value, respectively.

TABLE III. Square error with optimal thresholds when applied to the denoising of "HypChirps." X and R denote the signal length ($2^X$) and the RNR value, respectively.

TABLE IV. Square error with optimal thresholds when applied to the denoising of "QuadChirps." X and R denote the signal length ($2^X$) and the RNR value, respectively.

error function (2). Let us define the square norm of the shrunk observation as follows:

(24)

The corresponding expectation is

(25)

where the first term involves the difference between the shrunk observations and the shrunk true values, and the sum runs over the total number of coefficient vectors. The noise vector is constructed from random vectors that are multivariate normally distributed. Let us study the last term in detail:

(26)

It is equal to summing the expectations of the corresponding inner products on the different levels. The noise vectors are distributed as multivariate normal with level-dependent covariance. With an abuse of notation, we consider these inner expectations without the index for the multiwavelet coefficients at a particular resolution level. To obtain these terms, we need the following lemma.

Lemma 1: For a multivariate normally distributed noise vector with covariance $V$ and the shrinkage operator in (3) and (4),

(27)

where the right-hand side consists of trace terms involving $V$.

Proof: Let us first recall Stein's identity: for $X \sim \mathcal{N}(\mu, \sigma^2)$ and a weakly differentiable function $g$,

$\mathrm{E}[(X - \mu)\,g(X)] = \sigma^2\,\mathrm{E}[g'(X)]$  (28)

Now, write the observed coefficient vector as the true coefficient vector plus the noise vector. Then, we have

(29)


Fig. 4. Example of the risk estimators bSURE and LSURE, as compared with the square error, when applied to the denoising of "QuadChirps" at RNR 5 using the DGHO5 setting.

where the notation follows (3). From (3), for coefficient vectors above the threshold, taking partial derivatives of each component of the shrunk vector w.r.t. the corresponding observation component yields an expression involving the rows of the inverse covariance matrix. Taking the trace and collecting terms, we finally arrive at the formulation with the two trace terms stated in (27).

This proves the lemma.

It can be shown that if the covariance of the noise is a scaled identity, the last two terms reduce accordingly, and the cross term becomes zero for the case of multiplicity one. Then, the effect of reducing noise by multivariate shrinkage is equivalent to that of traditional shrinkage applied to each component of the coefficient vector independently. By substituting (27) into (25), we


Fig. 5. Performance of the risk estimators bSURE and LSURE over 100 trials of multivariate shrinkage denoising on several test signals. The x-axes are the RNR values. The y-axes show SER $= \log((\mathrm{SE}_{\mathrm{est}} - \mathrm{SE}_{\mathrm{opt}})/\mathrm{SE}_{\mathrm{opt}})$, where $\mathrm{SE}_{\mathrm{est}}$ is the total square error with estimated thresholds and $\mathrm{SE}_{\mathrm{opt}}$ is the total square error with optimal (level-dependent) thresholds. (a) SER of "Blocks." (b) SER of "Bumps." (c) SER of "Cusp." (d) SER of "HypChirps." (e) SER of "Piecewise-Polynomial." (f) SER of "QuadChirps."

obtain the risk function for a particular parameter set. For practical implementation, (25) and (26) can be approximated as in (30). For a particular level $j$, with knowledge of the noise structure, the level-dependent estimator is

LSURE$(\lambda_j)$  (30)

where the right-hand side of (30) is built from the shrunk coefficient vectors and trace terms involving $V_j$, and $N_j$ is the number of coefficient vectors at level $j$. A counting operator gives the number of nonzero coefficient vectors after shrinkage, which approximates the first term of (27). The optimal threshold is then the one that minimizes LSURE at each level.
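The selection step itself is a one-dimensional minimization per resolution level. The sketch below grid-searches a candidate range for each level; `risk_fn` is a placeholder callable standing in for the LSURE expression (30), which is not reproduced here.

```python
import numpy as np

def select_level_thresholds(levels, risk_fn, n_candidates=200):
    """Per-level threshold selection: for each resolution level, pick the
    threshold that minimizes an estimated risk over a grid of candidates.
    `levels` maps a level index to (W, V): the coefficient vectors (rows of W)
    and the noise covariance V at that level.  `risk_fn(W, V, lam)` is a
    stand-in for the LSURE expression (30)."""
    thresholds = {}
    for j, (W, V) in levels.items():
        # Candidate thresholds span the Mahalanobis lengths observed at this level.
        r = np.sqrt(np.einsum("ij,jk,ik->i", W, np.linalg.inv(V), W))
        candidates = np.linspace(0.0, r.max(), n_candidates)
        risks = [risk_fn(W, V, lam) for lam in candidates]
        thresholds[j] = candidates[int(np.argmin(risks))]
    return thresholds

# Usage: thr = select_level_thresholds({1: (W1, V1), 2: (W2, V2)}, lsure_risk)
```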

V. NUMERICAL EXPERIMENTS

In this section, we show the performance of higher multiplicity wavelet denoising and the proposed LSURE. To keep the simulations simple, the prefiltering is nondecimating, whereas the discrete multiwavelet filterbank is decimating. We use linear filtering if the signal does not have energy near the boundaries; otherwise, we symmetrically extend the signal at the boundaries.
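Boundary handling of this kind can be done with a plain symmetric extension, as in the small sketch below; the padding length is an arbitrary choice here and would in practice be set from the filter length.

```python
import numpy as np

def symmetric_extend(x, pad):
    """Symmetrically extend a 1-D signal at both boundaries before filtering."""
    return np.pad(x, pad_width=pad, mode="symmetric")

x = np.array([1.0, 2.0, 3.0, 4.0])
print(symmetric_extend(x, 2))   # [2. 1. 1. 2. 3. 4. 4. 3.]
```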


Fig. 6. Performance of the risk estimators bSURE and LSURE over 100 trials of multivariate shrinkage denoising on several test signals. The x-axes are the RNR values. The y-axes show TR $= \log\big(\sum_b ((\mathrm{Th}_{b,\mathrm{opt}} - \mathrm{Th}_{b,\mathrm{est}})/\mathrm{Th}_{b,\mathrm{opt}})^2\big)$, where $\mathrm{Th}_{b,\mathrm{opt}}$ is the optimal threshold in level $b$ and $\mathrm{Th}_{b,\mathrm{est}}$ is the estimated threshold in level $b$. (a) TR of "Blocks." (b) TR of "Bumps." (c) TR of "Cusp." (d) TR of "HypChirps." (e) TR of "Piecewise-Polynomial." (f) TR of "QuadChirps."

The experiments were carried out on 1-D test signals at several noise levels measured in terms of the root signal-to-noise ratio RNR $= \sqrt{\mathrm{var}(x)}/\sigma$. The test signals include "Bumps," "Blocks," "Cusp," "Piecewise polynomials," "HypChirps," and "QuadChirps," which are used in [4]–[6]. We measure the performance of the different denoising algorithms for each resolution level in terms of the square error SE. The following multiwavelet and prefilter settings are used:

1) wavelet of multiplicity 2 [7], [14] with Xia's orthogonal prefilter [14] (GHMXIA);
2) wavelet of multiplicity 3 ([24, App. 2]) with the proposed orthogonal prefilter (DGHO3);
3) wavelet of multiplicity 4 ([23, Table 1]) with the proposed orthogonal prefilter (DGHO4);
4) wavelet of multiplicity 5 ([23, Table 2]) with the proposed orthogonal prefilter (DGHO5).

The multiwavelet filter coefficients can be found in the references mentioned above. The orthogonal prefilters are derived using the proposed algorithm in Section III-B; see Appendixes A and B for the corresponding orthogonal prefilter parameters of DGHO4 and DGHO5. The covariance matrices can be computed as suggested in (7).
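The noise levels quoted above can be realized as in the following sketch, assuming RNR is the ratio of the signal standard deviation to the noise standard deviation; the chirp-like waveform is only a stand-in for the actual test signals.

```python
import numpy as np

def add_noise_at_rnr(x, rnr, rng=None):
    """Add iid Gaussian noise scaled so that sqrt(var(x)) / sigma equals `rnr`."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = np.sqrt(np.var(x)) / rnr
    return x + rng.normal(0.0, sigma, size=x.shape), sigma

t = np.linspace(0.0, 1.0, 2048)
x = np.sin(2.0 * np.pi * (8.0 * t) ** 2)      # chirp-like stand-in test signal
y, sigma = add_noise_at_rnr(x, rnr=5.0, rng=np.random.default_rng(3))
print(round(float(np.sqrt(np.var(x)) / sigma), 2))   # ~5.0
```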


In the simulations, we test the following risk estimators:

1) the SURE estimator borrowed from wavelet shrinkage, bSURE [22], whose expression involves the shrunk coefficients and trace terms of the noise covariance:

(31)

2) the proposed LSURE, as given in (30).

Thresholds are derived by minimizing the above risk estimators. In the simulations, we also generate a set of results by using the theoretical optimal thresholds for comparison. The optimal thresholds are obtained by exhaustively searching for the thresholds that minimize the mean square error, as shown in (2). Note that in practice, the mean square error can never be obtained since we do not have the original signal. Hence, the optimal thresholds shown here are only for comparison purposes.
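The oracle thresholds used for comparison can be reproduced with a plain exhaustive search, assuming access to the clean signal; `denoise_with_thresholds` is a hypothetical callable that runs the multiwavelet shrinkage pipeline with one threshold per level.

```python
import numpy as np
from itertools import product

def oracle_thresholds(x_clean, y_noisy, denoise_with_thresholds, grids):
    """Exhaustive search for per-level thresholds minimizing the true square error.
    `grids` holds one array of candidate thresholds per resolution level, and
    `denoise_with_thresholds(y, thresholds)` runs the shrinkage pipeline."""
    best_se, best_thr = np.inf, None
    for thr in product(*grids):
        x_hat = denoise_with_thresholds(y_noisy, thr)
        se = float(np.sum((x_hat - x_clean) ** 2))
        if se < best_se:
            best_se, best_thr = se, thr
    return best_thr, best_se
```

The joint search is only feasible on coarse grids; when the error contributions of the levels separate, the same minimization can be carried out one level at a time.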

First of all, let us look at the performance of the different denoising methods using optimal thresholds with different multiwavelet bases. The results are obtained by averaging 100 trials of multiwavelet denoising. Table I shows the thresholds estimated by bSURE and LSURE for each resolution level and the corresponding square errors. The values are compared with the optimal ones. It is seen that the performance given by LSURE is clearly much better than that of bSURE. The signals are contaminated with additive white Gaussian noise at RNR 10, 7, 5, and 2. Tables II–IV show the total square error of the denoised results for "Doppler," "HypChirps," and "QuadChirps," respectively, using optimal thresholds. For these signals, we can see that in most cases, higher multiplicity wavelets give better performance.

For the performance of the proposed risk estimator, we show in Fig. 4 an example for the denoising of "QuadChirps" at RNR 5 using the DGHO5 setting. We can see that the traditional bSURE does not work for higher multiplicity at any resolution level. On the contrary, LSURE closely resembles the mean square error function.

In Figs. 5 and 6, we further show the performance of the proposed risk estimator for different test signals at different noise levels and different multiplicities. The results shown are the average of 100 trials of the multivariate denoising experiment. In Fig. 5, we introduce the measure SER $= \log((\mathrm{SE}_{\mathrm{est}} - \mathrm{SE}_{\mathrm{opt}})/\mathrm{SE}_{\mathrm{opt}})$ for evaluating the denoising performance of the different risk estimators against that achieved by using the optimal thresholds, where $\mathrm{SE}_{\mathrm{est}}$ is the total square error with estimated thresholds and $\mathrm{SE}_{\mathrm{opt}}$ is the total square error with optimal (level-dependent) thresholds. For each diagram in Fig. 5, the x-axis is the RNR value, and the y-axis is the SER. We can see that LSURE consistently gives better performance than bSURE. It is also interesting to note that higher multiplicity usually gives better performance for LSURE. In Fig. 6, we introduce the measure TR $= \log\big(\sum_b ((\mathrm{Th}_{b,\mathrm{opt}} - \mathrm{Th}_{b,\mathrm{est}})/\mathrm{Th}_{b,\mathrm{opt}})^2\big)$ to evaluate the accuracy of the estimated thresholds compared with the optimal ones, where $\mathrm{Th}_{b,\mathrm{opt}}$ is the optimal threshold in level $b$, and $\mathrm{Th}_{b,\mathrm{est}}$ is the estimated threshold in level $b$. For each diagram in Fig. 6, the x-axis is the RNR value, and the y-axis is the TR. We can see that LSURE gives much smaller TR values, as compared with bSURE, for all test signals.
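The two comparison measures can be computed directly from the per-trial results. The sketch below assumes base-10 logarithms and the squared-relative-error form of TR written above; both the log base and the exact aggregation over levels are assumptions here.

```python
import numpy as np

def ser(se_estimated, se_optimal):
    """SER = log((SE_est - SE_opt) / SE_opt); base-10 log assumed."""
    return np.log10((se_estimated - se_optimal) / se_optimal)

def tr(th_optimal, th_estimated):
    """TR = log(sum_b ((Th_opt,b - Th_est,b) / Th_opt,b)^2) over levels b; base-10 log assumed."""
    th_opt = np.asarray(th_optimal, dtype=float)
    th_est = np.asarray(th_estimated, dtype=float)
    return np.log10(np.sum(((th_opt - th_est) / th_opt) ** 2))

print(ser(1.30, 1.00), tr([2.0, 1.5, 1.0], [2.2, 1.4, 1.1]))
```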

VI. CONCLUSION

In this paper, we studied two issues for improving multiwavelet denoising based on multivariate shrinkage. First, we suggested a simple method for designing second-order approximation preserving orthogonal prefilters for any multiplicity. This enables applications using multiwavelets of higher multiplicity. Second, we investigated the risk estimators as applied to multivariate shrinkage. We suggested the new LSURE for finding thresholds that approach the optimum in multiwavelet denoising. Numerical experiments show that, first, multivariate shrinkage of higher multiplicity usually gives better performance, and second, the proposed LSURE substantially outperforms the traditional bSURE in multivariate shrinkage denoising, particularly at high multiplicity.

APPENDIX A

The DGH wavelet of multiplicity 4 [24] with the orthogonal prefilter computed using the proposed method (DGHO4) is shown in the equation at the bottom of the page.


APPENDIX B

The DGH wavelet of multiplicity 5 [23] with the orthogonal prefilter computed using the proposed method (DGHO5) is shown in the equation at the top of the page.

REFERENCES

[1] I. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA: SIAM, 1992.
[2] P. P. Vaidyanathan, Multirate Systems and Filter Banks. Englewood Cliffs, NJ: Prentice-Hall, 1993.
[3] S. Mallat and W. L. Hwang, "Singularity detection and processing with wavelets," IEEE Trans. Inf. Theory, vol. 38, pp. 617–643, Mar. 1992.
[4] D. L. Donoho and I. M. Johnstone, "Ideal spatial adaptation via wavelet shrinkage," Biometrika, vol. 81, pp. 425–455, 1994.
[5] D. L. Donoho, "De-noising by soft-thresholding," IEEE Trans. Inf. Theory, vol. 41, pp. 613–627, May 1995.
[6] D. Donoho, "Adapting to unknown smoothness via wavelet shrinkage," J. Amer. Statist. Assoc., vol. 90, 1995.
[7] J. S. Geronimo, D. P. Hardin, and P. R. Massopust, "Fractal functions and wavelet expansions based on several scaling functions," J. Approximation Theory, vol. 78, pp. 373–401, 1994.
[8] C. K. Chui and J.-A. Lian, "A study of orthonormal multi-wavelets," Appl. Numer. Math., vol. 20, pp. 273–298, 1996.
[9] G. Strang, "Short wavelets and matrix dilation equations," IEEE Trans. Signal Process., vol. 43, pp. 108–115, Jan. 1995.
[10] V. Strela, P. N. Heller, G. Strang, P. Topiwala, and C. Heil, "The application of multiwavelet filterbanks to image processing," IEEE Trans. Image Process., vol. 8, pp. 548–563, Apr. 1999.
[11] T. R. Downie and B. W. Silverman, "The discrete multiple wavelet transform and thresholding methods," IEEE Trans. Signal Process., vol. 46, pp. 2558–2561, Sep. 1998.
[12] T. D. Bui and G. Chen, "Translation-invariant denoising using multiwavelet," IEEE Trans. Signal Process., vol. 46, pp. 3414–3420, Dec. 1998.
[13] X.-G. Xia, J. S. Geronimo, D. P. Hardin, and B. W. Suter, "Design of prefilters for discrete multiwavelet transforms," IEEE Trans. Signal Process., vol. 44, pp. 25–35, Jan. 1996.
[14] X.-G. Xia, "A new prefilter design for discrete multiwavelet transforms," IEEE Trans. Signal Process., vol. 46, pp. 1558–1570, Jun. 1998.
[15] J. Lebrun and M. Vetterli, "High-order balanced multiwavelets: Theory, factorization, and design," IEEE Trans. Signal Process., vol. 49, pp. 1918–1930, Sep. 2001.
[16] G. Plonka, "Approximation properties of multiscaling functions: A Fourier approach," Rostocker Mathematische Kolloquium, vol. 49, pp. 115–126, 1995.
[17] ——, "Approximation order provided by refinable function vectors," Constructive Approx., vol. 13, pp. 221–244, 1997.
[18] Y. Xinxing, J. Licheng, and Z. Jiankang, "Design of orthogonal prefilter with the Strang–Fix condition," Electron. Lett., vol. 35, no. 2, pp. 117–119, Jan. 1999.
[19] D. P. Hardin and D. W. Roach, "Multiwavelet prefilters–I: Orthogonal prefilters preserving approximation order p ≤ 2," IEEE Trans. Circuits Syst. II, vol. 45, no. 8, pp. 1106–1112, Aug. 1998.
[20] K. Attakitmongcol, D. P. Hardin, and D. M. Wilkes, "Multiwavelet prefilters—Part II: Optimal orthogonal prefilters," IEEE Trans. Image Process., vol. 10, pp. 1476–1487, Oct. 2001.
[21] E. Bala and A. Ertuzun, "Applications of multiwavelet techniques to image denoising," in Proc. IEEE Int. Conf. Image Process., vol. 3, New York, Sep. 22–25, 2002, pp. 581–584.
[22] M. Jansen, M. Malfait, and A. Bultheel, "Generalized cross validation for wavelet thresholding," Signal Process., vol. 56, no. 1, pp. 33–44, Jan. 1997.
[23] G. C. Donovan, J. S. Geronimo, and D. P. Hardin, "Orthogonal polynomials and the construction of piecewise polynomial smooth wavelets," SIAM J. Math. Anal., vol. 30, no. 5, pp. 1029–1056, 1998.
[24] ——, "Intertwining multiresolution analysis and the construction of piecewise polynomial wavelets," SIAM J. Math. Anal., vol. 27, no. 6, pp. 1791–1815, 1996.

Tai-Chiu Hsung (M'93) received the B.Eng. (Hons.) and Ph.D. degrees in electronic and information engineering from the Hong Kong Polytechnic University, Hong Kong, in 1993 and 1998, respectively.

In 1999, he joined the Hong Kong Polytechnic University as a Research Fellow. His research interests include wavelet theory and applications, tomography, and fast algorithms.

Dr. Hsung is a member of the IEE.


Daniel Pak-Kong Lun (M'91) received the B.Sc. (Hons.) degree from the University of Essex, Essex, U.K., and the Ph.D. degree from the Hong Kong Polytechnic University (formerly called Hong Kong Polytechnic), Hong Kong, in 1988 and 1991, respectively.

He is now an Associate Professor and the Associate Head of the Department of Electronic and Information Engineering, Hong Kong Polytechnic University. His research interests include digital signal processing, wavelets, multimedia technology, and Internet technology.

Dr. Lun participates actively in professional activities. He was the Secretary, Treasurer, Vice-Chairman, and Chairman of the IEEE Hong Kong Chapter of Signal Processing in 1994, 1995–1996, 1997–1998, and 1999–2000, respectively. He was the Finance Chair of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, which was held in Hong Kong in April 2003. He is a Chartered Engineer and a corporate member of the IEE.

K. C. Ho (S'89–M'91–SM'00) was born in Hong Kong. He received the B.Sc. degree with First Class Honors in electronics and the Ph.D. degree in electronic engineering from the Chinese University of Hong Kong in 1988 and 1991, respectively.

He was a Research Associate with the Department of Electrical and Computer Engineering, Royal Military College of Canada, Ottawa, ON, Canada, from 1991 to 1994. He joined Bell-Northern Research, Montreal, QC, Canada, in 1995 as a member of scientific staff. He was a faculty member with the Department of Electrical Engineering, University of Saskatchewan, Saskatoon, SK, Canada, from September 1996 to August 1997. Since September 1997, he has been with the University of Missouri-Columbia, where he is currently an Associate Professor with the Electrical and Computer Engineering Department. He is also an Adjunct Associate Professor at the Royal Military College of Canada. His research interests are in source localization, wavelet transform, wireless communications, subsurface object detection, statistical signal processing, and the development of efficient adaptive signal processing algorithms for various applications, including landmine detection, echo cancellation, and time delay estimation. He has been active in the development of the ITU Standard Recommendation G.168: Digital Network Echo Cancellers since 1995. He is the editor of the ITU Standard Recommendation G.168. He has three patents from the United States in the area of telecommunications.

Dr. Ho is an Associate Editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING. He was the recipient of the Croucher Foundation Studentship from 1988 to 1991, and he received the Junior Faculty Research Award from the College of Engineering of the University of Missouri-Columbia in 2003.