


1432 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 16, NO. 9, SEPTEMBER 2019

Remote Sensing Single-Image Superresolution Based on a Deep Compendium Model

J. M. Haut, Student Member, IEEE, M. E. Paoletti, Student Member, IEEE, R. Fernandez-Beltran, J. Plaza, Senior Member, IEEE, A. Plaza, Fellow, IEEE, and Jun Li, Senior Member, IEEE

Abstract— This letter introduces a novel remote sensing single-image superresolution (SR) architecture based on a deep efficient compendium model. The current deep learning-based SR trend is to use deeper networks to improve performance. However, this practice often results in the degradation of visual results. To address this issue, the proposed approach harmonizes several different improvements on the network design to achieve state-of-the-art performance when superresolving remote sensing imagery. On the one hand, the proposal combines residual units and skip connections to extract more informative features on both local and global image areas. On the other hand, it makes use of parallelized 1 × 1 convolutional filters (network in network) to reconstruct the superresolved result while reducing the information loss through the network. Our experiments, conducted using seven different SR methods over the well-known UC Merced remote sensing data set, and two additional GaoFen-2 test images, show that the proposed model is able to provide competitive advantages.

Index Terms— Deep learning (DL), remote sensing, superresolution (SR).

I. INTRODUCTION

IMAGE superresolution (SR) has found a fertile domain in the remote sensing field in order to overcome the optical limitations inherent to airborne and space acquisition instruments. Remote target tracking [1], land-cover mapping [2], and fine-grained image classification [3] are some of the most popular applications in which SR has been shown to provide competitive advantages. In general, single-image SR aims at enhancing the resolution of a given input image from a single view of the target scene, that is, the SR technology pursues to recover spatial details that are not captured by the imaging sensor, which logically implies a high level of visual uncertainty and eventually demands strong image priors to effectively relieve the ill-posed nature of the problem.

Manuscript received October 15, 2018; revised November 18, 2018 and January 15, 2019; accepted February 11, 2019. Date of publication March 5, 2019; date of current version August 27, 2019. This work was supported in part by the Spanish Ministry under Grant FPU14/02012-FPU15/02090, in part by Generalitat Valenciana under Grant APOSTD/2017/007, in part by the Junta de Extremadura under Grant GR15005, and in part by the Ministerio de Economía y Empresa, Spain, under Project TIN2015-63646-C5-5-R. (Corresponding author: Jun Li.)

J. M. Haut, M. E. Paoletti, J. Plaza, and A. Plaza are with the Hyperspectral Computing Laboratory, Department of Technology of Computers and Communications, Escuela Politécnica, University of Extremadura, 10003 Cáceres, Spain (e-mail: [email protected]; [email protected]; [email protected]; [email protected]).

R. Fernandez-Beltran is with the Institute of New Imaging Technologies, University Jaume I, 12071 Castellón de la Plana, Spain (e-mail: [email protected]).

J. Li is with the Guangdong Provincial Key Laboratory of Urbanization and Geosimulation, Center of Integrated Geographic Information Analysis, School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China (e-mail: [email protected]).

Color versions of one or more of the figures in this letter are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/LGRS.2019.2899576

Different kinds of image reconstruction and learning paradigms have been successfully applied to superresolve remotely sensed imagery [4]. From the most traditional signal reconstruction mechanisms to the most recent machine learning algorithms, all the existing SR methodologies have their own advantages and limitations. For instance, Zhang et al. [5] propose a land-cover mapping SR approach that takes advantage of the self-similarity property to learn multiscale patch relations useful to increase the input image resolution. Despite the fact that this and other approaches, such as [6], do not require any external training set, their performance is usually rather unsatisfactory in applications where precision is important. Other authors make use of different machine learning paradigms to define more accurate SR models based on a training procedure using exemplar high-resolution data. For instance, Yang et al. [7] present an SR technique based on the sparse coding approach, which learns a coupled dictionary from an external high-resolution training set in order to project the input low-resolution image at a higher resolution scale. Alternative works propose the use of probabilistic models to learn the mapping between low-resolution and high-resolution image domains [8]. Even though these and other recent models [9], [10] have been shown to obtain good SR performance, they are typically based on low-level visual features, which eventually limits the SR capabilities, especially under the most challenging remote sensing scenarios.

Recently, convolutional neural networks (CNNs) have shown great potential to capture high-level features from optical data. Therefore, this paradigm has become one of the most important technologies to deal with the SR problem, and many CNN-based learning models have been proposed in the literature. For instance, Dong et al. [11] propose a deep learning (DL)-based approach to superresolve low-resolution image patches. In the training stage, the method upscales the low-resolution input images by a bicubic (BC) interpolation, and then, a three-layer CNN is trained to learn a mapping between these interpolated versions and their ground-truth counterparts. Conceptually, each layer represents: 1) patch extraction; 2) nonlinear mapping; and 3) high-resolution generation. The first two layers use a formulation based on the rectified linear unit (ReLU) on the filter responses, i.e., max(0, W ∗ Y + B), where W and B represent the filters and biases, respectively, ∗ is the convolution operator, and Y is the input image. The third layer is a regular convolutional layer with filter responses W ∗ Y + B. Finally, the loss function used to train the parameters W and B of the network is
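The layer formulation above can be sketched with plain NumPy (an illustrative toy, not the authors' implementation; the function names, the single-channel setting, and the "valid" padding are our own simplifications):

```python
import numpy as np

def conv2d(Y, W, B=0.0):
    """'Valid' 2-D convolution (CNN convention) of single-channel image Y with kernel W plus bias B."""
    kh, kw = W.shape
    h, w = Y.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(Y[i:i + kh, j:j + kw] * W) + B
    return out

def relu_layer(Y, W, B):
    """Filter response with ReLU, i.e., max(0, W * Y + B)."""
    return np.maximum(0.0, conv2d(Y, W, B))

def mse_loss(pred, target):
    """Mean square error used to fit the parameters W and B."""
    return np.mean((pred - target) ** 2)
```

The first two layers correspond to `relu_layer`, the third to a bare `conv2d`, and training minimizes `mse_loss` between the network output and the ground-truth high-resolution patch.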

1545-598X © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


HAUT et al.: REMOTE SENSING SINGLE-IMAGE SR BASED ON A DEEP COMPENDIUM MODEL 1433

Fig. 1. Proposed model architecture, composed of the FE-net (whose layers' outputs are concatenated at the end) and the R-net (composed of the inception module, whose concatenated output feeds the upsampler, which is defined by a BC upsampling). The final output is obtained as the ensemble of eight models.

the mean square error. Despite the remarkable performance achieved by this reference work, alternative CNN-based SR approaches have also been proposed in the literature. In particular, a relevant extension can be found in [12], where the authors introduce several improvements based on three main points: 1) defining an end-to-end mapping to avoid the initial interpolation process; 2) reformulating the mapping layer by shrinking the input feature dimension; and 3) using a deeper architecture to achieve a superior restoration quality. Other authors propose different network improvements instead. For instance, Kim et al. [13] define an SR approach considering a deeper architecture and a data augmentation scheme together with residual and multiscale learning. Even though all these methods exhibit satisfactory SR performance, they have not been specifically designed to manage remote sensing imagery, which eventually constrains their effectiveness over this application domain. Unlike general-purpose optical data, remotely sensed images exhibit special complexity because they are typically highly detailed and fully focused multiband shots. As a result, other works propose CNN-based SR methods aimed at dealing with satellite and aerial data: Lei et al. [14] present a DL approach especially designed to superresolve remote sensing data. Specifically, this method defines a multilevel CNN architecture in order to learn multiscale local and global image features when introducing new spatial details. Notwithstanding the remarkable performance achieved by these and other SR approaches [15], [16], the intrinsic complexity of airborne and space optical data, together with the specific particularities and difficulties of the DL domain, still leaves room for improvements focused on the development of new architectures able to obtain performance advantages in the SR task.

Taking into account this scenario, this letter proposes a novel SR approach based on a deep efficient compendium model which integrates different improvements on the network design to effectively superresolve remotely sensed images: 1) residual units; 2) skip connections; and 3) network in network (NIN). On the one hand, the proposed architecture uses a combination of residual units and skip connections to extract more descriptive and informative visual features on both local and global image areas. On the other hand,

TABLE I

KERNEL SIZES AND NUMBER OF FILTERS PER LAYER

parallelized 1 × 1 convolutional filters, also known as NIN [17], are employed to generate the output superresolved image by substantially reducing the dimensions of the previous layers and, hence, the information loss through the network. With this design, the proposed approach pursues to competently superresolve remote sensing data while avoiding undesirable visual artifacts. Our experiments, which include seven different SR methods, the UC Merced data set, and two additional GaoFen-2 test data products, reveal a very competitive performance of our newly proposed approach.
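As a side illustration of why 1 × 1 filters are cheap dimensionality reducers: a 1 × 1 convolution over an H × W × C feature volume is just a per-pixel linear mix of channels, as in this NumPy sketch (shapes and names are our own conventions, not taken from the letter):

```python
import numpy as np

def conv1x1(X, W, b):
    """1 x 1 convolution: a per-pixel linear mixing of channels.

    X: (H, W, C_in) feature volume; W: (C_in, C_out) mixing matrix;
    b: (C_out,) bias. Returns an (H, W, C_out) volume, so choosing
    C_out < C_in shrinks the depth inherited from the previous layers.
    """
    return np.tensordot(X, W, axes=([2], [0])) + b
```

No spatial neighborhood is touched, which is why the depth reduction costs only C_in × C_out multiplications per pixel.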

II. METHODOLOGY

We propose an architecture based on a fully convolutional network [18]. Our model is composed of two parts (Fig. 1): 1) the feature extractor part (FE-net) and 2) the reconstruction part (R-net), which are connected through a “concatenation” layer, where all the feature maps obtained by the FE-net are concatenated through skip connections before being fed to the R-net. The FE-net receives the original image X ∈ R^{n1×n2×n3} as input, which passes through a hierarchical set of 12 convolutional layers, denoted as C(n), in order to extract the corresponding feature maps as a linear transformation between each layer's input and the layer's kernel. The kernel size has been set to 3 × 3 in order to reduce the number of parameters while considering enough spatial information. Also, with the aim of learning the nonlinear relationships between the data, nonlinear activation



TABLE II

QUANTITATIVE ASSESSMENT FOR THE CONSIDERED SR METHODS (IN COLUMNS) USING THREE DIFFERENT SCALING FACTORS (IN ROWS)

functions have been allocated behind each learnable layer: X^{(l+1)} = H(W^{(l)} ∗ X^{(l)} + b^{(l)}), where H(·) is implemented by the parametric ReLU (P-ReLU) [19] in order to deal with the decaying ReLU effect and the vanishing gradient problem. In the end, all the output feature maps are concatenated and sent to the R-net.
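The P-ReLU activation H(·) referred to above fits in a couple of lines (a sketch only; in the network the slope a is a learnable per-channel parameter, here it is simply a constant):

```python
import numpy as np

def prelu(x, a=0.25):
    """Parametric ReLU: identity for positive inputs, slope a for negative ones.

    Unlike plain ReLU, negative inputs keep a nonzero gradient (a != 0),
    which counters the decaying-ReLU and vanishing-gradient issues.
    """
    return np.where(x > 0, x, a * x)
```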

The R-net follows the NIN architecture, implemented by an inception module with two branches (layers A and B(1), with 1 × 1 kernels, and B(2), with a 3 × 3 kernel) in order to reduce the input volume's depth. The obtained output is upsampled by one or more upsampling layers U of 3 × 3 kernels, which increment the spatial dimension by a 2× factor through phase shift reshaping [20]. Then, the upsampled feature maps are processed by a convolutional layer (or reconstruction layer, with a 3 × 3 kernel), which obtains a single map that is combined with a BC-upsampled image X′ ∈ R^{m1×m2×m3}, where m1 = f · n1, m2 = f · n2, and m3 = n3, with f being the scaling factor. Finally, with the aim of stabilizing the performance of the model, a self-ensemble learning step has been included [21], obtaining the final superresolved image Y ∈ R^{m1×m2×m3} as the ensemble of eight models. The model topology is described in Table I.
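The two mechanisms just described, the phase-shift (sub-pixel) upsampling of [20] and the eight-model self-ensemble of [21], can be sketched in NumPy (illustrative versions under our own conventions: channel-last tensors, and the eight ensemble members taken as the usual four rotations with and without a flip; the letter does not spell out these details):

```python
import numpy as np

def pixel_shuffle(X, r):
    """Phase-shift reshaping: (H, W, C*r*r) -> (H*r, W*r, C).

    Each group of r*r channels is rearranged into an r x r spatial
    block, so the spatial dimensions grow by a factor of r.
    """
    h, w, crr = X.shape
    c = crr // (r * r)
    X = X.reshape(h, w, r, r, c)
    X = X.transpose(0, 2, 1, 3, 4)  # interleave the r x r sub-pixel phases
    return X.reshape(h * r, w * r, c)

def self_ensemble8(model, X):
    """Average the model output over 8 geometric transforms and their inverses."""
    acc = None
    for k in range(4):              # 4 rotations
        for flip in (False, True):  # x 2 flips = 8 variants
            t = np.rot90(X, k)
            if flip:
                t = np.fliplr(t)
            y = model(t)
            if flip:                # undo the transform on the output
                y = np.fliplr(y)
            y = np.rot90(y, -k)
            acc = y if acc is None else acc + y
    return acc / 8.0
```

Averaging over geometrically transformed copies smooths transform-dependent artifacts, which is the stabilizing effect the self-ensemble step pursues.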

III. EXPERIMENTS

A. Data Sets and Methods

The remote sensing UC Merced data set has been considered in our experiments, which contains 21 ground-truth classes with 100 RGB images per class and a spatial resolution of 0.3 m/pixel. The data set has been randomly split into two balanced halves for training and test purposes. In addition, 20% of the training data has been reserved for validation. Regarding the experimental protocol, the original 256 × 256 high-resolution images have been initially downsampled using the considered scaling factor and the BC kernel to generate the corresponding low-resolution counterparts. Then, the low-resolution test images have been superresolved using different learning-based SR models after training and validating the methods. It is important to highlight that all the experiments have been carried out following the standard SR procedure for RGB imagery, which is based on the uniform color space transformation YCbCr [4]. In order to assess the SR performance, two reference image quality metrics have been employed: peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) [22]. Our experiments include seven different SR methods available in the literature: the BC interpolation kernel, the original sparse coding approach (SC) [7], and five different DL-SR methods, i.e., SRCNN [11], FSRCNN [12], CNN-7 [14], LGCNet [14], and the proposed approach. An additional data set has also been considered in order to assess the performance of the proposal when transferring the knowledge learned from the UC Merced data set to a different remotely sensed image

TABLE III

CLASS-BASED UC MERCED QUANTITATIVE SR ASSESSMENT CONSIDERING A 3× SCALING FACTOR

collection. The considered external test collection is made of two multispectral data products captured by the GaoFen-2 satellite, where only the RGB channels (3.2 m/pixel) have been selected for the experiments. This collection has been kindly provided by the LGCNet authors for qualitative assessment, since no high-resolution reference data are available.
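Of the two metrics, PSNR is simple enough to state exactly; a minimal NumPy version (assuming 8-bit imagery, hence a peak value of 255, and that it is computed on the luminance channel as the protocol above implies):

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means a closer reconstruction."""
    err = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    if err == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / err)
```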

B. Results and Discussion

Tables II–IV provide a quantitative assessment of the experiments conducted over the UC Merced data set. In particular, Table II presents the average PSNR (decibel) and SSIM results for the considered SR methods when using three different scaling factors, i.e., 2×, 3×, and 4×. Table III shows the average PSNR (decibel) metric per class when considering a 3× upscaling factor. In addition, Table IV contains a summary of the quantitative metric results as well as the corresponding inference times for the considered SR methods. Note that the best metric value in all the tables is highlighted using bold font.

According to the average quantitative results reported in Table II, it is possible to point out several important observations. The first remarkable point is related to the effect of using different scaling factors when superresolving the considered test images. As we can observe, all the considered methods obtain a better metric result with smaller magnification ratios because, logically, there is more available visual information to introduce high-frequency components when considering small factors over the UC Merced data set. Nonetheless, the proposed approach consistently provides the highest metric improvement with respect to the second best method for all the tested scaling ratios and metrics. In the case of the



PSNR metric, the proposed approach outperforms, on average, LGCNet by 0.20 dB, CNN-7 by 0.45 dB, FSRCNN by 0.40 dB, SRCNN by 0.70 dB, SC by 0.95 dB, and the BC baseline by 2.17 dB. Analogously, the average SSIM result of the proposed approach is higher by 0.013 than that of LGCNet, by 0.020 than CNN-7, by 0.019 than FSRCNN, by 0.026 than SRCNN, by 0.030 than SC, and by 0.068 than BC. This initial quantitative comparison reveals that the most recent DL-based SR models, i.e., LGCNet, CNN-7, FSRCNN, and the proposed approach, are the most effective to superresolve the UC Merced remote sensing data for all the considered scaling ratios. However, the proposed approach clearly provides the best average result in terms of the PSNR and SSIM image quality metrics.

When analyzing the quantitative SR results per class, we can also observe a remarkable improvement provided by the proposed approach. Table III shows that the proposed approach obtains the best PSNR result using a 3× scaling factor in 13 UC Merced categories, whereas the second best SR method (LGCNet) achieves the highest value in the remaining eight classes. Despite the fact that LGCNet and other recent DL competitors provide a good SR performance, the proposed approach shows a superior overall result when evaluating the individual class results in more detail. On the one hand, the absolute PSNR class improvement of the proposed approach over LGCNet is 13.06 dB, which indicates the effectiveness of the proposed architecture to superresolve UC Merced remote sensing data. On the other hand, it is also possible to observe that the proposed approach is especially effective in those classes that contain a high level of spatial details, such as dense residential, harbor, or parking, and which therefore require a higher amount of spatial details in the SR process.

In addition to the quantitative evaluation reported in Tables II and III, some visual superresolved results are shown in Figs. 2–5 as a qualitative evaluation of the tested methods. In particular, Figs. 2 and 3 show the SR results for two specific test UC Merced images, airplane and road, considering 3× and 4× scaling factors, respectively. The visual results presented in Figs. 2 and 3 reveal that each particular SR model tends to generate different visual features on the superresolved output. Whereas the BC baseline together with SC and SRCNN seem very sensitive to the upscaling aliasing effect, the most recent DL-based models FSRCNN, CNN-7, LGCNet, and the proposed approach provide a substantially more robust result for UC Merced remote sensing data. That is, the SC and SRCNN models were designed to superresolve images starting from their corresponding interpolated versions, which logically introduce an unavoidable aliasing effect that is eventually superresolved in the final result. The most recent DL-SR models, e.g., FSRCNN, CNN-7, LGCNet, and the proposed approach, work to relieve this effect by using deeper architectures that allow them to recover cleaner high-resolution image patterns. For instance, it is easy to appreciate in Fig. 2 that BC, SC, and SRCNN introduce an important aliasing effect on the airplane wing.

According to the visual result displayed in Fig. 3, it is possible to see that the proposed approach is able to remove a large amount of the noise present in the road details of FSRCNN, CNN-7, and LGCNet. That is, the image lines in the result of the proposed approach are sharper and certainly the most similar to their high-resolution counterparts, which

Fig. 2. UC Merced airplane test image qualitative assessment for a 3× factor.

Fig. 3. UC Merced road test image qualitative assessment for a 4× factor.

eventually leads to the best perceived visual quality. Moreover, the qualitative SR results provided in Figs. 4 and 5 show that the proposed model is able to reduce the ringing artifacts when compared to other methods, which also indicates the higher robustness of the proposed model when transferring the knowledge acquired from the UC Merced data set to the GaoFen-2 collection despite the existing spatial resolution differences. Finally, the computational time results reported in Table IV reveal that the proposed approach is able to achieve a high computational performance level using an NVIDIA GeForce GTX 1080 GPU, whereas the considered competitors make use of a more powerful GPU device, i.e., the NVIDIA Titan Z.

The current trend in DL-SR is to use deeper architectures to improve the resulting performance (see [13], [14]); however, this practice may result in a poor propagation of activations and gradients, which eventually degrades the quality of the convolutional features and, hence, the superresolved result, because output images can be affected by noisy artifacts. To mitigate these issues, the proposed approach makes use of a combination of residual units and skip connections in order to extract more informative features on both local and global image areas. On the other hand, parallelized 1 × 1 convolutional filters are used to reconstruct the resulting superresolved image by means of a network architecture which substantially reduces the dimensions of the previous layers in order to minimize the information loss through the network. The combination of these improvements



Fig. 4. GaoFen-2 airport test image qualitative assessment for a 3× factor.

Fig. 5. GaoFen-2 factory test image qualitative assessment for a 4× factor.

TABLE IV

METRICS SUMMARY FOR THE BEST CONSIDERED SR METHODS

allows the proposed approach to superresolve remote sensing imagery more effectively while achieving state-of-the-art performance.

IV. CONCLUSION

This letter presents a novel remote sensing single-image SR approach based on a deep efficient compendium model. The proposed architecture integrates different improvements on the network design to achieve state-of-the-art performance to superresolve remote sensing data: 1) residual units; 2) skip connections; and 3) NIN. Our experiments, conducted over the remote sensing UC Merced data set and GaoFen-2 test images using seven different SR methods available in the literature, reveal that the presented approach is able to achieve a state-of-the-art SR performance in the remote sensing field. Future work will be aimed at extending the proposed model to deep self-learning architectures and comprehensively analyzing the effect of considering different network modifications within the remote sensing SR domain.

REFERENCES

[1] Y. Xian, Z. I. Petrou, Y. Tian, and W. N. Meier, “Super-resolved fine-scale sea ice motion tracking,” IEEE Trans. Geosci. Remote Sens., vol. 55, no. 10, pp. 5427–5439, Oct. 2017.

[2] Y. Zhang, Y. Zhang, W. Li, Y. Huang, and J. Yang, “Super-resolution surface mapping for scanning radar: Inverse filtering based on the fast iterative adaptive approach,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 1, pp. 127–144, Jan. 2018.

[3] S. Hao, W. Wang, Y. Ye, E. Li, and L. Bruzzone, “A deep network architecture for super-resolution-aided hyperspectral image classification with classwise loss,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 8, pp. 4650–4663, Aug. 2018.

[4] R. Fernandez-Beltran, P. Latorre-Carmona, and F. Pla, “Single-frame super-resolution in remote sensing: A practical overview,” Int. J. Remote Sens., vol. 38, no. 1, pp. 314–354, 2017.

[5] Y. Zhang, F. Ling, X. Li, and Y. Du, “Super-resolution land cover mapping using multiscale self-similarity redundancy,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 11, pp. 5130–5145, Nov. 2015.

[6] J. M. Haut, R. Fernandez-Beltran, M. E. Paoletti, J. Plaza, A. Plaza, and F. Pla, “A new deep generative network for unsupervised remote sensing single-image super-resolution,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 11, pp. 6792–6810, Nov. 2018.

[7] J. Yang, J. Wright, T. S. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE Trans. Image Process., vol. 19, no. 11, pp. 2861–2873, Nov. 2010.

[8] R. Fernandez-Beltran, P. Latorre-Carmona, and F. Pla, “Latent topic-based super-resolution for remote sensing,” Remote Sens. Lett., vol. 8, no. 6, pp. 498–507, 2017.

[9] B. Hou, K. Zhou, and L. Jiao, “Adaptive super-resolution for remote sensing images based on sparse representation with global joint dictionary model,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 4, pp. 2312–2327, Apr. 2018.

[10] X. Lu, Y. Yuan, and P. Yan, “Alternatively constrained dictionary learning for image superresolution,” IEEE Trans. Cybern., vol. 44, no. 3, pp. 366–377, Mar. 2014.

[11] C. Dong, C. C. Loy, K. He, and X. Tang, “Image super-resolution using deep convolutional networks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 2, pp. 295–307, Feb. 2016.

[12] C. Dong, C. C. Loy, and X. Tang, “Accelerating the super-resolution convolutional neural network,” in Proc. Eur. Conf. Comput. Vis. Amsterdam, The Netherlands: Springer, 2016, pp. 391–407.

[13] J. Kim, J. K. Lee, and K. M. Lee, “Accurate image super-resolution using very deep convolutional networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 1646–1654.

[14] S. Lei, Z. Shi, and Z. Zou, “Super-resolution for remote sensing images via local–global combined network,” IEEE Geosci. Remote Sens. Lett., vol. 14, no. 8, pp. 1243–1247, Aug. 2017.

[15] Y. Tai, J. Yang, and X. Liu, “Image super-resolution via deep recursive residual network,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jul. 2017, pp. 2790–2798.

[16] Y. Yuan, X. Zheng, and X. Lu, “Hyperspectral image superresolution by transfer learning,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 5, pp. 1963–1974, May 2017.

[17] M. Lin, Q. Chen, and S. Yan. (2013). “Network in network.” [Online]. Available: https://arxiv.org/abs/1312.4400

[18] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 3431–3440.

[19] K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification,” in Proc. IEEE Int. Conf. Comput. Vis., Dec. 2015, pp. 1026–1034.

[20] W. Shi et al., “Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2016, pp. 1874–1883.

[21] R. Timofte, R. Rothe, and L. Van Gool, “Seven ways to improve example-based single image super resolution,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2016, pp. 1865–1873.

[22] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004.