
VISUALIZATION OF DEEP TRANSFER LEARNING IN SAR IMAGERY

Abu Md Niamul Taufique, Navya Nagananda, and Andreas Savakis

Rochester Institute of Technology

ABSTRACT

Synthetic Aperture Radar (SAR) imagery has diverse applications in land and marine surveillance. Unlike electro-optical (EO) systems, these systems are not affected by weather conditions and can be used day and night. With the growing importance of SAR imagery, it would be desirable if models trained on widely available EO datasets could also be used for SAR images. In this work, we consider transfer learning to leverage deep features from a network trained on an EO ships dataset and generate predictions on SAR imagery. Furthermore, by exploring the network activations in the form of class-activation maps (CAMs), we visualize the transfer learning process to SAR imagery and gain insight into how a deep network interprets a new modality.

Index Terms— SAR, Transfer Learning, Explainability, Class Activation Maps.

1. INTRODUCTION

Remote sensing imagery has been growing in volume and significance in recent years. The deployment of commercial satellites, by Planet Labs and Digital Globe, is making satellite imagery accessible at an unprecedented scale, which requires automated methods for analysis and interpretation. Deep learning techniques have been successfully used for a variety of remote sensing scenarios. However, Synthetic Aperture Radar (SAR) images are less accessible and harder to label than standard Electro-Optical (EO) data. Fig. 1 shows examples of EO and SAR imagery. These examples illustrate the visual differences between the two modalities, and the challenge of using networks pretrained on EO for SAR image classification. Due to these limitations, training deep networks with SAR data is much more challenging.

In this paper, we utilize EO data for pretraining a deep network and then perform transfer learning with SAR data to the new domain. We then examine the network activations, before and after transfer learning from EO to SAR, to gain insights into the transfer learning process across modalities.

This research was supported in part by the Air Force Office of Scientific Research (AFOSR) under Dynamic Data Driven Applications Systems (DDDAS) grant FA9550-18-1-0121.

Fig. 1. Examples from the EO ships and SAR ships datasets. The first and second rows show positive and negative samples of EO ships, respectively. The third and fourth rows show positive and negative samples of SAR ships, respectively.

2. RELATED WORK

The objective of transfer learning is to transfer knowledge from a source domain, where it is easy to obtain labeled data for training, to a target domain where data is scarce [1]. Pretraining in the source domain helps generate robust features in the network and reduces the amount of labeled data required for training our model [2]. In our scenario, the source domain contains EO data, which is labeled and readily available.

Explainability is important to interpret how a deep network makes decisions. To examine how a deep network interprets a scene, we use Class Activation Mapping (CAM) techniques that visualize the network's internal representation. Aerial CAM [3] is a method used to obtain the CAM, to understand the network's perception and to identify salient regions of the scene. It makes use of the Global Average Pooling (GAP) layer to obtain the CAM. The GAP layer forms a weighted sum of the feature maps from the last convolutional layer, which provides the class activation heat map for each predicted class [3]; these maps are thresholded to produce a mask that displays the highest activating region of the image.

arXiv:2103.11061v1 [cs.CV] 20 Mar 2021

Fig. 2. Network architecture used for training on EO data and transfer learning.

Gradient-weighted Class Activation Mapping (Grad-CAM) [4] is a more general technique based on the gradient. Grad-CAM makes use of the target label information available at the final convolutional layer to produce a coarse localization map that highlights the salient regions of the image used to produce that particular target label. Grad-CAM is applicable to a wide variety of CNNs and does not require the GAP layer. In this work, we employ Grad-CAM to visualize the salient regions before and after transfer learning from the EO to the SAR domain.
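The Grad-CAM computation itself reduces to a few array operations once the last conv layer's activations and the gradients of the target class score are in hand (typically extracted via framework hooks). The NumPy sketch below illustrates the formula; it is our minimal rendering, not the authors' implementation:

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heat map from the final convolutional layer.

    feature_maps: (K, H, W) activations A_k of the last conv layer.
    gradients:    (K, H, W) d(class score)/dA_k for the target class.
    """
    # Channel weights: spatially averaged gradients for each feature map.
    alphas = gradients.mean(axis=(1, 2))                   # shape (K,)
    # Weighted combination of feature maps, then ReLU to keep only
    # features with a positive influence on the target class score.
    cam = np.maximum(np.tensordot(alphas, feature_maps, axes=1), 0.0)
    if cam.max() > 0:
        cam /= cam.max()                                   # normalize for display
    return cam                                             # shape (H, W)
```

Upsampling the resulting (H, W) map to the input resolution gives overlay heat maps of the kind shown in Figs. 4 and 5.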

3. METHODOLOGY

We designed a small convolutional network [5], suitable for the size of the SAR dataset considered, to study transfer learning from EO to SAR. The network architecture is shown in Fig. 2. The network consists of three convolutional layers and a fully connected layer. A max pooling layer and ReLU activation are used between Conv1 and Conv2. In between the Conv2 and Conv3 layers, dropout, max pooling, and ReLU activation are applied. A ReLU activation layer is applied after the Conv3 layer, and a global average pooling layer is used before the fully connected layer.
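The layer ordering above can be traced with the standard output-size formula. The input size, kernel sizes, strides, and padding below are placeholders (the paper does not specify them); only the sequence of operations follows the text:

```python
def layer_out(size, kernel, stride=1, padding=0):
    # Output spatial size of a square convolution or pooling window.
    return (size + 2 * padding - kernel) // stride + 1

def trace_architecture(size=64):
    """Trace spatial size through the paper's layer order (assumed params)."""
    size = layer_out(size, kernel=3, padding=1)   # Conv1, then ReLU
    size = layer_out(size, kernel=2, stride=2)    # max pool
    size = layer_out(size, kernel=3, padding=1)   # Conv2 (dropout before pool)
    size = layer_out(size, kernel=2, stride=2)    # max pool, ReLU
    size = layer_out(size, kernel=3, padding=1)   # Conv3, then ReLU
    return size  # global average pooling then collapses HxW before the FC layer
```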

3.1. Datasets and Experiments

In our experiments, we used the dataset "EO Ships in Satellite Imagery" [6] and the SAR ships dataset [7]. The images provided in the EO ships dataset are taken using Planet satellites over the San Francisco Bay and San Pedro Bay areas of California. The dataset contains 1000 chips with ships and 3000 chips without ships. Example chips are shown in Fig. 1.

The SAR ships data contains Sentinel-1 Extra Wide swath images and RADARSAT-2 ScanSAR images collected in 2014 and 2015 in the South African Exclusive Economic Zone. The SAR dataset contains 1596 positive samples containing a ship and 7980 negative samples that do not contain a ship. Among the positive samples, we randomly selected 80% for training and 20% for testing. Among the negative samples, we randomly selected 20% for training and 80% for testing, to keep the balance between positive and negative samples in the training set.

Table 1. Attributes in the SAR Ships test samples.

  Attribute        Category   Ships   No ships
  Sensor           GRDH          95       1756
                   GRDM         207       4116
                   SCNA          18        512
  Polarization     HH            46       1124
                   HV            35        554
                   VV           106       2699
                   VH           133       2007
  Incidence Angle  Small         41       1735
                   Medium       102       2269
                   Large        177       2380
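As a sanity check, this split yields a roughly balanced training set and matches the test-set totals in Table 1 (95 + 207 + 18 = 320 ships, 1756 + 4116 + 512 = 6384 non-ships). A minimal sketch of the split (the function and seed are illustrative, not from the paper):

```python
import random

def balanced_split(n_pos=1596, n_neg=7980, seed=0):
    """Split sample indices as described in the paper: 80% of positives
    and 20% of negatives go to training, the remainder to testing."""
    rng = random.Random(seed)
    pos = list(range(n_pos))
    neg = list(range(n_pos, n_pos + n_neg))
    rng.shuffle(pos)
    rng.shuffle(neg)
    n_pos_tr = int(0.8 * n_pos)   # 1276 positive training chips
    n_neg_tr = int(0.2 * n_neg)   # 1596 negative training chips
    train = pos[:n_pos_tr] + neg[:n_neg_tr]
    test = pos[n_pos_tr:] + neg[n_neg_tr:]
    return train, test
```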

Various attributes are annotated in the dataset, which describe unique conditions for the SAR image acquisition. We categorized the dataset based on the sensor resolution, polarization, and incidence angle of the radar source signal. We only use these attributes for categorizing the test samples, as outlined in Table 1. Sentinel-1 has two types of data available, which are Ground Range Detected (GRD) imagery with High (GRDH) and Medium (GRDM) resolutions. The resolution of GRDH is 50x50 m² with 25x25 m² pixel spacing, and the resolution of GRDM is 93x97 m² with 40x40 m² pixel spacing. The RADARSAT-2 satellite has three ScanSAR Narrow (SCNA) images available, with 81x30 m² resolution and 25x25 m² pixel spacing. There are four types of polarization state possible based on the electromagnetic wave of the transmitted signal, namely HH, HV, VV, and VH polarization. A total of 43 Sentinel-1 images are available among the SDH (HH+HV), SDV (VV+VH), and SSV (VV) Extra Wide (EW) polarization acquisition modes. No GRDM HH, HV and SCNA HV, VV, or VH images are present in the dataset. Another important attribute for the SAR dataset acquisition is the incidence angle, which is in the range of 19.0°-47.0° and 20.0°-39.0° for the Sentinel-1 and RADARSAT-2 sensors, respectively. We categorized the dataset into three categories based on the incidence angle, i.e., small, medium, and large incidence angles, with the ranges of (19.0°, 25.0°], (25.0°, 35.0°], and (35.0°, 47.0°] degrees, respectively.
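The three incidence-angle categories can be expressed directly as half-open interval checks (the function name is ours):

```python
def incidence_category(angle_deg):
    """Bin an incidence angle in degrees into the paper's categories:
    small (19.0, 25.0], medium (25.0, 35.0], large (35.0, 47.0]."""
    if 19.0 < angle_deg <= 25.0:
        return "small"
    if 25.0 < angle_deg <= 35.0:
        return "medium"
    if 35.0 < angle_deg <= 47.0:
        return "large"
    raise ValueError(f"{angle_deg} is outside the sensors' 19.0-47.0 degree range")
```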

In our experiments, we initially trained our network on the EO dataset with the Adam optimizer [8] and cross-entropy loss. The network was trained with a learning rate of 0.0001 for 100 epochs. This model was then used to test the SAR dataset before and after transfer learning. For transfer learning, we initialize the model weights with those obtained from training the model with EO data and finetune all the layers except the first convolutional layer. The network is finetuned using the Adam optimizer with a 0.0001 learning rate and categorical cross-entropy loss.

Table 2. Performance comparison between the EO trained model on SAR before and after performing Transfer Learning (TL).

                         Sensor               Polarization              Incidence Angle
  Training   Class       GRDH  GRDM  SCNA    HH    HV    VV    VH      Small Medium Large   Overall
  Only EO    Ship        0.34  0.19  0.83    0.74  0.06  0.47  0.00    0.15  0.22   0.33    0.27
             No ship     0.75  0.71  0.62    0.62  0.93  0.54  0.95    0.57  0.73   0.81    0.72
             Overall     0.55  0.45  0.73    0.68  0.49  0.51  0.48    0.36  0.47   0.57    0.49
  TL to SAR  Ship        0.89  0.92  0.83    0.76  0.94  0.93  0.93    0.90  0.91   0.91    0.91
             No ship     0.99  0.94  0.96    0.98  0.97  0.97  0.91    0.96  0.95   0.94    0.95
             Overall     0.94  0.93  0.90    0.87  0.96  0.95  0.92    0.93  0.93   0.93    0.93

Fig. 3. Ship samples showing significant differences in ship size and brightness among the SCNA, HH, VH, and HV attributes, shown from top to bottom rows, respectively.
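The freezing strategy (fine-tune everything except the first convolutional layer) can be sketched framework-free as a parameter update that skips frozen entries. This is an illustrative simplification (a plain gradient step rather than Adam), and the parameter names are ours:

```python
def finetune_step(params, grads, lr=1e-4, frozen_prefixes=("conv1",)):
    """One update over a dict of named parameters, leaving frozen layers
    untouched, mirroring fine-tuning all layers but the first conv layer."""
    updated = {}
    for name, value in params.items():
        if name.startswith(frozen_prefixes):
            updated[name] = value                 # frozen: keep EO-pretrained weight
        else:
            updated[name] = value - lr * grads[name]
    return updated
```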

4. RESULTS

The normalized mean per class accuracy results are presented in Table 2. Overall, EO-only training without transfer learning does not achieve good performance, given that the EO dataset and the SAR dataset have significantly different distributions. The classification results for the RADARSAT-2 SCNA, HH polarized, VH polarized, and HV polarized image chips are quite different. Inspection of ship samples from these categories revealed that most of the SCNA and HH polarized ships are bright and easily distinguishable from the background, while most of the VH and HV polarized ships are small in size. Due to this difference, SCNA and HH polarized ships are better recognized by the network with just the EO training, but no VH polarized ships and few HV polarized ships are recognized. Overall, transfer learning to the SAR domain significantly improved the discriminative capability of the network.
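The metric in Table 2 can be read as per-class accuracy (the Ship and No ship rows) and their unweighted mean (the Overall row). A small sketch of that computation (our helper, not the authors' code):

```python
import numpy as np

def per_class_accuracy(y_true, y_pred):
    """Return each class's accuracy and their unweighted mean, which is
    robust to the ship / no-ship imbalance in the test set."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    accs = {int(c): float((y_pred[y_true == c] == c).mean())
            for c in np.unique(y_true)}
    return accs, float(np.mean(list(accs.values())))
```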

To illustrate the transfer learning process, we obtained Grad-CAM visualizations of our network trained on EO and tested on SAR imagery with transfer learning. The CAMs for only those ships which were correctly classified in the EO and SAR domains are shown in Fig. 4. From these maps, we can see that the network starts to focus more on the location of the actual ship after transfer learning.

For the CAM heatmaps generated for correct classification before transfer learning, shown in the middle column of Fig. 4, the peaks are less pronounced and the general region around the object is selected as the salient region for classification. After transfer learning, in the heatmaps shown in the right column of Fig. 4, the salient region is more pronounced and centered on the target, with lesser emphasis on the surrounding region.

From the heat maps in the middle column of Fig. 5, generated using the network trained on EO before transfer learning, the regions selected for classification are not on the ship but rather on surrounding spurious areas, resulting in the wrong classification. After transfer learning, however, the salient region is shifted towards the object, as shown in the right column of Fig. 5, thus providing the correct classification.

5. CONCLUSIONS

In this work, we demonstrated the effectiveness of deep transfer learning from EO imagery to SAR imagery using the Grad-CAM network visualization technique. We categorized the test dataset based on significant attributes annotated in the dataset, such as the type of sensor resolution, polarization, and incidence angle used for image acquisition, to further understand the performance variation based on various attributes of the dataset.

The classification results showed interesting findings, where for RADARSAT-2 and HH polarized imagery, ships are well recognized with only EO training. It is also interesting that no ships were detected in VH polarized imagery and only a few ships were detected in HV polarized imagery with only EO training, but performance improved significantly with transfer learning. We attribute this behaviour to the fact that in the RADARSAT-2 imagery, ships are bright and well distinguishable from the background, whereas the VH polarized ships are tiny in size. In general, transfer learning improved the classification performance significantly. The further demonstration using Grad-CAM visualization shows the effectiveness of deep transfer learning from the EO to the SAR domain.

Fig. 4. Visualization of CAMs for SAR images (left) using the EO trained model (middle) and after transfer learning to SAR (right). The image is classified correctly both before and after transfer learning. Image (a) is from GRDM with VV polarization, (b) is from SCNA with HH polarization, and (c) is from GRDH with HH polarization.

6. REFERENCES

[1] M. Oquab, L. Bottou, I. Laptev, and J. Sivic, "Learning and transferring mid-level image representations using convolutional neural networks," in Computer Vision and Pattern Recognition, June 2014, pp. 1717-1724.

[2] Mohammad Rostami, Soheil Kolouri, Eric Eaton, and Kyungnam Kim, "Deep transfer learning for few-shot SAR image classification," Remote Sensing, vol. 11, no. 11, p. 1374, Jun. 2019.

Fig. 5. Visualization of CAMs for SAR images (left) using the EO trained model (middle) and after transfer learning to SAR (right). The image is incorrectly classified with only EO training and correctly classified after transfer learning to SAR. Image (a) is from GRDH with VH polarization, (b) is from GRDH with HV polarization, and (c) is from GRDM with VV polarization.

[3] B. Vasu, F. U. Rahman, and A. Savakis, "Aerial-CAM: Salient structures and textures in network class activation maps of aerial imagery," in 2018 IEEE 13th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP), 2018.

[4] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, "Grad-CAM: Visual explanations from deep networks via gradient-based localization," in 2017 IEEE International Conference on Computer Vision (ICCV), Oct. 2017, pp. 618-626.

[5] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097-1105.

[6] R. Hammell, "Ships in satellite imagery, ver. 9," https://www.kaggle.com/rhammell/ships-in-satellite-imagery, 2018.

[7] C. P. Schwegmann, W. Kleynhans, B. P. Salmon, L. W. Mdakane, and R. G. V. Meyer, "Very deep learning for ship discrimination in Synthetic Aperture Radar imagery," in IEEE International Geoscience and Remote Sensing Symposium (IGARSS), July 2016, pp. 104-107.

[8] Diederik P. Kingma and Jimmy Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.

Page 2: VISUALIZATION OF DEEP TRANSFER LEARNING IN SAR IMAGERY …

Fig 2 Network architecture used for training on EO data andtransfer learning

age Pooling (GAP) layer to obtain the CAM The GAP layeris a weighted sum of the feature maps from the last convo-lutional layer the provides the class activation heat maps foreach predicted class [3] which are thresholded to produce amask that displays the highest activating region of the image

The Gradient-weighted Class Activation Mapping (Grad-CAM) [4] is a more general technique based on the gradientGrad-CAM makes use of the target label information avail-able at the final convolution layer to produce a coarse local-ization map that highlights the salient regions of the imagesused to produce that particular target label Grad-CAM is ap-plicable to wide variety of CNNs and does not require theGAP layer In this work we employ Grad-CAM to visualizethe salient regions before and after transfer learning from theEO to the SAR domain

3 METHODOLOGY

We designed a small convolutional network [5] suitable forthe size of the SAR dataset considered to study transfer learn-ing from EO to SAR The network architecture is shown inFig 3 The network consists of three convolutional layers anda fully connected layer A max pooling layer and ReLU ac-tivation are used between Conv1 and Conv2 In between theConv2 and Conv3 layers dropout max pooling and ReLUactivation are applied A ReLU activation layer is applied af-ter the Conv3 layer and a global average pooling layer is usedbefore the fully connected layer

31 Datasets and Experiments

In our experiments we used the dataset rdquoEO Ships in SatelliteImageryrdquo [6] and the SAR ships dataset [7] The images pro-vided in the EO ships dataset are taken using Planet satellitesover the San Francisco Bay and San Pedro Bay areas of Cal-ifornia The dataset contains 1000 chips with ships and 3000chips without ships Example chips are shown in Fig 1

The SAR ships data contains Sintel-1 Extended WideSwath images and RADARSAT-2 ScanSAR images collected

Table 1 Attributes in the SAR Ships test samplesAttributes Ships No ships

SensorGRDH 95 1756GRDM 207 4116SCNA 18 512

Polarization

HH 46 1124HV 35 554VV 106 2699VH 133 2007

Incidence AngleSmall 41 1735

Medium 102 2269Large 177 2380

in 2014 and 2015 at the South African Exclusive EconomicZone The SAR dataset contains 1596 positive samples con-taining a ship and 7980 negative samples that do not containa ship Among the positive samples we randomly selected80 for training and 20 for testing Among the negativesamples we randomly selected 20 for training and 80for testing to keep the balance between positive and negativesamples in the training set

Various attributes are annotated in the dataset which de-scribe unique conditions for the SAR image acquisition Wecategorized the dataset based on the sensor resolution polar-ization and incidence angle of the radar source signal Weonly use these attributes for categorizing the test samplesas outlined in Table 1 The Sentinel-1 has two types of dataavailable which are Ground Range Detected (GRD) imagerywith High (GRDH) and Medium (GRDM) resolutions Theresolution of GRDH is 50x50 m2 with 25x25 m2 pixel spac-ing and the resolution of GRDM is 93x97 m2 with 40x40m2 pixel spacing The RADARSAT-2 satellite has threeScanSAR Narrow (SCNA) imagery available with 81x30 m2

resolution and 25x25 m2 pixel spacing There are four typesof polarization state possible based on the electromagneticwave of the transmitted signal namely HH HV VV and VHpolarization A total of 43 Sentinel-1 images are availableamong SDH (HH+HV) SDV (VV+VH) and SSV (VV) Ex-tra Wide (EW) polarization acquisition modes No GRDMHH HV and SCNA HV VV or VH images are present inthe dataset Another important attribute for the SAR datasetacquisition is the incidence angle which is in the range of190minus470 and 200minus390 for Sentinel-1 and RADARSAT-2sensors respectively We categorized the dataset into threecategories based on the incidence angle ie small mediumand large incidence angles with the range of (190250](250350] and (350470] degrees respectively

In our experiments we initially trained our network onthe EO dataset with Adam optimizer [8] and cross-entropyloss The network was trained with a learning rate of 00001for 100 epochs This model was then used to test the SAR

Table 2 Performance comparison between EO trained model on SAR before and after performing Transfer Learning (TL)Sensor Polarization Incidence Angle

Training Class GRDH GRDM SCNA HH HV VV VH Small Medium Large Overall

Only EOShip 034 019 083 074 006 047 0 015 022 033 027

No ship 075 071 062 062 093 054 095 057 073 081 072Overall 055 045 073 068 049 051 048 036 047 057 049

TL to SARShip 089 092 083 076 094 093 093 09 091 091 091

No ship 099 094 096 098 097 097 091 096 095 094 095Overall 094 093 09 087 096 095 092 093 093 093 093

Fig 3 Ship samples showing significant difference in shipsize and brightness among SCNA HH VH and HV at-tributes shown from top to bottom rows respectively

dataset before and after transfer learning For transfer learn-ing we initialize the model weights with those obtained fromtraining the model with EO data and finetune all the layersexcept the first convolutional layer The network is finetunedusing the Adam optimizer with 00001 learning rate and cat-egorical cross-entropy loss

4 RESULTS

The normalized mean per class accuracy results are presentedin Table 2 Overall the only EO training without transferlearning does not achieve good performance given that theEO dataset and the SAR dataset have significantly differentdistributions The classification results for the RADARSAT-2SCNA HH polarized VH polarized and HV polarized im-ages chips are quite different Inspection of ship samples from

these categories revealed that most of the SCNA and HH po-larized ships are bright and easily distinguishable from thebackground while most of the VH and HV polarized shipsare small in size Due to this difference SCNA and HH po-larized ships are better recognized by the network with justthe EO training but no VH polarized ships and few HV po-larized ships are recognized Overall transfer learning to theSAR domain significantly improved the discriminative capa-bility of the network

To illustrate the transfer learning process we obtainedGrad-CAM visualizations of our network trained on EO andtested on SAR imagery with transfer learning The CAMs foronly those ships which were correctly classified in the EO andSAR domains are shown in Fig 4 From these maps we cansee that the network starts to focus more on the location of theactual ship after transfer learning

For the CAM heatmaps generated for correct classifica-tion before transfer learning shown in the middle column ofFig 4 the peaks are less pronounced and the general regionaround the object is selected as the salient region for classifi-cation After transfer learning in the heatmaps shown in theright column of Fig 4 the salient region is more pronouncedand centered on the target with lesser emphasis on the sur-rounding region

From the heat maps in the middle column of Fig 5 gener-ated using the network trained on EO before transfer learningthe regions selected for classification are not on the ship butrather on surrounding spurious areas resulting in the wrongclassification After transfer learning however the salient re-gion is shifted towards the object as shown in the right col-umn of Fig 5 thus providing the correct classification

5 CONCLUSIONS

In this work we demonstrated the effectiveness of deep trans-fer learning from EO imagery to SAR imagery using Grad-CAM network visualization technique We categorized thetest dataset based on some significant attributes annotated inthe dataset such as the type of sensor resolution polarizationand incidence angle used for image acquisition to further un-derstand the performance variation based on various attributes

Fig 4 Visualization of CAMs for SAR images (left) usingEO trained model (middle) and after transfer learning to SAR(right) The image is classified correctly both before and aftertransfer transfer learning Image (a) is from GRDM with VVpolarization (b) is from SCNA with HH polarization and (c)is from GRDH with HH polarization

of the datasetThe classification results showed interesting results where

for RADARSAT-2 and HH polarized imagery ships are wellrecognized with only EO training It is also interesting thatno ships were detected in VH polarized imagery and a fewshps were detected in HV polarized imagery with only EOtraining but performance improved significantly with trans-fer learning We attribute this behaviour to the fact that in theRADARSAT-2 imagery ships are bright and well distinguish-able from the background where the VH polarized ships aretiny in size In general transfer learning improved the classi-fication performance significantly The further demonstrationusing Grad-CAM visualization shows us the effectiveness ofdeep transfer learning from EO to SAR domain

6 REFERENCES

[1] M Oquab L Bottou I Laptev and J Sivic ldquoLearn-ing and transferring mid-level image representations us-ing convolutional neural networksrdquo in Computer Visionand Pattern Recognition June 2014 pp 1717ndash1724

[2] Mohammad Rostami Soheil Kolouri Eric Eaton andKyungnam Kim ldquoDeep transfer learning for few-shot sarimage classificationrdquo Remote Sensing vol 11 no 11pp 1374 Jun 2019

[3] B Vasu F U Rahman and A Savakis ldquoAerial-cam

Fig 5 Visualization of CAMs for SAR images (left) usingEO trained model (middle) and after transfer learning to SAR(right) The image is incorrectly classified with only EO train-ing and correctly classified after transfer learning to SARImage (a) is from GRDH with VH polarizarion (b) is fromGRDH with HV polarization and (c) is from GRDM withVV polarization

Salient structures and textures in network class activa-tion maps of aerial imageryrdquo in 2018 IEEE 13th ImageVideo and Multidimensional Signal Processing Work-shop (IVMSP) 2018

[4] R R Selvaraju M Cogswell A Das R VedantamD Parikh and D Batra ldquoGrad-cam Visual explanationsfrom deep networks via gradient-based localizationrdquo in2017 IEEE International Conference on Computer Vision(ICCV) Oct 2017 pp 618ndash626

[5] Alex Krizhevsky Ilya Sutskever and Geoffrey E HintonldquoImagenet classification with deep convolutional neuralnetworksrdquo in Advances in neural information processingsystems 2012 pp 1097ndash1105

[6] R Hammell ldquoShips in satellite imagery ver 9rdquohttpswwwkagglecomrhammellships-in-satellite-imagery 2018

[7] C P Schwegmann W Kleynhans B P Salmon L WMdakane and R G V Meyer ldquoVery deep learningfor ship discrimination in Synthetic Aperture Radar im-ageryrdquo in IEEE International Geoscience and RemoteSensing Symposium (IGARSS) July 2016 pp 104ndash107

[8] Diederik P Kingma and Jimmy Ba ldquoAdam Amethod for stochastic optimizationrdquo arXiv preprintarXiv14126980 2014

  • 1 Introduction
  • 2 Related Work
  • 3 Methodology
    • 31 Datasets and Experiments
      • 4 Results
      • 5 Conclusions
      • 6 References
Page 3: VISUALIZATION OF DEEP TRANSFER LEARNING IN SAR IMAGERY …

Table 2 Performance comparison between EO trained model on SAR before and after performing Transfer Learning (TL)Sensor Polarization Incidence Angle

Training Class GRDH GRDM SCNA HH HV VV VH Small Medium Large Overall

Only EOShip 034 019 083 074 006 047 0 015 022 033 027

No ship 075 071 062 062 093 054 095 057 073 081 072Overall 055 045 073 068 049 051 048 036 047 057 049

TL to SARShip 089 092 083 076 094 093 093 09 091 091 091

No ship 099 094 096 098 097 097 091 096 095 094 095Overall 094 093 09 087 096 095 092 093 093 093 093

Fig 3 Ship samples showing significant difference in shipsize and brightness among SCNA HH VH and HV at-tributes shown from top to bottom rows respectively

dataset before and after transfer learning For transfer learn-ing we initialize the model weights with those obtained fromtraining the model with EO data and finetune all the layersexcept the first convolutional layer The network is finetunedusing the Adam optimizer with 00001 learning rate and cat-egorical cross-entropy loss

4 RESULTS

The normalized mean per class accuracy results are presentedin Table 2 Overall the only EO training without transferlearning does not achieve good performance given that theEO dataset and the SAR dataset have significantly differentdistributions The classification results for the RADARSAT-2SCNA HH polarized VH polarized and HV polarized im-ages chips are quite different Inspection of ship samples from

these categories revealed that most of the SCNA and HH po-larized ships are bright and easily distinguishable from thebackground while most of the VH and HV polarized shipsare small in size Due to this difference SCNA and HH po-larized ships are better recognized by the network with justthe EO training but no VH polarized ships and few HV po-larized ships are recognized Overall transfer learning to theSAR domain significantly improved the discriminative capa-bility of the network

To illustrate the transfer learning process we obtainedGrad-CAM visualizations of our network trained on EO andtested on SAR imagery with transfer learning The CAMs foronly those ships which were correctly classified in the EO andSAR domains are shown in Fig 4 From these maps we cansee that the network starts to focus more on the location of theactual ship after transfer learning

For the CAM heatmaps generated for correct classifica-tion before transfer learning shown in the middle column ofFig 4 the peaks are less pronounced and the general regionaround the object is selected as the salient region for classifi-cation After transfer learning in the heatmaps shown in theright column of Fig 4 the salient region is more pronouncedand centered on the target with lesser emphasis on the sur-rounding region

From the heatmaps in the middle column of Fig. 5, generated using the network trained on EO before transfer learning, the regions selected for classification are not on the ship but rather on surrounding spurious areas, resulting in the wrong classification. After transfer learning, however, the salient region shifts towards the object, as shown in the right column of Fig. 5, thus providing the correct classification.
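The Grad-CAM [4] computation behind these heatmaps weights each feature map of a chosen convolutional layer by the gradient of the class score flowing into it, then takes a ReLU of the weighted sum. A minimal PyTorch sketch on a toy network (illustrative; not the paper's architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy CNN; Grad-CAM is taken at the last convolutional layer.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),   # target layer for the CAM
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2),
)
model.eval()

# Hooks capture the target layer's activations and output gradients.
feats, grads = {}, {}
target = model[2]
target.register_forward_hook(lambda m, i, o: feats.update(a=o))
target.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.randn(1, 1, 32, 32)
score = model(x)[0, 1]          # score of the hypothetical "ship" class
score.backward()

# Global-average-pool the gradients into per-channel weights, then form
# the ReLU of the weighted sum of feature maps (Grad-CAM).
w = grads["a"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((w * feats["a"]).sum(dim=1)).squeeze(0)
cam = cam / (cam.max() + 1e-8)  # normalize to [0, 1] for display
print(cam.shape)                # torch.Size([32, 32])
```

Overlaying `cam` (upsampled to the input resolution if the target layer is downsampled) on the image chip produces heatmaps like those in Figs. 4 and 5, showing where the network's evidence for the predicted class is concentrated.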

5. CONCLUSIONS

In this work, we demonstrated the effectiveness of deep transfer learning from EO imagery to SAR imagery using the Grad-CAM network visualization technique. We categorized the test dataset based on significant attributes annotated in the dataset, such as the type of sensor, resolution, polarization, and incidence angle used for image acquisition, to further understand the performance variation based on various attributes

Fig. 4. Visualization of CAMs for SAR images (left) using the EO trained model (middle) and after transfer learning to SAR (right). The image is classified correctly both before and after transfer learning. Image (a) is from GRDM with VV polarization, (b) is from SCNA with HH polarization, and (c) is from GRDH with HH polarization.

of the dataset. The classification results were interesting:

for RADARSAT-2 and HH polarized imagery, ships are well recognized with only EO training. It is also interesting that no ships were detected in VH polarized imagery, and only a few ships were detected in HV polarized imagery with only EO training, but performance improved significantly with transfer learning. We attribute this behavior to the fact that in the RADARSAT-2 imagery, ships are bright and well distinguishable from the background, whereas the VH polarized ships are tiny in size. In general, transfer learning improved the classification performance significantly. The further demonstration using Grad-CAM visualization shows the effectiveness of deep transfer learning from the EO to the SAR domain.

6. REFERENCES

[1] M. Oquab, L. Bottou, I. Laptev, and J. Sivic, "Learning and transferring mid-level image representations using convolutional neural networks," in Computer Vision and Pattern Recognition, June 2014, pp. 1717–1724.

[2] Mohammad Rostami, Soheil Kolouri, Eric Eaton, and Kyungnam Kim, "Deep transfer learning for few-shot SAR image classification," Remote Sensing, vol. 11, no. 11, pp. 1374, Jun. 2019.

[3] B. Vasu, F. U. Rahman, and A. Savakis, "Aerial-CAM: Salient structures and textures in network class activation maps of aerial imagery," in 2018 IEEE 13th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP), 2018.

Fig. 5. Visualization of CAMs for SAR images (left) using the EO trained model (middle) and after transfer learning to SAR (right). The image is incorrectly classified with only EO training and correctly classified after transfer learning to SAR. Image (a) is from GRDH with VH polarization, (b) is from GRDH with HV polarization, and (c) is from GRDM with VV polarization.

[4] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, "Grad-CAM: Visual explanations from deep networks via gradient-based localization," in 2017 IEEE International Conference on Computer Vision (ICCV), Oct. 2017, pp. 618–626.

[5] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.

[6] R. Hammell, "Ships in satellite imagery, ver. 9," https://www.kaggle.com/rhammell/ships-in-satellite-imagery, 2018.

[7] C. P. Schwegmann, W. Kleynhans, B. P. Salmon, L. W. Mdakane, and R. G. V. Meyer, "Very deep learning for ship discrimination in Synthetic Aperture Radar imagery," in IEEE International Geoscience and Remote Sensing Symposium (IGARSS), July 2016, pp. 104–107.

[8] Diederik P. Kingma and Jimmy Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
