
Received November 1, 2019, accepted December 1, 2019, date of publication December 9, 2019, date of current version December 26, 2019.

Digital Object Identifier 10.1109/ACCESS.2019.2958264

Infrared Thermal Imaging-Based Crack Detection Using Deep Learning

JUN YANG 1, WEI WANG 1, GUANG LIN 2,3, (Member, IEEE), QING LI 1, YEQING SUN 1, AND YIXUAN SUN 4

1 National and Local Joint Engineering Laboratories for Disaster Monitoring Technologies and Instruments, China Jiliang University, Hangzhou 310018, China
2 Department of Mathematics, Purdue University, West Lafayette, IN 47907, USA
3 Department of Statistics (Courtesy), Purdue University, West Lafayette, IN 47907, USA
4 Department of Mechanical Engineering, Purdue University, West Lafayette, IN 47907, USA

Corresponding authors: Qing Li ([email protected]) and Yeqing Sun ([email protected])

This work was supported in part by the National Key Research and Development Program of China under Grant 2017YFF0205501, in part by the Zhejiang Key Research and Development Program under Grant 2018C03035, in part by the National Natural Science Foundation of China under Grant 61701467, and in part by the National Science Foundation under Grant DMS-1555072, Grant DMS-1736364, and Grant DMS-1821233.

ABSTRACT Vision-based approaches are widely used in steel crack detection: superficial defects are detected by processing images taken by a camera. Because of the inherent limitations of photographic images, however, the internal features of objects cannot be fully revealed. To overcome this drawback of vision-based methods, this work presents an approach for detecting cracks in steel sheets from infrared thermal images using Convolutional Neural Networks (CNN). Firstly, a horizontal heat conduction method is studied to thermally excite the surface of the steel sheet, and a rolling electric heating device is developed as the thermal excitation source. Secondly, we analyze the temperature difference between normal areas and cracks of different depths, and study the influence of temperature change. Thirdly, 3000 infrared thermograms labeled for penetrating cracks, non-penetrating cracks, and surface scratches are assembled into a data bank, on which the CNN is trained and validated. Finally, the original Faster Region-based Convolutional Neural Network (Faster R-CNN) is improved: the feature maps of multiple levels in the feature extraction network are aggregated, and the number of anchors in the anchor selection scheme of the Region Proposal Network (RPN) is increased from 9 to 25. The robustness of the improved Faster R-CNN is demonstrated by evaluating the detection results on 125 images outside the data bank. The accuracy and Mean Average Precision (mAP) are 95.54% and 92.41%, respectively, which exceed those of the original algorithm by 3.18% in accuracy and 1.88% in mAP.

INDEX TERMS Machine learning, steel plate crack, infrared thermal imaging, target detection, Faster R-CNN.

I. INTRODUCTION
Steel is a basic raw material for economic development and is widely used in various industrial fields, such as oil exploitation, chemical production, energy storage, and transportation. Steel is the main material of dangerous chemical tanks, petroleum energy storage tanks, and sewage storage tanks, and crack defects are a direct cause of tank leakage and equipment failure. With the increasing demand for steel industrial equipment, the resulting safety problems are becoming more and more prominent. Crack defects are the most common and dangerous defects of steel plates [1]. Therefore, a steel crack monitoring system should be an indispensable part of steel enterprises and quality inspection institutions.

The associate editor coordinating the review of this manuscript and approving it for publication was Zheng Xiao.

Nowadays, many researchers are devoted to steel defect detection [1]–[5]. Most of these approaches can only detect defects on the surface of the steel plate. However, optical sampling is affected by reflections from the steel surface, and the surface is often covered by an oxide layer, oil stains, and corrosion residue that hide the cracks. Many non-destructive evaluation methods are used in crack detection [6], including ultrasonic methods (UM), radiographic methods (RM) and magnetic methods (MM). However, the record of testing using UM cannot be saved in the instrumentation. RM depends on the thickness of the material and is not a general method. MM is appropriate for steel plates with a smooth surface. As another non-destructive evaluation method, Zhu et al. [7] and Bernieri et al. [8] used eddy current testing combined with machine learning to detect specific cracks. Over the past three decades, infrared thermography has been widely used as a reliable and accurate technique in various fields, including crack detection. Using infrared thermography, Maierhofer and Röllig [9] located cracks inside joints. Bagavathiappan et al. [10] and Lahiri et al. [11] reported further research on crack detection by thermal imaging. To overcome the limitations of ultrasonic testing, Dorafshan et al. [6] used infrared thermography to detect defects under the surface. With the help of Unmanned Aerial Vehicles, Dorafshan et al. [12] detected steel cracks in GPS-denied places under bridges. Rodríguez-Martin et al. [13] combined infrared thermography with morphological methods to detect cracks and draw contour lines according to the characteristics of the cracks. Rodríguez-Martin et al. [14] also predicted the crack depth based on the correlation between temperature and depth, but only for one type of crack.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/ VOLUME 7, 2019

https://orcid.org/0000-0002-2124-0869
https://orcid.org/0000-0003-2804-5182
https://orcid.org/0000-0002-0976-1987
https://orcid.org/0000-0001-6073-1241
https://orcid.org/0000-0002-2372-5579
https://orcid.org/0000-0003-1109-3380
https://orcid.org/0000-0003-1144-7599

Infrared thermal imaging detection technology detects cracks by analyzing the temperature field distribution on the surface of the object. It offers high detection efficiency and rich, complete detection information [15]. Researchers typically heat the object with a thermal excitation source, capture infrared thermograms with an infrared camera, and then apply computer vision or traditional machine learning algorithms to detect cracks. Pech-May et al. [16] used pulsed laser infrared thermography to measure the width of vertical cracks and fitted the crack depth with a least-squares fit of the surface temperature. Pahlberg et al. [17] detected cracks in oak floor sheets using ultrasound-excited thermal imaging. They improved detection accuracy by introducing multiple predictors and using machine learning methods such as random forests, but these methods lack the ability to locate the crack. Beuve et al. [18] used a lock-in thermography procedure to assess cracks on the surface of steel. Li et al. [19] employed a new segmentation algorithm to evaluate surface cracks in infrared thermal images obtained by eddy current pulsed thermography. However, these two methods cannot evaluate subsurface damage. The key to infrared thermal imaging non-destructive testing is the study of thermal excitation methods and the interpretation of infrared thermograms. Most current thermal excitation methods suffer from excitation sources that are costly and inconvenient to carry, and detection may be disturbed when the surface emissivity is abnormal. Ali and Cha [20] adopted a passive heating method and used a thermal imager combined with an improved Inception neural network to detect subsurface cracks in steel bridge structures. However, this method only worked on sunny days and cannot classify the types of crack. In terms of defect detection algorithms, the infrared thermal image defect detection algorithms adopted by these researchers are not robust enough, and accurate automated defect diagnosis and classification cannot be achieved at the same time.

FIGURE 1. Flowchart of the crack detection system.

Deep learning algorithms have achieved great success in image processing, object detection, and related tasks [21], and more and more researchers are applying them to defect detection. A CNN can learn the characteristics of the target autonomously, without hand-crafted features, which makes it well suited to detection. In addition, a CNN is not easily affected by noise or by the location of target objects. Kang and Cha [22] combined a UAV and a CNN for steel crack detection in GPS-denied places. Soukup and Huber-Mörk [23] employed a CNN to detect metal surface defects on a database of photometric stereo images, but the approach only worked in the dark. Cha et al. [24] employed Faster R-CNN to classify and locate different types of defects and achieved better performance. These researchers made a great contribution by adopting the original Faster R-CNN network for crack detection. Nevertheless, the original network is not effective at detecting small objects, so it is insufficient for detecting small cracks in steel plates [25].

Due to the imaging characteristics of ordinary optical cameras, their images cannot reflect the internal features of objects, and cracks under complex surfaces cannot be found. Yet the surface of steel plate in chemical equipment is often covered with rust, oxide layers, and other pollutants. To overcome this weakness of ordinary optical cameras and tap the potential of CNNs, this paper integrates the CNN algorithm with infrared thermal imaging non-destructive testing. The infrared thermal images are used to identify cracks hidden beneath surface impurities of the steel sheet and to analyze the crack depth. Since a thermogram is itself an image, a CNN can be trained on thermal images to detect cracks. Additionally, this paper improves the structure of the feature extraction network together with the anchor selection scheme of the RPN. The detection accuracy for cracks of various sizes is improved, especially for small cracks.

In particular, this paper aims to develop an efficient method for detecting cracks in steel sheets under complex surfaces. The system design is shown in Figure 1. The steel sheet is thermally excited by an infrared thermal excitation source. At the same time, an infrared camera collects infrared thermal images in real time. Cracks are classified into penetrating cracks, non-penetrating cracks, and shallow surface scratches. We analyze the temperature variation across both sides of the different kinds of cracks and identify how their characteristics differ in infrared images. Subsequently, the thermal images acquired after thermal excitation are used to create a data bank. Finally, Faster R-CNN is trained to classify and locate cracks.

FIGURE 2. Comparison of even heating method and uneven heating method.

II. INFRARED THERMAL IMAGING DETECTION METHOD
Infrared thermal imaging non-destructive testing identifies cracks mainly by analyzing anomalies in the temperature change at the crack [15]. Dorafshan et al. [26], [27] showed that when the ambient temperature is lower than the object temperature, cracks appear as hot spots or regions in the thermal camera image, whereas cracks appear as cold spots when the region of interest is colder than the surrounding environment. We carried out many experiments on how to heat the steel plate. At first, we used high-energy light to excite a cracked steel plate evenly. As the thermal image in Figure 2(b) shows, this approach failed to distinguish false cracks from real cracks. To avoid the disadvantages of even heat excitation, we adopted the horizontal heating method and excited the steel plate from a single direction (one dimension), so that the real cracks could be identified well (Figure 2(c)). With heat conducted in the horizontal direction, the cracks show up clearly.

When heat is injected into the steel sheet by some means, the internal temperature of the sheet becomes unbalanced. When the temperature of the steel sheet differs from the ambient temperature, the heat propagates internally. In this study, for a steel sheet with cracks, the heat propagating inside the sheet is blocked by crack defects and builds up, forming a local hot zone with a high surface temperature. The infrared camera displays the surface temperature of the steel plate as an image, and cracks are determined by analyzing differences in the characteristics of the infrared thermogram [15], [17], [19].

Because the steel plate surface of field equipment is exposed to oxidation and to the storage medium for a long time, it is often covered with oxide layers, oil stains, and corrosion residue. These phenomena change the emissivity of the steel plate surface to a large extent. As shown in Figure 2, different emissivities on the surface lead to abnormal abrupt high and low temperature readings in the thermal image collected by the infrared camera, which affects the identification of true cracks [28]. To address the problem of anomalous surface emissivity, this paper proposes a detection method based on horizontal heat conduction. The detection schematic is shown in Figure 3. When the steady state of the steel plate is disturbed, heat spreads both horizontally and vertically. When the horizontally diffusing heat reaches a crack, the heat arriving from the left side of the crack accumulates, and the temperature difference between the two sides increases. Determining cracks from an abnormal temperature difference between regions suppresses interference from locally uneven heating and uneven emissivity.

FIGURE 3. Horizontal heat conduction diagram.

Cracks on the surface or inside the steel plate can be revealed by infrared imaging. When the steel plate is heated or cooled, the temperature changes in the normal and cracked areas produce gradient differences, which make the cracks evident in the thermal images. To study the influence of the heating angle on the temperature gradient, the temperature gradients on both sides of the crack were calculated in the heating and cooling stages, and their variation with the heating direction was observed. The angle is defined as that between the direction of the heating rod and the direction of the crack, and it was increased from 0 to 90 degrees in steps of 15 degrees. The temperature gradients are shown in Figure 4. The experimental results show that when the angle between the heating rod and the crack is 0 degrees, the temperature gradient is the largest, the heating effect is the best, and the crack is the most obvious. This also shows that one-dimensional heat conduction is effective.

A. ANALYSIS OF HEAT TRANSFER MECHANISM
In the experiment, we assumed that the cracks were vertically distributed and used horizontal heating. In the future, we will do more research on cracks with various morphological distributions and on more heating methods. At first, we used a high-energy lamp to heat the surface of the steel plate as a whole (not one-dimensionally). Since scratches alter the emissivity of the original surface, the surface emissivity at a scratch was abnormal, and the measured temperature was also abnormal. As shown in Figure 2, it is then difficult to distinguish a crack from a scratch. In contrast, with the horizontal (one-dimensional) heating method, the difference between cracks and scratches was large. Therefore, this article uses one-dimensional heat conduction.

FIGURE 4. Temperature gradient curves at different angles.

After the steel plate is heated, heat diffuses from the high-temperature region to the low-temperature region. The heat is transferred through the thermal motion of the internal particles, so the heat transfer mode is mainly conduction. The steel plate is locally excited by a thermal excitation source, and the energy absorbed per unit area of the surface is $l$. Heat is conducted in the vertical and horizontal directions inside the steel plate. Suppose the component in the horizontal direction is $l_x$. The conduction process in the horizontal direction can be described by the one-dimensional Fourier conduction equation:

$$k \frac{\partial^2 T}{\partial x^2} = \rho c \frac{\partial T}{\partial t} \tag{1}$$

In the equation, $k$ is the thermal conductivity, $T$ is the temperature, $\rho$ is the density, $c$ is the specific heat capacity, $t$ is time, and $x$ is the propagation distance. Assuming that the initial temperature of the steel plate before thermal excitation is zero and that the plate can be regarded as a thermally infinitely thick body, the following solution is obtained:

$$T_n(x, t) = \frac{l_x}{(\pi \rho c k t)^{1/2}} \exp\left(-\frac{x^2}{4 a t}\right) \tag{2}$$

Here $a = k/(\rho c)$ is the thermal diffusivity implied by equation (1). Substituting $x = 0$ gives the surface temperature of a crack-free steel sheet as a function of the time elapsed after the thermal excitation:

$$T_n(0, t) = \frac{l_x}{(\pi \rho c k t)^{1/2}} \tag{3}$$
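As an illustration, equations (2) and (3) can be evaluated numerically. The sketch below is only a minimal example: the material constants are rough textbook values for carbon steel, the absorbed energy `L_X` is a hypothetical number, and the diffusivity is taken as a = k/(ρc); none of these values come from the experiments in this paper.

```python
import math


def crack_free_temperature(x, t, l_x, k, rho, c):
    """Temperature rise T_n(x, t) of a crack-free plate, following equation (2)."""
    a = k / (rho * c)  # thermal diffusivity (assumed meaning of `a` in equation (2))
    amplitude = l_x / math.sqrt(math.pi * rho * c * k * t)
    return amplitude * math.exp(-x ** 2 / (4.0 * a * t))


# Illustrative values only (assumptions, not measured in the paper).
K = 50.0       # thermal conductivity, W/(m K)
RHO = 7850.0   # density, kg/m^3
C = 470.0      # specific heat capacity, J/(kg K)
L_X = 1.0e4    # hypothetical absorbed energy per unit area, J/m^2

at_source = crack_free_temperature(0.0, 1.0, L_X, K, RHO, C)   # equation (3)
at_1_cm = crack_free_temperature(0.01, 1.0, L_X, K, RHO, C)    # 1 cm from the source
print(at_source, at_1_cm)
```

At x = 0 the exponential factor equals 1, so the function reduces to equation (3), and the surface temperature falls off as the inverse square root of time.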

Assume the crack is a thermally insulating defect. When the heat reaches the crack, it is blocked and accumulates on the near side of the crack, so the temperature there during the accumulation can be approximated as:

$$T_d(0, t) = \frac{l_x}{(\pi \rho c k t)^{1/2}} \left[1 + 2 \exp\left(-\frac{L_1^2}{a t}\right)\right] \tag{4}$$

where $L_1$ is the distance from the thermal excitation source to the crack. The initial temperature on one side of the crack is set to zero, and the temperature due to the heat conducted to the other side of the crack is denoted $T_n(\varepsilon, t)$. The temperature difference between the two sides can therefore be approximated as:

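$$\begin{aligned}
\Delta T &= T_n(x, t) - T_n(\varepsilon, t) \\
&= \frac{l_x}{(\pi \rho c k t)^{1/2}} \left[1 + 2 \exp\left(-\frac{L_1^2}{a t}\right)\right] - T_n(\varepsilon, t)
\end{aligned} \tag{5}$$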

When the crack is penetrating, $T_n(\varepsilon, t)$ is close to zero. Therefore, the temperature difference is approximately:

$$\Delta T_1 \approx T_n(x, t) - 0 = \frac{l_x}{(\pi \rho c k t)^{1/2}} \left[1 + 2 \exp\left(-\frac{L_1^2}{a t}\right)\right] \tag{6}$$

When the crack is a shallow surface scratch, $T_n(\varepsilon, t)$ is approximately equal to $T_n(x, t)$, and the temperature difference is approximately:

$$\Delta T_2 \approx T_n(x, t) - T_n(\varepsilon, t) = 0 \tag{7}$$
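The two limiting cases in equations (6) and (7) can be compared in a short sketch. Here the far-side temperature is modelled as a fraction of the near-side temperature; this `transmitted_fraction` parameter is a hypothetical device introduced only for illustration (0 reproduces the penetrating case, 1 the scratch case), and the material values are rough assumptions for carbon steel rather than data from this paper.

```python
import math


def blocked_side_temperature(t, l_x, k, rho, c, L1):
    """Temperature on the heated side of an insulating crack, equation (4)."""
    a = k / (rho * c)  # thermal diffusivity (assumption: a = k / (rho c))
    base = l_x / math.sqrt(math.pi * rho * c * k * t)
    return base * (1.0 + 2.0 * math.exp(-L1 ** 2 / (a * t)))


def temperature_difference(t, l_x, k, rho, c, L1, transmitted_fraction):
    """Equation (5), with the far-side temperature written as a fraction of the
    near-side temperature (hypothetical parameterisation, not from the paper)."""
    hot = blocked_side_temperature(t, l_x, k, rho, c, L1)
    return hot - transmitted_fraction * hot


# Illustrative values only: carbon steel, crack 5 mm from the source, 2 s after heating.
PARAMS = dict(t=2.0, l_x=1.0e4, k=50.0, rho=7850.0, c=470.0, L1=0.005)

dt_penetrating = temperature_difference(transmitted_fraction=0.0, **PARAMS)      # eq. (6)
dt_non_penetrating = temperature_difference(transmitted_fraction=0.5, **PARAMS)  # in between
dt_scratch = temperature_difference(transmitted_fraction=1.0, **PARAMS)          # eq. (7)
print(dt_penetrating, dt_non_penetrating, dt_scratch)
```

Under these assumptions the temperature difference shrinks as more heat crosses the defect, which is the ordering the analysis relies on to separate the three crack types.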

After one end of the steel plate is thermally excited, the heat is gradually transmitted into the plate and travels horizontally. When the horizontally moving heat encounters a crack, a temperature difference arises between the two sides of the crack. This temperature difference depends mainly on the type of crack. Therefore, the position and type of a crack can be assessed by analyzing the change of the temperature field between the two zones. Notably, this change appears as an image gradient in the thermal image, and a CNN can capture such features well [29] to accurately locate and classify the crack.
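To make the notion of an image gradient concrete, the following sketch finds the largest temperature jump between neighbouring pixels in one row of a thermogram. The row is synthetic and the function name is hypothetical; the paper itself relies on a CNN to learn such features rather than on an explicit finite-difference rule.

```python
def largest_temperature_jump(row):
    """Return (index, jump) for the largest absolute difference between
    neighbouring pixels; `index` is the position of the left pixel."""
    jumps = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    index = max(range(len(jumps)), key=lambda i: abs(jumps[i]))
    return index, jumps[index]


# Synthetic row (degrees Celsius): heat arrives from the left, accumulates
# slightly in front of a crack, and drops sharply between pixels 5 and 6.
row = [40.0, 39.5, 39.0, 38.5, 38.0, 38.4, 30.0, 29.8, 29.6, 29.4]

index, jump = largest_temperature_jump(row)
print(index, jump)
```

In this toy row the sharp drop across the crack dominates the small, smooth changes elsewhere, which is the kind of contrast horizontal heating is meant to produce.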

B. DESIGN OF THERMAL EXCITATION DEVICE
In our experiment, a rolling electric heating device serves as the source of thermal excitation. The mechanical structure of the rolling heating rod is shown in Figure 5. The heating rod is filled with MgO, and the heating wire can reach 120 °C in a short time. Because the upper temperature limit of the energized slip ring is 80 °C, an aluminum tube is used to separate the heating rod from the slip ring. However, this device has difficulty heating irregularly shaped iron blocks or iron balls, and it requires more power to inspect large areas of steel plate. In the future, the device will be modified so that it can be used for more types of steel plate.

During the testing process, the rolling electric heatingrod is in contact with the steel plate for thermal excitation.



FIGURE 5. Rolling heating rod mechanical structure.

After a short period of excitation, the rod is quickly rolled to the next area for thermal excitation. The infrared thermograms are recorded and stored in real time during the thermal excitation process. Thermal diffusion theory predicts that cracks of different depths cause significantly different temperature changes on the two sides of the crack, and we further verified this through a temperature change experiment. The experimental results show that cracks of different depths produce thermal images with different characteristics after thermal excitation. The thermal images acquired after thermal excitation are stored in a data bank, and the type and position of the crack in each image are manually labeled. The data bank is used to train, validate and test the crack detection model.

III. THE OBJECT DETECTION ALGORITHM
After the data bank of thermal images has been acquired by infrared thermal imaging non-destructive testing, a target detection algorithm is used to detect cracks in the thermal images.

For crack detection, traditional image processing methods depend on manually extracted features and are easily affected by noise. Furthermore, they cannot determine the location and type of the crack. Deep learning methods have been widely used in target detection and classification. Of the 3125 images collected, 3000 are labeled according to the crack conditions and fed into the R-CNN for crack feature learning, and the other 125 are used for testing. The R-CNN learns crack features autonomously through multiple convolutional layers. During training, the category and position errors between the predicted results and the ground-truth labels are calculated. Through backpropagation, this error is continuously reduced and the discriminative ability of the R-CNN improves. The trained R-CNN can then accurately identify the crack location and crack type. Classic target detection models mainly include R-CNN, Fast R-CNN, and Faster R-CNN. R-CNN first proposed the idea of region proposals to detect and localize objects [30]. However, it requires separate stages for Support Vector Machine (SVM) classification, bounding box regression and the CNN, which makes detection expensive and difficult [24]. Fast R-CNN implements a shared convolutional layer using SPP networks [31] and uses Softmax and a multi-task loss function to perform bounding box regression. These approaches require the candidate boxes to be extracted in advance, which is time-consuming [32]. Faster R-CNN unifies Fast R-CNN with a Region Proposal Network (RPN), which shares features with Fast R-CNN, improving the detection speed [33].

FIGURE 6. Faster R-CNN structure diagram. Parts A and B together form the RPN, parts A and C together form Fast R-CNN, and parts A, B, and C together form Faster R-CNN.

A. FASTER R-CNN
Faster R-CNN combines region proposal generation with Fast R-CNN in a single network model, replacing selective search with an RPN. The network that generates the proposal boxes and the target detection network share convolutional layers. The structure of Faster R-CNN is shown in Figure 6.

1) MULTI-TASK LOSS
Faster R-CNN unifies the classification task and the candidate box regression task. The multi-task loss function is defined as:

$$L(p, u, t^u, \nu) = L_{cls}(p, u) + \lambda [u \geq 1] L_{loc}(t^u, \nu) \tag{8}$$

In equation (8), $u$ is the category label corresponding to the candidate box; $t^u$ is the correction vector that transforms the candidate box into the corresponding real box; $\nu$ is the transformation parameter vector generated by the candidate box regression layer; $p$ is the probability vector of the output categories; $L_{cls}(p, u)$ is the classification loss, defined in equation (9); and $L_{loc}(t^u, \nu)$ is the candidate box transformation loss, defined in equation (10), in which $s_{L1}(\cdot)$ is defined in equation (11).

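$$L_{cls}(p, u) = -\log(p_u) \tag{9}$$

$$L_{loc}(t^u, \nu) = \sum_{i \in \{x, y, w, h\}} s_{L1}(t^u_i - \nu_i) \tag{10}$$

$$s_{L1}(x) = \begin{cases} 0.5 x^2 & |x| < 1 \\ |x| - 0.5 & \text{otherwise} \end{cases} \tag{11}$$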
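Equations (8) to (11) translate almost directly into code. The sketch below is a plain-Python illustration for a single candidate box, not the training implementation used in this work; the weight `lam = 1.0`, the class ordering (index 0 as background, consistent with the indicator [u ≥ 1]), and the sample numbers are assumptions.

```python
import math


def smooth_l1(x):
    """Smooth L1 function s_L1 from equation (11)."""
    return 0.5 * x * x if abs(x) < 1.0 else abs(x) - 0.5


def classification_loss(p, u):
    """Log loss for the true class u, equation (9)."""
    return -math.log(p[u])


def localization_loss(t_u, v):
    """Sum of smooth L1 terms over (x, y, w, h), equation (10)."""
    return sum(smooth_l1(ti - vi) for ti, vi in zip(t_u, v))


def multi_task_loss(p, u, t_u, v, lam=1.0):
    """Equation (8): the box term only counts for non-background classes (u >= 1)."""
    loss = classification_loss(p, u)
    if u >= 1:
        loss += lam * localization_loss(t_u, v)
    return loss


# Hypothetical example: background plus three crack classes.
p = [0.1, 0.7, 0.1, 0.1]
t_u = (0.2, -0.1, 0.5, 2.0)
v = (0.0, 0.0, 0.0, 0.0)
print(multi_task_loss(p, 1, t_u, v))  # classification term plus box term
print(multi_task_loss(p, 0, t_u, v))  # background: classification term only
```

The quadratic branch keeps gradients small near zero, while the linear branch limits the influence of large regression errors such as the last coordinate in the example.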

2) RPN LAYER
The RPN layer is a fully convolutional network responsible for generating object proposals. For each sliding-window position it produces 2k scores (positive, negative) from Softmax and 4k coordinates (x coordinate, y coordinate, width, height) from the regressor, where k is the number of anchors per position. The structure of the RPN is shown in Figure 7.

FIGURE 7. RPN Network structure diagram.

An n × n sliding window is moved over the feature map produced by the last shared convolutional layer, and each window is mapped to a d-dimensional feature vector. This vector is fed to two sibling fully connected layers, a classification layer and a regression layer. Each anchor takes the center of the current sliding window as its reference point and corresponds to one scale and one aspect ratio. For a convolutional feature map of size W × H there are W × H × k anchors in total. The overall loss function of the RPN is defined in equation (12):

$$L(\{p_i\}, \{p_i^*\}, \{t_i\}, \{t_i^*\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*) \tag{12}$$

Here, $i$ is the index of an anchor in a mini-batch and $p_i$ is the predicted probability that anchor $i$ is an object. The label $p_i^*$ is 1 if the anchor is positive and 0 if it is negative. The classification loss $L_{cls}$ is the logarithmic loss over two classes. We use $L_{reg}(t_i, t_i^*) = R(t_i - t_i^*)$ as the regression loss, where $R$ is a robust loss function. The factor $p_i^*$ in $p_i^* L_{reg}$ means that the regression loss is active only for positive anchors. The outputs of the classification and regression layers are $\{p_i\}$ and $\{t_i\}$, respectively. The regression uses 4 coordinates:

$$t_x = (x - x_a)/w_a, \quad t_y = (y - y_a)/h_a \tag{13}$$

$$t_w = \log(w/w_a), \quad t_h = \log(h/h_a) \tag{14}$$

$$t_x^* = (x^* - x_a)/w_a, \quad t_y^* = (y^* - y_a)/h_a \tag{15}$$

$$t_w^* = \log(w^*/w_a), \quad t_h^* = \log(h^*/h_a) \tag{16}$$

In these equations, $x$, $y$, $w$, and $h$ denote the center coordinates, width, and height of the bounding box. The variables $x$, $x_a$, and $x^*$ refer to the predicted box, the anchor box, and the ground-truth box, respectively (and likewise for $y$, $w$, and $h$). This can be seen as a bounding box regression from the anchor box to the nearby ground-truth box.
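The parameterization in equations (13) to (16) can be written as a pair of helper functions. This is a minimal sketch: the function names `encode` and `decode` and the example boxes are hypothetical, and boxes are given as (center x, center y, width, height).

```python
import math


def encode(box, anchor):
    """Regression targets of `box` relative to `anchor`, equations (13)-(16)."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))


def decode(t, anchor):
    """Invert `encode`: recover a box from its offsets and the anchor."""
    tx, ty, tw, th = t
    xa, ya, wa, ha = anchor
    return (xa + tx * wa, ya + ty * ha, wa * math.exp(tw), ha * math.exp(th))


# Hypothetical tall anchor and a nearby ground-truth box around a vertical crack.
anchor = (100.0, 80.0, 32.0, 128.0)
ground_truth = (104.0, 90.0, 40.0, 100.0)

targets = encode(ground_truth, anchor)   # t* in equations (15) and (16)
recovered = decode(targets, anchor)
print(targets, recovered)
```

Normalizing the offsets by the anchor size and using the logarithm for width and height makes the targets comparable across anchors of very different scales.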

B. IMPLEMENTATION DETAILS
Faster R-CNN produces higher-quality proposal windows more quickly, and the time spent in the region proposal phase is significantly reduced, resulting in higher detection efficiency. However, some problems remain for the detection of crack defects in this paper. Firstly, the training data set is not particularly large because producing sample data is time-consuming, which increases the difficulty of training. Secondly, the original Faster R-CNN uses the last feature map of the feature extraction network as input to the RPN layer. As the network gets deeper, the image features become more and more abstract and the resolution of the feature maps is reduced. When the network performs small target detection, information about the target is seriously lost and the accuracy is low. Finally, the anchor selection scheme of the original RPN layer is not suitable. To resolve these problems, the crack detection model is redesigned and optimized.

FIGURE 8. The network structure of VGG-16.

1) VGG-16 TRANSFER LEARNING
Typical CNN models include AlexNet, ZFNet, VGGNet, GoogLeNet, and ResNet. Compared with other CNN models, the VGG-16 network has moderate depth and good performance, and it is widely used as the base network for target detection [34]. This paper chooses VGG-16 as the base of the feature extraction network. Since the dataset of this paper is less complex than the training data of the VGG-16 model, over-fitting can easily occur. This article uses transfer learning to alleviate this problem [35]. VGG-16 was trained on the ImageNet dataset, which is large and contains many object classes. Although the thermal image dataset is much smaller than ImageNet, the features extracted by the first few layers of a CNN are fairly general, and the extracted features are similar. Therefore, the feature extraction network in this paper is not randomly initialized but fine-tuned from VGG-16. We remove the fully connected layers of the network, leaving only the convolutional and pooling layers. The number of outputs of the final fully connected Softmax layer is changed from 1000 to 3, representing the three defect categories in this paper. In addition, the network parameters trained on ImageNet are used for initialization, which accelerates training and reduces the risk of overfitting. The network structure of VGG-16 is shown in Figure 8.

2) MULTI-LEVEL FEATURE FUSION NETWORK
This paper uses the VGG-16 network as the base of the feature extraction layers. The traditional Faster R-CNN has five stages of convolutional layers, and the features are taken from the last feature map. Each pooling layer halves the output size. After five pooling layers, the output is 1/32 the size of the original image, so the features become very abstract and some important features may be lost [36]. In particular, if the crack size is less than 32 pixels, its footprint in the output is less than 1 pixel, and the crack will not be detected well.

FIGURE 9. Improved Faster R-CNN architecture.

Small objects need higher spatial resolution [37]. Deep layers are good at recognizing the object of interest, while shallow layers localize it well. Combining the advantages of deep and shallow layers can yield an accurate detection system [25], and such feature concatenation has been very successful. In [38], deep-layer and shallow-layer information are combined to achieve state-of-the-art segmentation. In [37], skip pooling is used to gather information from different layers and achieves good performance. Inspired by these works, we merge the feature maps of the first-stage, third-stage, and fifth-stage convolutional layers after applying LRN [39]. To make the feature map sizes of the three layers consistent, we apply a pooling layer to the feature map of the first-stage convolutional layer and a deconvolution to the feature map of the fifth-stage convolutional layer, so that both match the feature map size of the third stage. Finally, the three feature maps are merged to form the output of the feature extraction network. The improved network structure is shown in Figure 9. The improvement not only increases the scale invariance of the image but also improves the detection of small cracks.
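The size bookkeeping behind this fusion can be sketched without any deep learning library. The example below only tracks shapes; it assumes that each stage's feature map is taken before that stage's 2 × 2 pooling, that merging means channel concatenation, and that the network sees the raw 320 × 256 thermogram. These are illustrative assumptions, and LRN and the learned deconvolution weights are not modelled.

```python
# Output channels of the last convolutional layer in each VGG-16 stage.
VGG16_STAGE_CHANNELS = (64, 128, 256, 512, 512)


def stage_shapes(height, width):
    """(channels, height, width) of each stage's feature map, taken before pooling."""
    shapes = []
    h, w = height, width
    for channels in VGG16_STAGE_CHANNELS:
        shapes.append((channels, h, w))
        h, w = h // 2, w // 2  # 2x2 max pooling with stride 2 halves each side
    return shapes


def fused_shape(height, width, stages=(1, 3, 5), reference_stage=3):
    """Shape after resizing the chosen stages to the reference stage and concatenating."""
    shapes = stage_shapes(height, width)
    _, ref_h, ref_w = shapes[reference_stage - 1]
    channels = sum(shapes[s - 1][0] for s in stages)
    return (channels, ref_h, ref_w)


def footprint(crack_px, num_poolings):
    """Width in feature-map pixels of a crack that is `crack_px` wide in the image."""
    return crack_px / 2 ** num_poolings


print(fused_shape(256, 320))   # fused map at third-stage resolution
print(footprint(20, 5))        # 20-pixel crack after five poolings
print(footprint(20, 2))        # same crack at third-stage resolution
```

Under these assumptions a 20-pixel crack shrinks to less than one pixel after five poolings but still spans several pixels at the third-stage resolution, which is the motivation for fusing at that level.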

3) ANCHOR RESET
The collected infrared thermal images mainly contain three types of cracks: penetrating cracks, non-penetrating cracks, and scratches. Because crack sizes in the images differ widely and the length-to-width ratios of the cracks are diverse, the scales and ratios of the anchors of the original RPN are adjusted according to the crack sizes in the infrared thermograms and the length-to-width ratios of the bounding boxes. Since cracks are elongated, elongated anchors should be used [40]. However, an anchor is an axis-aligned rectangle, and not all cracks are vertically or horizontally oriented, so the aspect ratio of the anchors cannot be set too large. In addition, a wider range of anchor sizes helps to find different types of cracks. Ren et al. noted that the initial parameters of Faster R-CNN were not carefully chosen [33], so the only way to choose suitable parameters is by trial and error [24], [37]. Because the 9 anchors used in the original Faster R-CNN are not adequate for detecting small cracks [36], we arrived, after a large number of experiments, at 25 types of anchors, with aspect ratios of 0.25, 0.5, 1, 2, and 4. This helps to improve the quality of the candidate boxes for long, narrow cracks. Meanwhile, the anchor scales are changed to 2, 4, 6, 8, and 10, which enhances the ability to detect small cracks. The scales and aspect ratios of the original and adjusted anchors are shown in Table 1.
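The adjusted scheme amounts to pairing every scale with every aspect ratio. The sketch below enumerates the resulting anchor shapes; the base size of 16 pixels and the convention that the ratio means height divided by width are assumptions borrowed from common Faster R-CNN implementations, not values stated in this paper.

```python
def generate_anchors(base_size=16, scales=(2, 4, 6, 8, 10),
                     ratios=(0.25, 0.5, 1, 2, 4)):
    """Return (width, height) for every scale/ratio pair.

    Each anchor keeps the area (base_size * scale)^2, and ratio = height / width
    (assumed convention).
    """
    anchors = []
    for scale in scales:
        area = (base_size * scale) ** 2
        for ratio in ratios:
            width = (area / ratio) ** 0.5
            height = width * ratio
            anchors.append((width, height))
    return anchors


adjusted = generate_anchors()
original = generate_anchors(scales=(8, 16, 32), ratios=(0.5, 1, 2))
print(len(adjusted), len(original))

# Total anchors on a W x H feature map is W * H * k.
W, H = 80, 64  # hypothetical feature map size
print(W * H * len(adjusted))
```

With five scales and five ratios the scheme yields the 25 anchor types described above, compared with 9 for the original three-by-three setting.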

182066 VOLUME 7, 2019

TABLE 1. Anchor specific parameters.

TABLE 2. Network important parameter values.

FIGURE 10. Experimental system composition.

4) NETWORK TRAINING DETAILS
The most important hyperparameter of the model is the learning rate, which strongly influences the time needed to reach convergence. If the learning rate is too large, the loss function has difficulty converging, which lowers the final detection accuracy; conversely, a small learning rate leads to slow convergence and longer training time [41]. This paper uses cross-validation to determine the learning rate for each part of the network; the selected learning rate is 0.0001 and the weight decay is 0.0005. As for the number of training iterations, the model is saved every 10,000 iterations, and training is stopped when the model no longer improves. The training batch size is set to one image, which helps improve generalization and makes training more stable [42]. The positive and negative sample thresholds of the RPN, the proportion of foreground samples, and the non-maximum suppression threshold are obtained through cross-validation combined with grid search. The other parameters follow the original Faster R-CNN. Table 2 lists some important parameters used during network training and testing.
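The training settings above can be collected into a small configuration, together with one possible reading of the checkpoint and stopping rule. The learning rate, weight decay, batch size, and checkpoint interval come from the text; the stopping criterion (`patience`, `min_delta`) and all names are hypothetical, since the paper does not say how "no longer converges" was decided.

```python
TRAIN_CONFIG = {
    "learning_rate": 1e-4,       # from the text
    "weight_decay": 5e-4,        # from the text
    "batch_size": 1,             # one image per training batch
    "checkpoint_every": 10_000,  # model saved every 10,000 iterations
}


def checkpoint_steps(max_iterations, every=TRAIN_CONFIG["checkpoint_every"]):
    """Iterations at which a checkpoint would be written."""
    return list(range(every, max_iterations + 1, every))


def should_stop(checkpoint_losses, patience=2, min_delta=1e-3):
    """Hypothetical rule: stop when none of the last `patience` checkpoints
    improves on the best earlier loss by at least `min_delta`."""
    if len(checkpoint_losses) <= patience:
        return False
    best_before = min(checkpoint_losses[:-patience])
    recent = checkpoint_losses[-patience:]
    return all(loss > best_before - min_delta for loss in recent)


print(checkpoint_steps(35_000))
print(should_stop([1.0, 0.8, 0.6, 0.5]))
```

A rule of this kind only needs the loss recorded at each saved checkpoint, so it fits the save-every-10,000-iterations schedule without extra bookkeeping.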

IV. EXPERIMENTAL RESULTS AND ANALYSIS
A. THERMAL IMAGE ACQUISITION AND TEMPERATURE ANALYSIS
The experimental system shown in Figure 10 mainly comprises a rotating electric heating rod, a FLIR A35 infrared thermal camera, a PC, and a steel plate for inspection. The FLIR A35 is an online infrared imager with a refresh rate of 60 FPS, a resolution of 320×256 pixels, and a temperature range of −40 °C to 120 °C. The rolling electric heating rod devised in this paper serves as the thermal excitation source. The experimental samples are multiple sets of steel plates with different degrees of corrosion, which contain penetrating cracks, non-penetrating cracks, and surface scratches. To enhance the robustness of the model, images are acquired under a variety of surface emissivity conditions. The samples contain both artificial and natural cracks. To better explore the temperature changes at different depths, individual samples are sprayed with black paint to increase the emissivity. Some steel plate samples are shown in Figure 11, and some pseudo-color thermal images are shown in Figure 12.

FIGURE 11. Steel plate sample examples.

FIGURE 12. Pseudo-color thermal images.

1) THE HEATING-COOLING THERMAL IMAGE CHANGE PROCESS
In the heating and cooling experiments, we used a relatively smooth steel plate with a penetrating crack and an artificial scratch. As mentioned before, because the ordinary surface of the steel plate has low emissivity, the surface is sprayed with black paint to increase the emissivity so that the temperature changes can be studied more easily. The purpose of this experiment is to detect defects in steel plates under different conditions, including penetrating cracks, non-penetrating cracks, and scratches. The surface of the steel sheet is thermally excited from a single end for 60 seconds and then cooled for 60 seconds. The resulting surface temperature change is shown in Figure 13. During thermal excitation, the temperature anomalies on both sides of the crack gradually become apparent after about 10 s of heating. Under continuous excitation, the temperature disparity between the two sides of the crack gradually rises, and heat accumulates on the heated side of the crack to form a local hot zone. During cooling, the temperature field away from the crack quickly reaches equilibrium, while the temperature difference across the crack remains at a certain level and drops slowly. The frames at 10 s and 20 s of cooling show that the thermal excitation did not heat the plate completely evenly, which may be due to uneven surface contact or external interference. The heat quickly spreads to the surroundings until thermal equilibrium with the ambient temperature is approached. However, the crack region retains a local hot zone throughout the process and produces a significant temperature gradient.

We also performed heating-cooling experiments on the steel plates in the other, more complex cases shown in Figure 11. Because oxidized and corroded steel plates have high emissivity, we did not apply any pretreatment (such as spraying black paint). As shown in Figure 15, different types of cracks are revealed. All experiments were performed indoors and were affected by room ventilation and by other external heat sources such as the room lights and the screen of the DAS. Although the experiments are affected by the environment and the heating and cooling times differ slightly, the thermal images still show the damage clearly, and detection was not affected by the experimental environment in our study. The differences among samples obtained in different experimental environments are helpful for training and make the model more generalizable.



FIGURE 13. Heating-cooling thermal image change.

2) TEMPERATURE CHANGES WITH TIME
The high-temperature heating rod is placed on the left side of the crack to heat the steel plate. Two cracks are machined on the same steel plate, which is 30 cm long, 20 cm wide, and 5 mm thick. To better study how the temperature changes, the surface is sprayed with black paint to increase the emissivity. Each crack is 2 cm long and 0.5 mm wide. After manual machining, the penetrating crack is 5 mm deep and the non-penetrating crack is 2 mm deep. In addition, the black-painted surface is scratched. Sp1 and sp2 are recorded on the two sides of the crack and the scratch area, respectively, while sp3 and sp4 are in the normal area. The marked images and the temperature changes of the four points in each case are shown in Figure 14.

FIGURE 14. Temperature analysis in three cases.

The temperature changes in the three cases are shown in Figure 14. The first and second rows show the temperature gradient during heating, while the third row shows the temperature gradient during cooling. Note that in the first and second rows the temperature range is wider than in the cooling process because of the heating rod. Given the heating pattern and the heating curve, the temperatures at sp1, sp2, sp3, and sp4 are lower than that of the heating rod, so these positions appear black in the thermal image. Since our heating device cannot control the specific heating temperature, there is a slight difference in the temperature range between the first two columns, but this does not affect the temperature gradient change or the appearance of the crack. In the future, we will optimize the heating device so that the steel plate is held at the same temperature. In the cooling process, the temperature range is smaller and the four positions appear brighter.

In the first row, sp1 and sp2 are on the two sides of the penetrating crack, and sp3 and sp4 are in the normal region. In both the heating and cooling phases, the temperature gradient across the penetrating crack increases sharply and is much larger than that in the normal region. The temperature of sp2, on the non-heated side, rises the slowest. In the second row, sp1 and sp2 are on the two sides of the non-penetrating crack, and sp3 and sp4 are in the normal region. The temperature gradient across the non-penetrating crack also remains larger than that in the normal region. Comparing (b) and (d) in Figure 14, the temperature gradient across the non-penetrating crack is smaller than that across the penetrating crack, and the temperature on the non-heated side (see sp4) also rises considerably, so it is easy to distinguish a non-penetrating crack from a penetrating one. In the third row, sp1 and sp2 are on the two sides of the penetrating crack, and sp3 and sp4 are on the two sides of the scratch. Comparing the curves of sp3 and sp4 in (b) and (d) with those in (f), the temperature gradient across the scratch is similar to that of the normal area and differs from that of the penetrating crack.

In general, the temperature gradients across penetrating and non-penetrating cracks are larger than those in crack-free regions, with the gradients across penetrating cracks being much larger, whereas the temperature gradient across a scratch is similar to that of the normal area. These experiments contribute greatly to distinguishing among penetrating cracks, non-penetrating cracks, and scratches while avoiding interference caused by emissivity.
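The comparison between spot pairs can be expressed as a simple ratio. The sketch below takes the mean absolute temperature difference between two spots as a stand-in for the temperature gradient across a defect and divides it by the same quantity for the normal area. This proxy, the function names, and the readings are assumptions made for illustration; the numbers are made up and are not measurements from the paper.

```python
def mean_abs_difference(series_a, series_b):
    """Mean absolute temperature difference between two equally sampled spots."""
    if len(series_a) != len(series_b) or not series_a:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs(a - b) for a, b in zip(series_a, series_b))
    return total / len(series_a)


def gradient_ratio(sp1, sp2, sp3, sp4):
    """Ratio of the cross-defect difference (sp1 vs sp2) to the
    normal-area difference (sp3 vs sp4)."""
    across_defect = mean_abs_difference(sp1, sp2)
    normal_area = mean_abs_difference(sp3, sp4)
    if normal_area == 0:
        raise ValueError("normal-area difference is zero; ratio undefined")
    return across_defect / normal_area


# Made-up readings in degrees Celsius, for illustration only (not measured data).
sp1 = [25.0, 30.0, 36.0, 41.0]   # heated side of the defect
sp2 = [25.0, 26.0, 27.0, 28.0]   # far side of the defect
sp3 = [25.0, 28.0, 31.0, 34.0]   # normal area, first spot
sp4 = [25.0, 27.0, 30.0, 32.0]   # normal area, second spot

ratio = gradient_ratio(sp1, sp2, sp3, sp4)
print(ratio)
```

On the trends described above, a ratio well above 1 would be expected across a penetrating crack, a smaller one across a non-penetrating crack, and a value near 1 across a scratch; no threshold is implied by the paper.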

B. MODEL TRAINING AND VERIFICATION
In infrared thermal images, color differences reflect temperature differences, and a color difference appears as an image gradient. The temperature analysis experiments have shown that different types of cracks and scratches differ greatly in temperature and temperature gradient, so the infrared images contain distinguishing features such as color and gradation. We therefore capture a large number of images and train the improved Faster R-CNN on them.

FIGURE 15. Some examples of data sets. The first two columns are manually labeled samples, and the last two columns are original grayscale images.

1) PRODUCTION OF SAMPLE DATA SETS
In this research, we collected 45 steel plates, including processed steel plates and steel plates from industrial sites. The plates contain cracks of different shapes and sizes and have different damage types. Under natural conditions, the surface of a steel sheet carries a great deal of rust, impurities, and corrosion, which makes it almost impossible to detect cracks visually. Twenty-two cracked steel plates were collected in natural conditions on a construction site. For the other samples, chemical corrosion, oxidation, covering the surface with soil, and other means were used to make the cracks harder to identify. We artificially created cracks of different depths on some steel plates. In addition, smooth-surfaced steel plates are sprayed with black paint to increase the emissivity. We use the proposed horizontal heat transfer method to thermally excite the surface of the steel plate. A rotating electric heating rod is employed to generate stable thermal excitation from the left or right side of the crack. The temperature analysis experiments showed that the temperature anomalies are most obvious at the moment excitation is completed. After many experiments, we found that 10 seconds of excitation is enough, so we take the thermal image at this moment as a sample. Furthermore, to increase the robustness of the model, we vary the thermal excitation direction, the thermal excitation time, the indoor environment, and so on, to collect thermal images with a wide range of variations. In this paper, the disparity in temperature gradient between the two sides of the defect is used as the basis for detecting the crack type. Grayscale images describe this difference well, so they are efficient compared with color-coded images while still achieving high accuracy [43]. In addition, grayscale images help speed up model training [43], [44]. The images used for training in this paper are all grayscale images. Some examples of the data sets are shown in Figure 15.

FIGURE 16. Original Faster R-CNN model detection example.

FIGURE 17. Comparison of feature maps using the original Faster R-CNN and the improved Faster R-CNN. The first row shows the original pictures; the second row shows the feature maps from the original Faster R-CNN, where the crack characteristics are roughly shown; the third row shows the feature maps from our improved Faster R-CNN, in which the crack can be seen more clearly.

FIGURE 18. Improved Faster R-CNN model detection examples. The thermal images marked with a red star, yellow star, and blue star are the same as (a), (b), and (c) in Figure 16, respectively, but here they are accurately located and classified.

These collected thermal images are used to build a databank. The types and locations of cracks in the infrared thermal images are annotated for training and testing the target detection model. At first, a total of 1125 infrared thermal images are collected, 1000 of which are used for model training. Data augmentation improves the performance of deep learning and can reduce overfitting [19]. In light of this, the 1000 original images are expanded by mirror flipping, rotation, random cropping, and color illumination conversion, giving an augmented data set of 3000 images. We manually labeled the 3000 images using a tool called ‘‘labelImg’’; some labeled images are shown in Figure 15. Some of these images contain a combination of two different damage types, and in total they include 1562 penetrating cracks, 1251 scratches, and 937 non-penetrating cracks. These images are used as the training data set. The additional 125 images mentioned above include 65 penetrating cracks, 39 non-penetrating cracks, and 53 shallow surface scratches. It should be noted that our deep learning model is trained and validated on the 3000 images. Among them, we randomly choose 80% of the images of the three damage types as the training set and the remaining 20% as the validation set, and we then evaluate the model on the 125 images outside of the 3000 images to assess its robustness. The hardware environment of this experiment is an Intel Core i5-8400 CPU and an NVIDIA GTX 1070 Ti GPU, and the memory is 8 GB. The algorithm is implemented in Python with the TensorFlow framework.
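The 80/20 division of the 3000 images can be sketched as a seeded random split. This is a minimal illustration using integer image identifiers; the function name `split_dataset` and the seed are assumptions, and the sketch splits the pool as a whole rather than per damage type, which the paper does not specify in detail.

```python
import random


def split_dataset(image_ids, train_fraction=0.8, seed=0):
    """Shuffle the identifiers reproducibly and cut them into train and validation lists."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    cut = int(round(len(ids) * train_fraction))
    return ids[:cut], ids[cut:]


train_ids, val_ids = split_dataset(range(3000))
print(len(train_ids), len(val_ids))
```

With 3000 images this gives 2400 training and 600 validation images; the 125 held-out test images stay outside the split entirely.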

2) TESTING RESULTS OF ORIGINAL FASTER R-CNN
The original Faster R-CNN uses only the last layer of feature maps, which aggravates the loss of target information. For infrared image samples with low contrast and small crack defects, this easily causes false detections and missed detections. Examples of missed detection, false detection, and poor positioning are shown in Figure 16. Figure 16 (a) is a case of missed detection, and Figure 16 (b) is a case of false detection; the reason is that the crack is small and the image quality is poor. Figure 16 (c) shows poor positioning accuracy, which may be because the selected anchor frames are not suitable. Although the network applies two rounds of bounding-box regression, the candidate frames extracted by the original network are of poor quality, so some deviation remains after the correction.

3) TESTING RESULTS OF IMPROVED FASTER R-CNN
The testing results of the improved Faster R-CNN are shown in Figure 18. The figure shows that the method can better detect target cracks of various sizes and shapes. For some samples on which the original Faster R-CNN produced false or missed detections, the improved Faster R-CNN still detects the cracks better, and its positioning accuracy is higher than that of the original algorithm.

TABLE 3. Different combinations.

TABLE 4. Improved Faster R-CNN detection ratio statistics.

TABLE 5. Improved Faster R-CNN detection ratio statistics.

The Average Precision (AP) is the area under the Precision/Recall curve, where Precision = TP/(TP+FP) and Recall = TP/(TP+FN); TP denotes true positives, FP false positives, and FN false negatives. The mean AP (mAP) is widely used [32], [33], [45]. When different methods are evaluated on our data set, AP makes it easy to compare their strengths and weaknesses. Therefore, in this study, we employ mAP and AP as the evaluation criteria.
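The definitions above translate directly into a few lines of code. The sketch below computes precision and recall from counts, and AP as the area under an interpolated precision/recall curve built from detections ranked by confidence. The all-point interpolation and the function names are assumptions for illustration; the paper does not state which AP variant was used.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP), Recall = TP/(TP+FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(scored_detections, num_ground_truth):
    """Area under the interpolated precision/recall curve.

    scored_detections: list of (confidence, is_true_positive) pairs for one class.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    ranked = sorted(scored_detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Interpolate: each precision becomes the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


example_ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_ground_truth=2)
print(example_ap)
```

In this setting mAP would be the mean of the AP values for penetrating cracks, non-penetrating cracks, and scratches.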

Table 3 compares the prediction accuracy of different fusion layers and shows that the fusion method adopted in this study performs best. Experiments are carried out on the 125 test images mentioned above using the Faster R-CNN model of this paper, and the results are shown in Table 4 and Table 5. The three types of defects are all detected well by the network: the detection accuracies for penetrating cracks, shallow surface scratches, and non-penetrating cracks are 95.38%, 94.87%, and 96.27%, respectively. The missed detection rate and false detection rate are also relatively low. The remaining missed and false detections may be due to poor infrared image quality and heavy noise, which lead to misjudgment of individual samples.



FIGURE 19. The comparison between R-CNN and traditional methods. The improved R-CNN can not only accurately locatethe crack, but also distinguish the crack types without being affected by noise.

FIGURE 20. The comparison among the state-of-the-art detection methods.

Based on the detection results for the three targets, the average detection accuracy of the improved algorithm is 95.54%, and the mAP is 92.41%. The average false detection rate and missed detection rate are relatively low, and the positioning accuracy is also greatly improved.

4) ALGORITHM DETECTION RESULT COMPARISON
To further show the advantage of our improved Faster R-CNN over traditional and other state-of-the-art detection methods, the Sobel, Canny, HOG-SVM, and Yolov3 methods are used for comparison. In Figure 19, the crack is accurately detected by the improved Faster R-CNN. As shown in the first row, the Sobel and Canny methods provide some information about the crack, but it is easily confused with noise, so the crack is still hard to recognize. The difference among these three methods is more obvious in the second and third rows, where neither the Sobel nor the Canny method even reveals that a crack is present. A bigger drawback is that the Sobel and Canny methods cannot accurately identify the location and type of cracks. As Figure 19 shows, our improved Faster R-CNN detects the crack accurately, whereas Yolov3 mistakes penetrating cracks for non-penetrating cracks [45] and the HOG-SVM method mistakes the normal area for a crack. Yolov3 is well known for its fast detection speed. However, Redmon and Farhadi reported that when the IoU threshold is set between 0.5 and 0.95, it does not achieve results as good as Faster R-CNN and has lower accuracy [45], [40]. In the HOG-SVM method, HOG collects histograms of gradient or edge directions over the pixels to form a feature descriptor, which is fed into an SVM classifier for training [46]; it is very sensitive to noise and misidentifies noisy areas as cracks, so its detection accuracy is low. In addition, HOG-SVM has a long detection time and uses a lot of memory.
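The IoU criterion mentioned above measures how well a predicted box overlaps a ground-truth box. The sketch below computes IoU for axis-aligned boxes and checks a prediction against thresholds from 0.5 to 0.95 in steps of 0.05. The step size and the names are assumptions following common practice; the paper gives only the range.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Thresholds 0.50, 0.55, ..., 0.95 (assumed step of 0.05).
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matches_at_thresholds(pred_box, gt_box):
    """Whether the prediction counts as a match at each IoU threshold."""
    value = iou(pred_box, gt_box)
    return {threshold: value >= threshold for threshold in IOU_THRESHOLDS}


matches = matches_at_thresholds((0, 0, 10, 10), (0, 0, 10, 8))
print(matches)
```

A detector with loosely fitting boxes can pass at a threshold of 0.5 yet fail at the stricter thresholds, which is why localization quality matters under this criterion.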

Yolov3 is a one-stage network that directly extracts crack features to predict object categories and locations. Faster R-CNN is a two-stage network that integrates feature extraction and position regression into the same network, generates candidate regions, and then makes refined predictions. In essence, therefore, Faster R-CNN performs better. However, the original Faster R-CNN takes its features directly from the last layer of VGG-16, which loses detailed information, and its anchor frame sizes do not match the cracks well, so its detection performance is poor. With the feature fusion and anchor frame selection proposed in this paper, the detection performance is greatly improved. As shown in Figure 17, it is easier to identify cracks from the feature map after fusion. The comparison of state-of-the-art detection algorithms is given in Table 6. The improved Faster R-CNN achieves the highest average detection accuracy and Average Precision (AP) on the test set, exceeding the original algorithm by 3.18% and 1.88%, respectively. Because our improved Faster R-CNN uses more types of anchors, its computation time is inevitably longer. For safety testing in track detection, accuracy matters more than time consumption, so our improved Faster R-CNN would be the best choice for that purpose.

TABLE 6. Algorithm detection result comparison.

V. CONCLUSION
Traditional vision-based methods cannot discover steel cracks that are covered by pollutants or lie inside the material, while a thermal imager alone cannot intelligently locate cracks and assess crack types. In this study, the feasibility of CNN-based thermal imaging crack detection is verified by theoretical analysis combined with specific experiments. Cracks of different depths can be identified so that inspection personnel can handle the steel plate according to the crack condition (e.g., a steel plate with cracks only on the surface does not need to be repaired). In addition, cracks can be located accurately, which helps to find the right location for inspection and maintenance. Since ordinary infrared thermal excitation devices are expensive and have poor mobility, we designed a rolling electric heating rod, which is low-cost and convenient to carry. By combining the rolling electric heating rod with a horizontal thermal excitation method, the interference of abnormal emissivity can be suppressed and an accurate temperature gradient is generated on the two sides of the crack. Through the temperature analysis experiments, we determine how the temperature changes; this rule is quantified as the image gradient on both sides of the crack, which can be learned by the R-CNN.

Conventional edge detection methods are susceptible to noise and have difficulty distinguishing between cracks and crack-free areas. The HOG-SVM method is also sensitive to noise when extracting features with HOG, causing false positives. YOLO v3 uses a one-stage approach that sacrifices detection accuracy for high detection speed. The original Faster R-CNN extracts coarse features directly and lacks appropriate anchor boxes, so cracks are not detected well. By combining features from the deep and shallow layers and choosing appropriate anchor frames for the RPN, our improved Faster R-CNN learns crack features better and improves image scale invariance, which enhances the detection of small cracks. The method can better classify and locate cracks of different depths. In the comparison experiments, the mean detection accuracy and AP of our improved Faster R-CNN are 95.54% and 92.41%, respectively, better than the traditional and some other state-of-the-art methods. Nonetheless, despite the feature fusion, image information is inevitably lost during pooling. In the future, we will further optimize our network to reduce false and missed detections.

This paper incorporates computer vision algorithms into traditional non-destructive testing, providing a more intelligent, efficient, and accurate method for steel plate crack detection. However, the method has some drawbacks. Placing the heat source at the proper location is important; for example, as shown in Figure 2(b), heating a scratch directly will cause it to be misidentified as a true crack. Meanwhile, if the approximate position of the crack is not known in advance, the whole steel sheet must be inspected. The heating device designed in this study has high energy consumption, cannot automatically control the steel plate temperature, and has difficulty heating steel plates with complex shapes. In addition, image collection and crack identification are currently two separate steps. In the future, we will improve the structure and heating performance of our heating device and conduct more research on cracks with various morphological distributions and on other heating methods. Thermal image collection software can also be developed independently to realize real-time collection and detection.



REFERENCES

[1] M.-X. Chu, A.-N. Wang, R.-F. Gong, and M. Sha, ‘‘Multi-class classification methods of enhanced LS-TWSVM for strip steel surface defects,’’ J. Iron Steel Res. Int., vol. 21, no. 2, pp. 174–180, 2014.

[2] H. Jia, Y. L. Murphey, J. Shi, and T.-S. Chang, ‘‘An intelligent real-time vision system for surface defect detection,’’ in Proc. 17th Int. Conf. Pattern Recognit. (ICPR), vol. 3, 2004, pp. 239–242.

[3] Y. Chen, L. Chen, X. Liu, S. Ding, and H. Zhang, ‘‘Real-time steel inspection system based on support vector machine and multiple kernel learning,’’ in Practical Applications of Intelligent Systems. Berlin, Germany: Springer, 2011, pp. 185–190.

[4] E. Amid, S. R. Aghdam, and H. Amindavar, ‘‘Enhanced performance for support vector machines as multi-class classifiers in steel surface defect detection,’’ World Acad. Sci., Eng. Technol., vol. 6, no. 7, pp. 1096–1100, 2012.

[5] R. Gong, M. Chu, Y. Yang, and Y. Feng, ‘‘A multi-class classifier based on support vector hyper-spheres for steel plate surface defects,’’ Chemometrics Intell. Lab. Syst., vol. 188, pp. 70–78, May 2019.

[6] S. Dorafshan, M. Maguire, and W. Collins, ‘‘Infrared thermography for weld inspection: Feasibility and application,’’ Infrastructures, vol. 3, no. 4, p. 45, 2018.

[7] P. Zhu, Y. Cheng, P. Banerjee, A. Tamburrino, and Y. Deng, ‘‘A novel machine learning model for eddy current testing with uncertainty,’’ NDT&E Int., vol. 101, pp. 104–112, Jan. 2019.

[8] A. Bernieri, L. Ferrigno, M. Laracca, and M. Molinara, ‘‘Crack shape reconstruction in eddy current testing using machine learning systems for regression,’’ IEEE Trans. Instrum. Meas., vol. 57, no. 9, pp. 1958–1968, Sep. 2008.

[9] C. Maierhofer and M. Röllig, ‘‘Active thermography for the characterization of surfaces and interfaces of historic masonry structures,’’ in Proc. 7th Int. Symp. Non-Destructive Test. Civil Eng. (NDTCE), Nantes, France, vol. 30, 2009, pp. 1–6.

[10] S. Bagavathiappan, B. B. Lahiri, T. Saravanan, J. Philip, and T. Jayakumar, ‘‘Infrared thermography for condition monitoring—A review,’’ Infr. Phys. Technol., vol. 60, pp. 35–55, Sep. 2013.

[11] B. Lahiri, S. Bagavathiappan, L. T. Sebastian, J. Philip, and T. Jayakumar, ‘‘Effect of non-magnetic inclusions in magnetic specimens on defect detection sensitivity using active infrared thermography,’’ Infr. Phys. Technol., vol. 68, pp. 52–60, Jan. 2015.

[12] S. Dorafshan, M. Maguire, N. V. Hoffer, C. Coopmans, and R. J. Thomas, ‘‘Unmanned aerial vehicle augmented bridge inspection feasibility study,’’ Dept. Elect. Comput. Eng., Utah State Univ., Logan, UT, USA, Tech. Rep. Unmanned Aerial Vehicle Augmented Bridge Inspection Feasibility Study (No. CAIT-UTC-NC31), 2017.

[13] M. Rodríguez-Martin, S. Lagüela, D. González-Aguilera, and P. Arias, ‘‘Cooling analysis of welded materials for crack detection using infrared thermography,’’ Infr. Phys. Technol., vol. 67, pp. 547–554, Nov. 2014.

[14] M. Rodríguez-Martín, S. Lagüela, D. González-Aguilera, and J. Martinez, ‘‘Prediction of depth model for cracks in steel using infrared thermography,’’ Infr. Phys. Technol., vol. 71, pp. 492–500, Jul. 2015.

[15] E. Barreira and V. P. de Freitas, ‘‘Evaluation of building materials using infrared thermography,’’ Construct. Building Mater., vol. 21, no. 1, pp. 218–224, 2007.

[16] N. Pech-May, A. Oleaga, A. Mendioroz, and A. Salazar, ‘‘Fast characterization of the width of vertical cracks using pulsed laser spot infrared thermography,’’ J. Nondestruct. Eval., vol. 35, no. 2, p. 22, 2016.

[17] T. Pahlberg, M. Thurley, D. Popovic, and O. Hagman, ‘‘Crack detection in oak flooring lamellae using ultrasound-excited thermography,’’ Infr. Phys. Technol., vol. 88, pp. 57–69, Jan. 2018.

[18] S. Beuve, Z. Qin, J.-P. Roger, S. Holé, and C. Boué, ‘‘Open cracks depth sizing by multi-frequency laser stimulated lock-in thermography combined with image processing,’’ Sens. Actuators A, Phys., vol. 247, pp. 494–503, Aug. 2016.

[19] X. Li, B. Gao, W. L. Woo, G. Y. Tian, X. Qiu, and L. Gu, ‘‘Quantitative surface crack evaluation based on eddy current pulsed thermography,’’ IEEE Sensors J., vol. 17, no. 2, pp. 412–421, Jan. 2017.

[20] R. Ali and Y.-J. Cha, ‘‘Subsurface damage detection of a steel bridge using deep learning and uncooled micro-bolometer,’’ Construct. Building Mater., vol. 226, pp. 376–387, Nov. 2019.

[21] Y. LeCun, Y. Bengio, and G. Hinton, ‘‘Deep learning,’’ Nature, vol. 521, no. 7553, p. 436, 2015.

[22] D. Kang and Y.-J. Cha, ‘‘Autonomous UAVs for structural health monitoring using deep learning and an ultrasonic beacon system with geo-tagging,’’ Comput. Aided Civil Infrastruct. Eng., vol. 33, no. 10, pp. 885–902, Oct. 2018.

[23] D. Soukup and R. Huber-Mörk, ‘‘Convolutional neural networks for steel surface defect detection from photometric stereo images,’’ in Proc. Int. Symp. Vis. Comput. Cham, Switzerland: Springer, 2014, pp. 668–677.

[24] Y.-J. Cha, W. Choi, G. Suh, S. Mahmoudkhani, and O. Büyüköztürk, ‘‘Autonomous structural visual inspection using region-based deep learning for detecting multiple damage types,’’ Comput.-Aided Civil Infrastruct. Eng., vol. 33, no. 9, pp. 731–747, 2018.

[25] T. Kong, A. Yao, Y. Chen, and F. Sun, ‘‘HyperNet: Towards accurate region proposal generation and joint object detection,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2016, pp. 845–853.

[26] S. Dorafshan, R. J. Thomas, and M. Maguire, ‘‘Fatigue crack detection using unmanned aerial systems in fracture critical inspection of steel bridges,’’ J. Bridge Eng., vol. 23, no. 10, 2018, Art. no. 04018078.

[27] S. Dorafshan, M. Maguire, N. V. Hoffer, and C. Coopmans, Fatigue Crack Detection Using Unmanned Aerial Systems in Under-Bridge Inspection. Boise, ID, USA: Idaho Transportation Department, 2017.

[28] J. Staggs and H. Phylaktou, ‘‘The effects of emissivity on the performance of steel in furnace tests,’’ Fire Saf. J., vol. 43, no. 1, pp. 1–10, 2008.

[29] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ‘‘Imagenet classification with deep convolutional neural networks,’’ in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1097–1105.

[30] R. Girshick, J. Donahue, T. Darrell, and J. Malik, ‘‘Rich feature hierarchies for accurate object detection and semantic segmentation,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, pp. 580–587.

[31] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Spatial pyramid pooling in deepconvolutional networks for visual recognition,’’ IEEE Trans. Pattern Anal.Mach. Intell., vol. 37, no. 9, pp. 1904–1916, Sep. 2015.

[32] R. Girshick, ‘‘Fast R-CNN,’’ in Proc. IEEE Int. Conf. Comput. Vis.,Dec. 2015, pp. 1440–1448.

[33] S. Ren, K. He, R. Girshick, and J. Sun, ‘‘Faster R-CNN: Towards real-timeobject detection with region proposal networks,’’ in Proc. Adv. Neural Inf.Process. Syst., 2015, pp. 91–99.

[34] E. Rezende, G. Ruppert, T. Carvalho, A. Theophilo, F. Ramos, andP. de Geus, ‘‘Malicious software classification using VGG16 deep neuralnetwork’s bottleneck features,’’ in Information Technology-New Genera-tions. Cham, Switzerland: Springer, 2018, pp. 51–59.

[35] S. J. Pan andQ. Yang, ‘‘A survey on transfer learning,’’ IEEE Trans. Knowl.Data Eng., vol. 22, no. 10, pp. 1345–1359, Oct. 2010.

[36] X. Sun, P. Wu, and S. C. H. Hoi, ‘‘Face detection using deep learn-ing: An improved faster RCNN approach,’’ Neurocomputing, vol. 299,pp. 42–50, Jul. 2018.

[37] S. Bell, C. L. Zitnick, K. Bala, and R. Girshick, ‘‘Inside-outside net:Detecting objects in context with skip pooling and recurrent neural net-works,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2016,pp. 2874–2883.

[38] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2015, pp. 3431–3440.

[39] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” in Proc. 22nd ACM Int. Conf. Multimedia, 2014, pp. 675–678.

[40] S. Zhang, R. Wu, K. Xu, J. Wang, and W. Sun, “R-CNN-based ship detection from high resolution remote sensing imagery,” Remote Sens., vol. 11, no. 6, p. 631, 2019.

[41] M. Riedmiller and H. Braun, “A direct adaptive method for faster backpropagation learning: The RPROP algorithm,” in Proc. IEEE Int. Conf. Neural Netw., San Francisco, CA, USA, Mar./Apr. 1993, pp. 586–591.

[42] D. Masters and C. Luschi, “Revisiting small batch training for deep neural networks,” 2018, arXiv:1804.07612. [Online]. Available: https://arxiv.org/abs/1804.07612

[43] H. M. Bui, M. Lech, E. Cheng, K. Neville, and I. S. Burnett, “Using grayscale images for object recognition with convolutional-recursive neural network,” in Proc. IEEE 6th Int. Conf. Commun. Electron. (ICCE), Jul. 2016, pp. 321–325.

[44] A. M. Naser, “Color to grayscale image conversion based dimensionality reduction with stationary wavelet transform,” in Proc. Al-Sadeq Int. Conf. Multidisciplinary IT Commun. Sci. Appl. (AIC-MITCSA), 2016, pp. 1–5.

[45] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” 2018, arXiv:1804.02767. [Online]. Available: https://arxiv.org/abs/1804.02767

[46] D. F. Llorca, R. Arroyo, and M.-A. Sotelo, “Vehicle logo recognition in traffic images using HOG features and SVM,” in Proc. 16th Int. IEEE Conf. Intell. Transp. Syst. (ITSC), Oct. 2013, pp. 2229–2234.



JUN YANG is currently pursuing the M.S. degree with the School of Mechanical and Electrical Engineering, China Jiliang University. He is currently a Visiting Scholar with the Department of Mathematics, Purdue University, West Lafayette, IN, USA. His research interests include deep learning, image processing, and computer vision.

WEI WANG received the M.S. degree from China Jiliang University. He is a machine learning algorithm engineer. His research interests include machine learning, image processing, and software development.

GUANG LIN (M’12) received the B.S. degree in mechanics with a minor in electrical engineering from Zhejiang University, China, in 1997, the M.S. degree in mechanics and engineering science from Peking University, China, in 2000, and the M.S. and Ph.D. degrees in applied mathematics from Brown University, Providence, RI, USA, in 2004 and 2007, respectively. He was a Senior Research Scientist with the Pacific Northwest National Laboratory, from 2008 to 2014. He is currently the Director of Data Science Consulting Service, and an Associate Professor with the Department of Mathematics, School of Mechanical Engineering, Purdue University. He has had in-depth involvement in developing deep learning and uncertainty quantification tools for a large variety of domains including energy and environment. His research interests include diverse topics in computational science both on algorithms and applications, uncertainty quantification, large-scale data analysis, and multiscale modeling in a large variety of domains. He is currently an Associate Editor of Society for Industrial and Applied Mathematics and Multiscale Modeling and Simulations.

QING LI was the Dean of the School of Mechanical and Electrical Engineering, China Jiliang University. He was selected as a Famous Teacher of the National Ten-Thousand Talents Program, in 2017. He is currently a Professor with China Jiliang University. He is the Executive Director of the China’s Metrological Testing Association. His current research interests include dynamic measurement and control, and sensing technology. He has presided over the completion of more than 60 science and technology projects.

YEQING SUN received the M.S. degree from China Jiliang University. He is currently an image processing algorithm engineer. His research interests include machine learning, data mining, and software development.

YIXUAN SUN received the B.Sc. degree in energy and power engineering and financial mathematics from Shandong University, China, in 2016, and the M.Sc. degree in mechanical engineering, Purdue University, West Lafayette, IN, USA, in 2018, where he is currently pursuing the Ph.D. degree in mechanical engineering. His research interests include machine learning, data mining, and pattern recognition, particularly the applications of deep learning in addressing engineering problems.

