Automatic bearing fault diagnosis using particle swarm clustering and Hidden Markov Model

Mitchell Yuwono a,n, Yong Qin c, Jing Zhou a, Ying Guo b, Branko G. Celler b, Steven W. Su a

a Faculty of Engineering and Information Technology, University of Technology, Sydney (UTS), 15 Broadway, Ultimo, NSW 2007, Australia

b The Commonwealth Scientific and Industrial Research Organisation (CSIRO), Division of Computational Informatics, Marsfield, NSW 2122, Australia

c State Key Lab of Rail Traffic Control and Safety, Beijing Jiaotong University, No. 3, Shang Yuan Cun, Beijing 100044, PR China

Article info

Keywords: Fault detection and diagnosis; Rolling bearing defect diagnosis; Data clustering; Hidden Markov Model; Wavelet kurtogram; Cepstral analysis

Abstract

Ball bearings are integral elements in most rotating manufacturing machinery. While detecting a defective bearing is relatively straightforward, discovering the source of the defect requires advanced signal processing techniques. This paper proposes an automatic bearing defect diagnosis method based on Swarm Rapid Centroid Estimation (SRCE) and Hidden Markov Model (HMM). Using the defect frequency signatures extracted with Wavelet Kurtogram and Cepstral Liftering, SRCE+HMM achieved an average sensitivity, specificity, and error rate of 98.02%, 96.03%, and 2.65%, respectively, on the bearing fault vibration data provided by the Case School of Engineering of the Case Western Reserve University (CSE), a result that warrants further investigation.

© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

Fault detection and diagnosis (FDD) plays an important role in process engineering (Venkatasubramanian et al., 2003). Early detection of faults while a plant is still operating in a controllable region can help to avoid abnormal event progression, minimize productivity loss, and improve the stability of manufacturing processes and the quality of end products (Venkatasubramanian et al., 2003; Huang et al., 2009). Industries have generally acknowledged the importance of FDD (Huang et al., 2009; Venkatasubramanian et al., 2003; Guo et al., 2013; Wall et al., 2011). For example, petrochemical industries estimated an annual loss of 20 billion dollars attributed to faults alone and have therefore made fault management a critical priority (Venkatasubramanian et al., 2003). Semiconductor and TFT-LCD factories employ periodic sampling to monitor the stability of manufacturing processes (Huang et al., 2009). Statistical machine learning models have been developed for automatic FDD in Heating, Ventilation and Air Conditioning (HVAC) systems (Wall et al., 2011). Considerable interest has therefore been expressed in this field by both industrial practitioners and academic researchers (Venkatasubramanian et al., 2003; Yuwono et al., 2013a).

Bearings play a critical role in modern machinery, including power generators, motor vehicles, trains, industrial robots, manufacturing machines, mining equipment, heavy vehicles, construction cranes, and general-purpose electro-mechanical machines (Slocum, 2008). Newer designs often demand extreme precision, greater capacity, and faster rotation, which makes maintaining healthy bearings increasingly important. Poor operating environments, particularly moist or contaminated areas, and improper handling practices often give rise to premature bearing failures, which shorten the lifetime of the corresponding machine and ultimately impair the robustness of product quality (Publications, 2007).

Bearings are most commonly used as supporting elements in rotating manufacturing machinery such as conveyor belts. This type of bearing is known as a contact bearing, as mechanical contact exists between the load and the bearing. Contact bearings, the focus of this paper, have developed extensively from their early use in bicycles to their use in construction cranes. More specifically, we focus on deep-groove ball (roller) bearings, also known as Conrad ball bearings, with the primary goal of detecting faults in the following components of the bearing: the outer race, the inner race and the ball itself (Randall and Antoni, 2011).

A Conrad ball bearing is designed to support radial or bi-directional axial loads. Faults commonly found in this type of bearing include outer race, inner race and ball/rolling element faults (Randall and Antoni, 2011; Li and Wen, 2014). In order to identify these faults, time and frequency domain analysis of vibration signals is employed along with a clustering technique. Prior studies (Kulkarni and Sahasrabudhe, 2013; Fang and Zijie, 2007; Randall and Antoni, 2011) have revealed that classical signal processing techniques such as the Fast Fourier Transform (FFT) succeed in analyzing frequencies, but their discrete nature poses a significant challenge in capturing the rather aperiodic and finite signals observed in practice (Randall and Antoni, 2011). Another difficulty in applying the FFT arises when noise is superimposed on the signal. This issue has led to the use of the Wavelet Transform (WT) (Fang and Zijie, 2007; Kulkarni and Sahasrabudhe, 2013; Sawalhi and Randall, 2005) for extracting weaker signals, owing to its capability to handle frequency transients.

Contents lists available at ScienceDirect

journal homepage: www.elsevier.com/locate/engappai

Engineering Applications of Artificial Intelligence

http://dx.doi.org/10.1016/j.engappai.2015.03.007

0952-1976/© 2015 Elsevier Ltd. All rights reserved.

n Corresponding author. E-mail addresses: [email protected] (M. Yuwono), [email protected] (Y. Qin), [email protected] (J. Zhou), [email protected] (Y. Guo), [email protected] (B.G. Celler), [email protected] (S.W. Su).

Please cite this article as: Yuwono, M., et al., Automatic bearing fault diagnosis using particle swarm clustering and Hidden Markov Model. Eng. Appl. Artif. Intel. (2015), http://dx.doi.org/10.1016/j.engappai.2015.03.007

The essential signal processing guidelines for rolling bearing fault diagnosis are well established (Randall and Antoni, 2011; Fang and Zijie, 2007; Kulkarni and Sahasrabudhe, 2013; Sawalhi and Randall, 2005; Randall and Hee, 1981). Fang and Zijie (2007) observe a distinctive wavelet energy pattern in various bearing faults. Kulkarni and Sahasrabudhe (2013) find that fault frequency signatures can be isolated, denoised and monitored using WT. Sawalhi and Randall (2005) show that the resonance band can be estimated using the Wavelet Kurtogram. Randall and Hee (1981) point out that multiple faults may be well discerned in the envelope cepstral domain given proper demodulation.

In this paper we are interested in augmenting the available signal processing techniques with swarm intelligence and a Markovian probabilistic framework. This paper contributes a novel automated method for the detection and diagnosis of defects using Swarm Rapid Centroid Estimation (SRCE) (Yuwono et al., 2013a,b, 2014) and Hidden Markov Model (HMM) (Guo et al., 2012, 2013; Zoubin, 2001). The algorithm uses the (continuous) wavelet kurtogram (Randall and Antoni, 2011; Lei et al., 2011; Valeriu Vrabie and Pierre Granjon, 2003) and cepstral liftering (Randall and Hee, 1981) as the feature extraction method. The proposed method is tested against an openly available bearing fault dataset published by the Case School of Engineering of the Case Western Reserve University (CSE) (Case Western Reserve, 2014). The block diagram of the method can be seen in Fig. 1.

The rest of the document is structured as follows. Section 2 gives a general overview of the vibrational behavior of a rolling bearing system under fault. Section 3 gives a detailed explanation of the proposed method. Section 4 summarizes the data used for the experiment, as well as the results and discussion. Finally, Section 5 concludes the paper.

Fig. 1. Block diagram of the proposed bearing defect diagnosis system.

2. Overview

2.1. Rolling bearing defects diagnostics

A rolling bearing with local faults produces impact forces which cause periodic impulsive vibrations (Randall and Antoni, 2011; Li et al., 2000). Based on the findings of Randall and Antoni (2011), the hypothesized vibration model of a rolling bearing with local faults can be generalized as follows:

$$f(t) = h(t) * \left[ x(t) \left( 1 + \sum_{n=1}^{N} \sum_{k=-\infty}^{\infty} y_n(t)\, \delta(t - kT_n) \right) + \eta(t) \right], \tag{1}$$

where

• $*$ denotes the convolution operator.

• $x(t)$ denotes the periodic acceleration response due to shaft rotation, gear, misalignment, and eccentric fault. $x(t)$ always exists and should be removed using an autoregressive filter, as it may mask the weak bearing impact signals (Randall and Antoni, 2011).

• $h(t)$ denotes the “transfer path”, the impulse response of the structure which modulates the bearing signal to higher frequencies (i.e. the resonance band).

• $n$ denotes the index of the fault, which is usually indexed from 1 up to $N = 3$ as there are generally three types of faults in a rolling bearing system (Randall and Antoni, 2011).

• $y_n(t)$ denotes the strength of the impact force due to the $n$th fault.

• $\delta(t - kT_n)$ denotes a periodic impulse train, where $T_n$ denotes the duration between impacts due to the $n$th fault; the corresponding impact rate $1/T_n$ is a multiple of the shaft rotational frequency.

• $\eta(t)$ denotes an additive stochastic noise function.
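To make Eq. (1) concrete, the following minimal Python sketch synthesizes a single-fault signal ($N = 1$). The function name `simulate_bearing_signal` and all numerical values (sampling rate, shaft and defect frequencies, resonance frequency, decay rate, impact strength) are illustrative assumptions rather than values from this paper. The transfer path $h(t)$ is assumed to be an exponentially decaying sinusoid, and $y_1(t)$ is taken as a constant.

```python
import math
import random

def simulate_bearing_signal(fs=12000, duration=0.1, f_shaft=30.0, f_defect=107.0,
                            f_res=3000.0, decay=800.0, impact=5.0,
                            noise_std=0.05, seed=0):
    """Discrete sketch of Eq. (1) with one fault (N = 1) and constant y_1."""
    rng = random.Random(seed)
    n_samples = int(round(fs * duration))
    period = int(round(fs / f_defect))          # T_1 expressed in samples
    # x(t): periodic shaft-related component (assumed shape)
    x = [1.0 + 0.5 * math.sin(2 * math.pi * f_shaft * i / fs)
         for i in range(n_samples)]
    # y_1 * delta(t - k T_1): periodic impulse train
    train = [impact if i % period == 0 else 0.0 for i in range(n_samples)]
    # bracketed term of Eq. (1): x(t) (1 + impulses) + eta(t)
    excitation = [x[i] * (1.0 + train[i]) + rng.gauss(0.0, noise_std)
                  for i in range(n_samples)]
    # h(t): decaying resonance standing in for the transfer path (assumption)
    h_len = int(0.005 * fs)
    h = [math.exp(-decay * i / fs) * math.sin(2 * math.pi * f_res * i / fs)
         for i in range(h_len)]
    # f = h * excitation (causal convolution truncated to n_samples)
    f = [sum(h[j] * excitation[i - j] for j in range(min(i + 1, h_len)))
         for i in range(n_samples)]
    return f, period

signal, period = simulate_bearing_signal()
```

Each impact excites the resonance, so the defect rate appears only as the repetition rate of high-frequency bursts, which is why the resonance band has to be demodulated before the defect frequency becomes visible.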

Defects in a rolling bearing can be generalized into three types: (a) outer race fault; (b) inner race fault; and (c) rolling element fault. Each is characterized by periodic impacts at a distinctive rate, normally distributed around a specific “defect frequency” summarized in Table 1.

These frequencies are derived from the physical movements of the bearing components, which are discussed in Li et al. (2000). A diagram of a deep-groove ball bearing can be seen in Fig. 2. The measured vibration signals at various bearing conditions can be seen in Fig. 3.

Extracting these frequencies and their harmonics from the vibration signals is a challenging task, as ‘noise’ from other sources of vibration masks the signal from the bearing defect unless the defect is sufficiently large. Moreover, the fault signatures are modulated to higher frequencies due to the resonance of the apparatus. This makes it difficult to extract these frequencies without first estimating and demodulating the resonance band.

2.2. Fourier transform of the bearing system

The Fourier transform decomposes a time signal f(t) into constituent sinusoids of independent frequency, amplitude and phase, collectively called the ‘spectrum’.

The forward Fourier transform converts the time signal into a frequency spectrum. It is defined as follows:

$$\mathcal{F}\{f(t)\} = f(\omega) = \int_{\mathbb{R}} f(t)\, e^{-j\omega t}\, dt; \quad \omega = \{-\pi, \pi\} \tag{2}$$

Table 1
Characteristic defect frequencies.

| Description | Characteristic frequency | Defect |
|---|---|---|
| BPFO: Ball Pass Frequency of Outer Race (Hz), generated when all rolling elements roll across a defect in the outer race | $\frac{f_r n}{2}\left(1 - \frac{d}{D}\cos\phi\right)$ | Outer race |
| BPFI: Ball Pass Frequency of Inner Race (Hz), generated when all rolling elements roll across a defect in the inner race | $\frac{f_r n}{2}\left(1 + \frac{d}{D}\cos\phi\right)$ | Inner race |
| FTF: Fundamental Train Frequency (a) (Hz), the frequency of the cage, generated when there is a defect in the cage | $\frac{f_r}{2}\left(1 - \frac{d}{D}\cos\phi\right)$ | Cage |
| BSF: Ball Spin Frequency (a) (Hz), the frequency generated by each rolling element as it spins. The Ball Fault Frequency (BFF) = 2 × BSF is generated when there is a defect in the rolling components | $\frac{f_r D}{2d}\left(1 - \frac{d^2}{D^2}\cos^2\phi\right)$ | Rolling element |

n, number of balls; f_r, rotational frequency of the shaft (Hz); d, ball diameter (mm); D, pitch diameter (mm); ϕ, contact angle (radians).
(a) Appearance of BSF or FTF is indicative of a ball/cage fault.
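As a worked example of Table 1, the short sketch below evaluates the four characteristic frequencies for a given geometry. The function name `defect_frequencies` and the geometry used in the call (9 balls, d = 8 mm, D = 40 mm, zero contact angle, 30 Hz shaft frequency) are hypothetical values chosen only to give round numbers; they are not the parameters of the bearings studied in this paper.

```python
import math

def defect_frequencies(f_r, n_balls, d, D, phi=0.0):
    """Characteristic defect frequencies of Table 1, in Hz.

    f_r: shaft rotational frequency (Hz); n_balls: number of balls;
    d: ball diameter; D: pitch diameter (same unit as d); phi: contact angle (rad).
    """
    ratio = d / D
    c = math.cos(phi)
    bpfo = 0.5 * n_balls * f_r * (1.0 - ratio * c)
    bpfi = 0.5 * n_balls * f_r * (1.0 + ratio * c)
    ftf = 0.5 * f_r * (1.0 - ratio * c)
    bsf = f_r * D / (2.0 * d) * (1.0 - (ratio * c) ** 2)
    return {"BPFO": bpfo, "BPFI": bpfi, "FTF": ftf, "BSF": bsf, "BFF": 2.0 * bsf}

# Hypothetical geometry, for illustration only
freqs = defect_frequencies(f_r=30.0, n_balls=9, d=8.0, D=40.0)
```

For this geometry the sketch gives BPFO = 108 Hz, BPFI = 162 Hz, FTF = 12 Hz and BSF = 72 Hz (BFF = 144 Hz). Two identities that follow directly from the formulas are BPFO + BPFI = n f_r and FTF = BPFO / n, which make convenient sanity checks.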

Fig. 2. Diagram of a deep-groove ball bearing.

Fig. 3. Measured vibration signals during various bearing conditions (traces: normal, inner race defect, ball defect, outer race defect; horizontal axis: time (s), 0 to 0.2).


$$\mathcal{F}\{f(t)\} \approx f(\varpi_k) = \sum_{n=0}^{N-1} f(n)\, e^{-j\omega_k n}; \quad \omega_k = \frac{2\pi k}{N}, \quad \{n, k\} \in \mathbb{Z} \tag{3}$$

where $n$ denotes the sample index, $N$ denotes the number of samples, and $\varpi_k$ denotes an interval statistic for the normalized frequency, $\varpi_k = \{\omega_k, \omega_k + (2\pi/N)\}$. $\varpi$ is referred to as the “Fast Fourier Transform (FFT) bin”, or “frequency bin”.
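The discrete sum of Eq. (3) can be evaluated directly, as in the sketch below. It is written in plain Python for transparency and costs O(N²) operations; in practice an FFT routine would be used instead. The helper `bin_frequency_hz`, which maps a bin index to a physical frequency for an assumed sampling rate `fs`, is an illustrative addition.

```python
import cmath
import math

def dft(f):
    """Direct evaluation of Eq. (3): one complex value per frequency bin k."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def bin_frequency_hz(k, N, fs):
    """Lower edge of frequency bin k in Hz, assuming a sampling rate fs."""
    return k * fs / N

# A cosine with exactly three cycles in 16 samples lands in bins 3 and 13
N = 16
samples = [math.cos(2 * math.pi * 3 * n / N) for n in range(N)]
spectrum = dft(samples)
```

For this input the magnitude is N/2 = 8 in bins 3 and 13 and zero (up to rounding) elsewhere, which illustrates how a sinusoid that completes an integer number of cycles occupies a single pair of bins.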

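The inverse Fourier transform performs the reverse operation:

$$\mathcal{F}^{-1}\{f(\omega)\} = f(t) = \frac{1}{2\pi} \int_{\mathbb{R}} f(\omega)\, e^{j\omega t}\, d\omega, \tag{4}$$

$$\mathcal{F}^{-1}\{f(\omega)\} \approx f(n) = \frac{1}{N} \sum_{k=0}^{N-1} f(\varpi_k)\, e^{j\omega_k n}; \quad \omega_k = \frac{2\pi k}{N}, \quad \{n, k\} \in \mathbb{Z} \tag{5}$$

whose parameters are similarly defined.

The energy spectral density of $\omega$ is defined as follows:

$$|f(\omega)|^2 = f(\omega)\,\overline{f(\omega)} = T_s^2\, f(\varpi)\,\overline{f(\varpi)}, \tag{6}$$

where $\overline{f(\omega)}$ denotes the complex conjugate of $f(\omega)$.

The power spectral density of $\omega$ is defined as follows:

$$S_{xx}(\omega) = \lim_{T \to \infty} E\left[ \left| \frac{1}{\sqrt{T}} \int_0^T f(t)\, e^{-j\omega t}\, dt \right|^2 \right] = \lim_{T \to \infty} \frac{1}{T} \int_0^T \!\! \int_0^T E\left[ \overline{f(t)}\, f(t') \right] e^{j\omega(t - t')}\, dt\, dt', \tag{7}$$

$$S_{xx}(\omega) \approx \frac{T_s^2}{N T_s}\, f(\varpi)\,\overline{f(\varpi)}, \tag{8}$$

which is, in some sense, a probability density function.

Two properties of the Fourier transform are the convolution property

$$\mathcal{F}\{(f * g)(t)\} = f(\omega)\, g(\omega), \tag{9}$$

and the Nyquist sampling theorem

$$\mathcal{F}\left\{ x(t) \sum_{k=-\infty}^{\infty} \delta(t - kT_s) \right\} = \frac{1}{T_s} \sum_{k=-\infty}^{\infty} x(\omega - k\omega_s). \tag{10}$$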

Using the convolution theorem and the Nyquist sampling theorem, the Fourier transform of Eq. (1) can therefore be derived as follows:

$$\begin{aligned}
\mathcal{F}\{f(t)\} &= f(\omega) \\
&= \mathcal{F}\left\{ h(t) * \left[ x(t) \left( 1 + \sum_{n=1}^{N} \sum_{k=-\infty}^{\infty} y_n(t)\, \delta(t - kT_n) \right) + \eta(t) \right] \right\}, \\
&= \mathcal{F}\{h(t) * x(t)\} + \mathcal{F}\left\{ h(t) * x(t) \sum_{n=1}^{N} \sum_{k=-\infty}^{\infty} y_n(t)\, \delta(t - kT_n) \right\} + \mathcal{F}\{h(t) * \eta(t)\}, \\
&= \overbrace{h(\omega)\, x(\omega)}^{\text{periodic components}} + \underbrace{h(\omega)\, x(\omega) * \left( \sum_{n=1}^{N} \sum_{k=-\infty}^{\infty} \frac{1}{T_n}\, y_n(\omega - k\omega_n) \right)}_{\text{amplitude modulated impulsive components; } k\omega_n = \text{ defect harmonics}} + \overbrace{h(\omega)\, \eta(\omega)}^{\text{stochastic noise}}.
\end{aligned} \tag{11}$$

Deriving the frequency response of the bearing system reveals that the rolling bearing vibration signal consists of three constituents:

1. Deterministic periodic component:

$$h(\omega)\, x(\omega)$$

which contains information about the shaft rotation and eccentricity. This component needs to be removed.

2. Amplitude modulated impulsive impacts:

$$h(\omega) \left[ x(\omega) * \left( \sum_{n=1}^{N} \sum_{k=-\infty}^{\infty} \frac{1}{T_n}\, y_n(\omega - k\omega_n) \right) \right]$$

which contains the information about the bearing defect frequencies and their harmonics, $\sum_{n=1}^{N} \sum_{k=-\infty}^{\infty} (1/T_n)\, y_n(\omega - k\omega_n)$. The bearing signals are doubly modulated by both the impulse response of the bearing system $h(\omega)$ and the periodic component $x(\omega)$. This component needs to be demodulated, denoised and interpreted.

3. Stochastic noise, which contains no information. This component manifests as colored noise in the frequency domain.

3. Method

This section explains the proposed approach to fault detection in a rolling bearing system. The section is subdivided into three subsections:

1. Section 3.1 outlines the essential signal processing guidelines as suggested by Randall and Antoni (2011).

2. Section 3.2 discusses the proposed feature extraction method.

3. Finally, Section 3.3 summarizes the defect classification system.

3.1. General signal processing guidelines

3.1.1. Periodic components removal using signal pre-whitening

Pre-whitening attempts to remove the periodic components, that is the $h(t) * x(t)$ component in Eq. (1), prior to further processing. It has been argued that pre-whitening helps to enhance impulsive components such as the impacts due to defects (Randall and Antoni, 2011). Randall and Antoni (2011) recommend using an autoregressive (AR) model as follows:

$$f(n) = \sum_{p=1}^{P} a_p\, f(n - p) + r(n), \tag{12}$$

where $P$ denotes the order of the AR model, $r(n)$ denotes the residual at the $n$th prediction and $a_p$ denotes the model parameters, which can be obtained by solving the Yule–Walker equations through the Levinson–Durbin recursion algorithm or Burg's method. $P$ can be decided, for example, by minimizing the Akaike Information Criterion (AIC), given as follows:

$$\mathrm{AIC}(P) = N \log\left\{ \frac{1}{N} \sum_{n=1}^{N} r^2(n) \right\} + 2P + N + 2, \tag{13}$$

where $N$ denotes the number of samples, i.e. the length of the observation used to optimize the model.
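The pre-whitening step of Eqs. (12) and (13) can be sketched as follows. This is a minimal illustration, assuming the biased sample autocorrelation as input to the Levinson–Durbin recursion and a hypothetical search range of 1 to 10 for the order P; the test signal (a sinusoid in weak noise) and the names `prewhiten`, `levinson_durbin`, `ar_residual` and `aic` are chosen for the example only.

```python
import math
import random

def autocorr(f, max_lag):
    """Biased sample autocorrelation r[0..max_lag]."""
    N = len(f)
    return [sum(f[n] * f[n - k] for n in range(k, N)) / N
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for a_1..a_P of Eq. (12)."""
    a, err = [], r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[j] * r[m - 1 - j] for j in range(m - 1))
        k = acc / err                              # reflection coefficient
        a = [a[j] - k * a[m - 2 - j] for j in range(m - 1)] + [k]
        err *= 1.0 - k * k                         # prediction error power
    return a

def ar_residual(f, a, start):
    """r(n) = f(n) - sum_p a_p f(n - p), evaluated for n >= start."""
    return [f[n] - sum(a[p] * f[n - 1 - p] for p in range(len(a)))
            for n in range(start, len(f))]

def aic(residual, order):
    """Eq. (13) evaluated on a residual sequence."""
    N = len(residual)
    return N * math.log(sum(e * e for e in residual) / N) + 2 * order + N + 2

def prewhiten(f, max_order=10):
    """Pick the AR order with the smallest AIC and return the residual."""
    r = autocorr(f, max_order)
    best = None
    for P in range(1, max_order + 1):
        a = levinson_durbin(r, P)
        res = ar_residual(f, a, max_order)   # same N for every candidate order
        score = aic(res, P)
        if best is None or score < best[0]:
            best = (score, P, a, res)
    return best[1], best[2], best[3]

rng = random.Random(0)
f = [math.sin(2 * math.pi * 0.05 * n) + rng.gauss(0.0, 0.05) for n in range(400)]
P, a, whitened = prewhiten(f)
```

Because a sinusoid is almost perfectly predictable from its past samples, the residual retains mostly the unpredictable part of the signal. In the bearing setting, that unpredictable part is where the impulsive defect content is expected to remain.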

The whitened signal is obtained from the residual $r(n)$, which is defined as the difference between the observation $f(n)$ and the filtered signal from the convolution between the AR filter $a = \{a_1, \ldots, a_P\}$ and $f(n)$ as follows:

$$r(n) = f(n) - \overbrace{\sum_{p=1}^{P} a_p\, f(n - p)}^{(a * f)(n)}$$

whose Fourier transform is

$$\mathcal{F}\{r(n)\} = r(\omega) = \mathcal{F}\{f(n) - (a * f)(n)\} = f(\omega) - a(\omega)\, f(\omega). \tag{14}$$

Since the AR model will approximate the periodic component of the rolling bearing model (Eq. (11)), we can assume that $a(\omega) f(\omega) \approx h(\omega) x(\omega)$. Substituting Eq. (11) we have

$$\begin{aligned}
f_w(\omega) = r(\omega) &\approx f(\omega) - h(\omega)\, x(\omega), \\
&\approx h(\omega)\, x(\omega) * \left( \sum_{n=1}^{N} \sum_{k=-\infty}^{\infty} \frac{1}{T_n}\, y_n(\omega - k\omega_n) \right) + h(\omega)\, \eta(\omega),
\end{aligned} \tag{15}$$

which shows the elimination of the periodic component using the AR filter. The next step is band-pass filtering this signal and demodulating it at the resonance band $h(\omega)$, which can be estimated using spectral kurtosis. When the estimated resonance band $\hat{h}(\omega)$, whose impulse response is approximately $h(\omega)$, is obtained, $f_w(\omega)$ can be demodulated as follows:

$$f_{wd}(\omega) \approx \hat{h}(\omega) * f_w(\omega) \tag{16}$$

where $f_{wd}(\omega)$ denotes the spectrum of the whitened and demodulated signal. The envelope spectrum is then extracted from $f_{wd}(\omega)$ after denoising the signal as follows:

$$\mathrm{env}\{f(\omega)\} = \mathrm{env}\{\mathrm{denoise}(f_{wd}(\omega))\} \tag{17}$$

3.1.2. Resonance band estimation using spectral kurtosis

Randall and Antoni (2011) show that the optimum filter that maximizes the Signal to Noise Ratio (SNR) is a narrow band filter at the maximum value of Spectral Kurtosis (SK). This band is referred to in this paper as the resonance band.

SK measures the energy-normalised fourth-order spectral cumulant, i.e. a measure of the peakedness of the probability density function of the process at a frequency interval $\varpi \in \{\omega - \Delta, \omega + \Delta\}$ (Valeriu Vrabie and Pierre Granjon, 2003). SK can be calculated as follows:

$$K(\varpi) = \frac{E\{|X(\varpi)|^4\}}{\left( E\{|X(\varpi)|^2\} \right)^2} - 2 \tag{18}$$

where $E\{\cdot\}$ denotes the expectation operator. In practice, obtaining the theoretical SK may be extremely difficult due to the limited sampling frequency and random noise content. It is therefore necessary to compute an unbiased estimate of SK using k-statistics (Valeriu Vrabie and Pierre Granjon, 2003) as follows:

$$\hat{K}(\varpi) = \frac{M}{M - 1} \left[ \frac{(M + 1) \sum_{m=1}^{M} |X_m(\varpi)|^4}{\left( \sum_{m=1}^{M} |X_m(\varpi)|^2 \right)^2} - 2 \right] \tag{19}$$

where $M$ denotes the number of blocks of the time–frequency spectrogram, such as those obtained using an N-point DFT.
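The estimator of Eq. (19) is simple enough to check by hand on two limiting cases. The sketch below is an illustration under stated assumptions: the helper `block_dft_bin`, which splits a signal into non-overlapping blocks and evaluates a single DFT bin per block, and the two toy inputs are hypothetical choices made for the example.

```python
import cmath
import math

def block_dft_bin(signal, block_len, k):
    """X_m(varpi) for bin k of each non-overlapping block of length block_len."""
    starts = range(0, len(signal) - block_len + 1, block_len)
    return [sum(signal[s + n] * cmath.exp(-2j * math.pi * k * n / block_len)
                for n in range(block_len))
            for s in starts]

def spectral_kurtosis_estimate(X_blocks):
    """Eq. (19) for one frequency bin, given X_m(varpi) for m = 1..M."""
    M = len(X_blocks)
    s2 = sum(abs(X) ** 2 for X in X_blocks)
    s4 = sum(abs(X) ** 4 for X in X_blocks)
    return M / (M - 1) * ((M + 1) * s4 / s2 ** 2 - 2.0)

# Stationary sinusoid: every block has the same magnitude in bin 4
steady = [math.cos(2 * math.pi * 4 * n / 32) for n in range(256)]
k_steady = spectral_kurtosis_estimate(block_dft_bin(steady, 32, 4))

# Fully impulsive case: energy present in only one of M = 8 blocks
k_impulsive = spectral_kurtosis_estimate([1.0, 0, 0, 0, 0, 0, 0, 0])
```

With constant block magnitudes the estimate evaluates to -1, and with all energy concentrated in one of M blocks it evaluates to M (here 8). Bins dominated by repetitive transients therefore score high, which is the property exploited when selecting the resonance band.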

3.2. Proposed feature extraction method

This paper uses the continuous wavelet transform to extract the resonance frequency, and cepstrum liftering to separate the defect vibration signatures from the noise.

• The wavelet transform has been argued to be capable of extracting weak signals for which the FFT becomes ineffective, so that a defect can be detected even at the pre-spalling stage (Kulkarni and Sahasrabudhe, 2013). The wavelet transform provides a variable-resolution time–frequency distribution, which makes it superior to the FFT in this regard. Using the wavelet transform, the resonance band can be approximated (Sawalhi and Randall, 2005). The raw vibration signal is then demodulated using the selected wavelet coefficients. The envelope characterizing the fault is obtained by multiplying the wavelet coefficients by their complex conjugate and taking the square root.

• Cepstrum liftering (Randall and Hee, 1981) is a versatile tool for detecting periodicity in a power spectrum, i.e. extracting uniformly spaced harmonics. The log spectrum of a local fault is periodic in the frequency domain, which is the property exploited in cepstrum analyses. Cepstrum liftering is applied to the envelope cepstrum to separate the local faults from other noise.

• Swarm Rapid Centroid Estimation (SRCE) (Yuwono et al., 2013b, 2014) and Hidden Markov Model (HMM) (Zoubin, 2001) will be applied to construct the automatic fault diagnosis system. SRCE will be used for clustering the signal into a bag of words based on the BPFI, BPFO, FTF, and BSF harmonic content. The word sequence will then be used to train an HMM classifier which outputs the likelihood of each possible defect.

3.2.1. Continuous wavelet transform

The continuous wavelet transform (CWT) $W_\psi$ of a continuous function $f(t)$ is the cross-correlation of the signal against the dilated and translated wavelet as follows:

$$W_\psi f(a, b) = \frac{1}{c_\psi}\, |a|^{-1/2} \int_{\mathbb{R}} \overline{\psi\left( \frac{t - b}{a} \right)}\, f(t)\, dt; \quad a \in [0, \infty),\ b \in (-\infty, \infty) \tag{20}$$

where $\overline{\psi}$ denotes the complex conjugate of the wavelet function $\psi$, $a$ denotes the dilation (scaling) factor, and $b$ denotes the translation factor. As seen in Eq. (20), $W_\psi f(a, b)$ convolves $f(t)$ with $\psi_{(a,b)}(t)$, and can thus also be interpreted as a signal filtering process.

In this paper we use the Morlet wavelet, a complex-valued wavelet proposed by Grossmann and Morlet in 1980 (Goupillaud et al., 1984), as follows:

$$\psi(t) = \frac{1}{\sqrt[4]{\sigma^2 \pi}}\, e^{-t^2 / (2\sigma^2)}\, e^{j\omega_0 t}, \tag{21}$$

whose Fourier transform is

$$\mathcal{F}\{\psi(t)\} = \psi(\omega) = \frac{1}{\sqrt[4]{\sigma^2 \pi}}\, H(\omega)\, e^{-\frac{1}{2\sigma^2} (\omega - \omega_0)^2} \tag{22}$$

where $\sigma^2$ denotes the variance of the Gaussian envelope, while $H(\omega)$ denotes the Heaviside step function, that is $H(\omega) = 1$ if $\omega > 0$; $H(\omega) = 0$ otherwise. The shape of the Morlet wavelet can be seen in Fig. 4.

The wavelet energy $|W_\psi f(a, b)|^2$ can be calculated as follows:

$$|W_\psi f(a, b)|^2 = W_\psi f(a, b) \cdot \overline{W_\psi f(a, b)} \tag{23}$$

where $W_\psi f(a, b)$ is calculated as in Eq. (20). Finally, the wavelet envelope is obtained by taking the square root of the energy:

$$\mathrm{env}\{W_\psi f(a, b)\} = |W_\psi f(a, b)| = \sqrt{W_\psi f(a, b) \cdot \overline{W_\psi f(a, b)}}. \tag{24}$$

As faults in rolling element bearings are mainly characterized by the frequency characteristics of the envelope, the analytic property of the Morlet wavelet is particularly useful in the sense that the envelope can be obtained without the need for a Hilbert transform (Feldman, 2011). To elaborate, an analytic signal has the form $x_r + j\,\mathcal{H}\{x_r\}$, where $\mathcal{H}\{x_r\}$ is the Hilbert transform of the real sequence $x_r$, i.e. $x_r$ with a 90° phase shift. Hence, the convolution between a signal and the Morlet wavelet will both filter the signal and convert the result into an analytic signal.
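The envelope extraction described above can be sketched directly from Eqs. (20), (21) and (24). This is a minimal illustration, not an optimized implementation: the admissibility constant $c_\psi$ is set to 1, the Gaussian is truncated at a hypothetical `support` of four standard deviations, and the defaults $\sigma = 1$, $\omega_0 = 5$ as well as the test burst are assumptions made for the example.

```python
import cmath
import math

def morlet(t, sigma=1.0, omega0=5.0):
    """Morlet wavelet of Eq. (21)."""
    norm = (sigma ** 2 * math.pi) ** -0.25
    return norm * math.exp(-t * t / (2.0 * sigma ** 2)) * cmath.exp(1j * omega0 * t)

def cwt_envelope(f, dt, a, sigma=1.0, omega0=5.0, support=4.0):
    """|W_psi f(a, b)| of Eqs. (20) and (24) at one dilation a, for every shift b.

    c_psi is taken as 1 (assumption); b runs over the sample instants.
    """
    half = int(support * sigma * a / dt)       # kernel half-width in samples
    kernel = [morlet(m * dt / a, sigma, omega0).conjugate()
              for m in range(-half, half + 1)]
    scale = dt / math.sqrt(abs(a))
    envelope = []
    for b in range(len(f)):
        acc = 0j
        for m in range(-half, half + 1):
            n = b + m
            if 0 <= n < len(f):
                acc += kernel[m + half] * f[n]
        envelope.append(abs(acc * scale))
    return envelope

# Test burst: 100 Hz carrier under a Gaussian window centred at sample 200
dt, f_c = 1e-3, 100.0
burst = [math.exp(-((n - 200) ** 2) / (2 * 10.0 ** 2))
         * math.cos(2 * math.pi * f_c * (n - 200) * dt) for n in range(400)]
a = 5.0 / (2 * math.pi * f_c)                  # dilation whose centre frequency is f_c
env = cwt_envelope(burst, dt, a)
peak = max(range(len(env)), key=env.__getitem__)
```

The magnitude of the complex coefficients traces the Gaussian window of the burst and peaks at its centre, without any explicit Hilbert transform. In the diagnosis pipeline the dilation a would come from the resonance band estimate rather than being set by hand.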

3.2.2. Wavelet kurtosis

Since each wavelet decomposition involves a convolution between the acceleration signal and a wavelet with the frequency response $\psi_a(\omega)$ (refer to Eq. (22)), the wavelet energy at dilation factor $a$ and translation factor $b$, $|W_\psi f(a, b)|^2$, would equal the overall energy of the pseudo-frequency band localized at translation factor $b$. Under this assumption, we propose that SK can be approximated at the pseudo-frequency band imposed by the dilated mother wavelet, $\psi_a = \psi\left((t - b)/a\right)$, whose center frequency is $\omega_a$. This kurtosis value has been referred to by Sawalhi and Randall (2005) as “wavelet kurtosis”, $K_\psi(\varpi_a)$. $\varpi_a$ denotes the spectral interval at the $a$th dilation factor, determined by the standard deviation of the Gaussian envelope of the mother wavelet, $\sigma$. The wavelet kurtosis can be calculated as follows:

$$K_\psi(\varpi_a) = \frac{E\{|W_\psi f(a, b)|^4\}}{\left( E\{|W_\psi f(a, b)|^2\} \right)^2} - 2 \quad \text{where } b \in B \in [0, \infty), \tag{25}$$

which is estimated using the sample wavelet kurtosis (WK) as follows:

$$\hat{K}(\varpi) = \frac{B}{B - 1} \left[ \frac{(B + 1) \sum_{b=1}^{B} |W_\psi f(a, b)|^4}{\left( \sum_{b=1}^{B} |W_\psi f(a, b)|^2 \right)^2} - 2 \right], \tag{26}$$

where $B$ denotes the maximum translation factor applicable to $N$ discrete samples of the continuous signal.

3.2.3. Wavelet kurtogram

The wavelet kurtogram (Randall and Antoni, 2011; Sawalhi and Randall, 2005) is a matrix of wavelet kurtosis values resulting from decompositions using various dilation factors $a_{1,2,\ldots,N}$ and filter widths $\sigma_{1,2,\ldots,M}$. The appropriate $a$ and $\sigma$ are found from the wavelet kurtogram as follows:

1. Find from the wavelet kurtogram the optimum dilation factor $a^\dagger$ whose decomposition maximizes the median wavelet kurtosis.

2. Given $a^\dagger$, find the optimum filter bandwidth $\sigma^\dagger$ that maximizes the wavelet kurtosis.

Selection of the resonance band and demodulation using the wavelet kurtogram is shown in Fig. 5.
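The two-step selection above can be sketched on a precomputed kurtogram. One point is an interpretation rather than something stated explicitly: the median in step 1 is assumed here to be taken over the filter widths σ for each dilation factor. The function name `select_resonance_band` and the small example matrix are hypothetical.

```python
from statistics import median

def select_resonance_band(kurtogram, dilations, sigmas):
    """Two-step search on a wavelet kurtogram.

    kurtogram[i][j] holds the wavelet kurtosis for dilations[i] and sigmas[j].
    """
    # Step 1: dilation factor whose row has the largest median kurtosis
    i_best = max(range(len(dilations)), key=lambda i: median(kurtogram[i]))
    # Step 2: filter width with the largest kurtosis at that dilation
    j_best = max(range(len(sigmas)), key=lambda j: kurtogram[i_best][j])
    return dilations[i_best], sigmas[j_best]

# Illustrative values only
dilations = [2.0, 4.0, 8.0]
sigmas = [0.5, 1.0, 2.0]
kurtogram = [
    [1.0, 9.0, 1.2],   # one isolated high value, low median
    [3.0, 4.0, 5.0],   # consistently high row
    [0.5, 0.7, 0.6],
]
a_opt, sigma_opt = select_resonance_band(kurtogram, dilations, sigmas)
```

In this example the first row contains the single largest entry but has a low median, so the second row is selected. Using the median in step 1 favours dilation factors that are impulsive across filter widths over an isolated spike in the matrix.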

3.2.4. Cepstrum analysis and cepstral editing

The cepstrum $\mathcal{C}(\tau)$ (Randall and Hee, 1981) is calculated as the inverse Fourier transform of the logarithm of the spectrum as follows:

$$\mathcal{C}\{f(t)\} = \hat{f}(\tau) = \mathcal{F}^{-1}\{\log \mathcal{F}\{f(t)\}\} \tag{27}$$

and the spectrum is recovered from the cepstrum as follows:

$$\mathcal{C}^{-1}\{\hat{f}(\tau)\} = f(\omega) = \exp\left(\mathcal{F}\{\mathcal{C}\{f(t)\}\}\right) \tag{28}$$

$$\mathcal{C}^{-1}\{\hat{f}(\tau)\} = f(\omega) = \exp\left(\mathcal{F}\{\mathcal{F}^{-1}\{\log \mathcal{F}\{f(t)\}\}\}\right) \tag{29}$$

where $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the forward and the inverse Fourier transform, respectively. Uniformly spaced harmonics in the frequency domain manifest as a single peak, a 'rahmonic echo', in the quefrency domain (Randall and Hee, 1981). An important property of the cepstrum is that convolution in the time domain becomes addition in the quefrency domain, that is

$$\mathcal{C}\{(h * f)(t)\} = \mathcal{F}^{-1}\{\log \mathcal{F}\{(h * f)(t)\}\} \tag{30}$$

$$\mathcal{C}\{(h * f)(t)\} = \mathcal{F}^{-1}\{\log\left(h(\omega) f(\omega)\right)\} = \mathcal{F}^{-1}\{\log h(\omega)\} + \mathcal{F}^{-1}\{\log f(\omega)\} \tag{31}$$

which means that any convolution operation, for instance the effects of the transfer path, can be reversed by removing the appropriate rahmonics in the quefrency domain. Removal of non-essential rahmonics, such as those from the shaft rotation, can also help the discovery of smaller defects, which are often harder to detect. This process of retaining/removing rahmonics in the quefrency domain prior to inverting the cepstral transform is also known as 'liftering', and is done using the following equation:

$$\mathrm{Lifter}\{\hat{f}(\tau), H(\tau)\} = \mathcal{C}^{-1}\{\hat{f}(\tau)\,|H(\tau)|\} \tag{32}$$

where $H(\tau)$ is a cepstrum editor/lifter function in the form of a uniformly spaced comb lifter that either removes ($H(\tau) = 0$) or retains ($H(\tau) = 1$) rahmonics in the quefrency domain.
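As a rough illustration of Eqs. (27) to (32), the sketch below computes a cepstrum, inverts it, and applies a lifter mask. It is not the authors' code: the function names are hypothetical, a naive $O(n^2)$ DFT from the standard library stands in for an FFT routine, and the principal branch of the complex logarithm is assumed (no phase unwrapping), which is enough for the round trip shown here.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (forward or inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for m in range(n):
        acc = sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for k in range(n))
        out.append(acc / n if inverse else acc)
    return out

def cepstrum(signal):
    """Eq. (27): inverse transform of the logarithm of the spectrum."""
    return dft([cmath.log(v) for v in dft(signal)], inverse=True)

def inverse_cepstrum(ceps):
    """Eqs. (28)-(29): recover the spectrum f(omega) from the cepstrum."""
    return [cmath.exp(v) for v in dft(ceps)]

def lifter(ceps, mask):
    """Eq. (32): weight each quefrency bin by |H(tau)|, then invert."""
    return inverse_cepstrum([c * abs(h) for c, h in zip(ceps, mask)])

# Toy signal whose spectrum has no zeros, so the logarithm is defined.
x = [1.0, 0.5, 0.25, 0.125, 0.0, 0.0, 0.0, 0.0]
spectrum = dft(x)
ceps = cepstrum(x)
recovered = inverse_cepstrum(ceps)     # matches `spectrum` up to round-off
flat = lifter(ceps, [0] * len(x))      # every rahmonic removed
```

Retaining every quefrency bin returns the original spectrum, while removing every bin leaves a flat unit spectrum, since $\exp(0) = 1$. A practical lifter sits between these two extremes.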

3.2.5. Feature vector creation

The features used in this paper are extracted using the following steps:

1. Extract the fundamental shaft frequency $f_r$ from the quefrency cepstrum of the raw signal. The shaft rotational frequency is easily discerned as the largest quefrency peak nearest to the reported shaft rotational frequency:

$$\frac{1}{f_r} = \tau_{f_r} = \arg\max_{\tau} \left| \mathcal{C}\{f(t)\}\,\mathrm{rect}\!\left(\frac{\tau - 1/f_r}{\alpha}\right) \right|, \tag{33}$$

where $\alpha$ denotes the tolerance parameter and the $f_r$ inside the rectangular window is the reported shaft frequency.
2. Whiten the signal using an AR filter.
3. Apply the continuous wavelet transform.
4. Estimate the resonance band using the wavelet kurtogram.
5. Demodulate the signal and extract the envelope spectrum.
6. Lifter the envelope cepstrum

$$\mathrm{Lifter}\{\hat{f}(\tau), H_1(\tau), H_2(\tau)\} = \mathcal{C}^{-1}\{\hat{f}(\tau)\,|H_1(\tau)|\,|H_2(\tau)|\} \tag{34}$$

where the two lifter functions, $H_1(\tau)$ and $H_2(\tau)$, are defined as follows:

$$H_1(\tau) = \min\left\{ \sum_{n=1}^{4} \sum_{k=1}^{T/2} \mathrm{rect}\!\left(\frac{\tau - k/f_n}{\beta}\right),\; 1 \right\}, \tag{35}$$

$$H_2(\tau) = 1 - \sum_{k=1}^{T/2} \mathrm{rect}\!\left(\frac{\tau - k/f_r}{\beta}\right), \tag{36}$$

where $T$ denotes the FFT window width, $f_{1,2,3,4} = \{\mathrm{BPFO}, \mathrm{BPFI}, \mathrm{FTF}, 2 \times \mathrm{BSF}\}$ denotes the defect frequencies (Hz), $f_r$ denotes the shaft rotating frequency (Hz), and $\beta$ denotes the width of the rectangular pulses (quefrencies).
7. Extract the power spectral density features from the liftered envelope spectrum from both the Drive End (DE) and Fan End (FE) accelerometers using the method summarized in Table 2.

Fig. 4. Morlet wavelet ($a = 6.4$, $b = 0$, $\omega = 1$). The panels plot the real and imaginary parts of the wavelet against normalized time.
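The comb lifters of Eqs. (35) and (36) can be sketched as plain functions of quefrency. The sketch rests on several assumptions: `rect` is taken as the unit pulse on $|u| \le 1/2$, the hypothetical parameter `n_harm` replaces the upper summation limit $T/2$, and the frequencies in the usage lines are toy values rather than the bearing frequencies of the experiment.

```python
def rect(u):
    """Unit rectangular pulse: 1 inside |u| <= 1/2, else 0 (assumed convention)."""
    return 1.0 if abs(u) <= 0.5 else 0.0

def h1(tau, defect_freqs, beta, n_harm):
    """Eq. (35): retain pulses of width beta at rahmonics k / f_n, clipped to 1."""
    total = sum(rect((tau - k / f) / beta)
                for f in defect_freqs
                for k in range(1, n_harm + 1))
    return min(total, 1.0)

def h2(tau, shaft_freq, beta, n_harm):
    """Eq. (36): remove pulses of width beta at the shaft rahmonics k / f_r."""
    return 1.0 - sum(rect((tau - k / shaft_freq) / beta)
                     for k in range(1, n_harm + 1))

# Toy values: one defect frequency at 100 Hz, shaft at 25 Hz, 1 ms pulses.
keep_on_defect = h1(0.010, [100.0], beta=0.001, n_harm=5)   # on a defect rahmonic
keep_between = h1(0.015, [100.0], beta=0.001, n_harm=5)     # between rahmonics
drop_on_shaft = h2(0.040, 25.0, beta=0.001, n_harm=5)       # on a shaft rahmonic
```

The product $|H_1(\tau)|\,|H_2(\tau)|$ in Eq. (34) then keeps only the quefrency bins that lie on a defect rahmonic and off a shaft rahmonic.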

3.3. Proposed framework for automatic fault diagnosis

3.3.1. Optimizing Markov model's hidden states using Swarm Rapid Centroid Estimation

We utilize Yuwono's Swarm Rapid Centroid Estimation (SRCE) (Yuwono et al., 2013b, 2014) for estimating the hidden state variables for the Hidden Markov Model. SRCE incorporates the paradigms of Particle Swarm Optimization (PSO, van der Merwe and Engelbrecht, 2003) to enhance the traditional Expectation Maximization (EM) algorithm. The following details are based on the basic SRCE notations defined in Table 3. Note that this paper follows the Ensemble RCE (ERCE) construct (Yuwono et al., 2014), which differs from the original RCE (Yuwono et al., 2012a,b, 2013b) in that the social and cognitive terms are dropped to improve scalability.

A particle in an SRCE subswarm stores a tuple consisting of a position vector $x$ and a velocity vector $v$, that is $p_{k,m} = \{x_{k,m}, v_{k,m}\}$ (Table 3, no. 3). The position vector of each particle represents the coordinate of a centroid vector $x_k \in \mathbb{R}^{dim}$.

A subswarm is a collection of centroid coordinates, encoding a possible solution to the clustering problem. As the RCE swarm consists of $M$ such subswarms, as many as $M$ clustering solutions can be obtained at the end of optimization.

Each subswarm stores two memory matrices:

1. The self-organizing memory $Y_m$ (Table 3, no. 10), which is an array of randomly sampled pointers to the data $Y$.

2. The best position memory $X^{best}_m$ (Table 3, no. 12), which stores the position vectors $X_m = \{x_1, \ldots, x_{K_m}\}$ that minimize an objective function which is usually defined as, but not restricted to, the average distortion

$$\text{minimize} \sum_{x_k \in X_m} \frac{\sum_{y_i \in Y_m} u_{ik,m}\, d(x_k, y_i)}{\sum_{y_i \in Y_m} u_{ik,m}} \quad \text{s.t.} \quad \sum_{i} u_{ik,m} = 1,$$

$$u_{ik,m} = \frac{d(y_i, x_{k,m})^{-1/(\lambda - 1)}}{\sum_{j=1}^{K} d(y_i, x_{j,m})^{-1/(\lambda - 1)}}, \quad \lambda > 1, \quad u \in [0, 1]. \tag{38}$$

The RCE swarm $X^{best}$ matrix is then stored as the union of all $X^{best}_m$ (Table 3, no. 11).
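The fuzzy memberships and the average distortion of Eq. (38) can be sketched as follows. This is an illustration only: the function names, the Euclidean default distance (`math.dist`), and the handling of a data point that coincides with a centroid are assumptions, not details taken from the SRCE papers.

```python
import math

def memberships(points, centroids, lam=1.2, dist=math.dist):
    """u[i][k] = d(y_i, x_k)^(-1/(lam-1)) / sum_j d(y_i, x_j)^(-1/(lam-1))."""
    p = -1.0 / (lam - 1.0)
    u = []
    for y in points:
        d = [dist(y, x) for x in centroids]
        if 0.0 in d:
            # Assumed convention: a point sitting on a centroid gets crisp membership.
            hits = [1.0 if v == 0.0 else 0.0 for v in d]
            u.append([h / sum(hits) for h in hits])
        else:
            w = [v ** p for v in d]
            u.append([v / sum(w) for v in w])
    return u

def average_distortion(points, centroids, lam=1.2, dist=math.dist):
    """Sum over centroids of the membership-weighted mean distance, Eq. (38)."""
    u = memberships(points, centroids, lam, dist)
    total = 0.0
    for k, x in enumerate(centroids):
        num = sum(u[i][k] * dist(x, y) for i, y in enumerate(points))
        den = sum(u[i][k] for i in range(len(points)))
        total += num / den if den > 0 else 0.0
    return total

pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = average_distortion(pts, [(0.0, 0.5), (10.0, 0.5)])   # centroids on the groups
bad = average_distortion(pts, [(5.0, 0.0), (5.0, 1.0)])     # centroids between them
```

With $\lambda = 1.2$ the exponent is $-5$, so memberships are close to crisp. Centroids placed on the two groups give a much lower distortion than centroids placed between them.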

Fig. 5. Maximization of wavelet kurtosis via the wavelet kurtogram method (top left; dilation factor $a$ versus standard deviation $\sigma$ of the Morlet wavelet, marked WK = 38.24) and its resulting demodulated wavelet coefficients at $\sigma^{\dagger} = 1$ and $a^{\dagger} = 4.23$ (top right; dilation factor $a$ versus time), which show accentuated Ball Pass Frequency Outer Race (BPFO = 87.93 Hz, or every 11.37 ms) impulses due to a defect of 21 mils diameter on the outer raceway of the bearing at the Fan End of the testing apparatus. The shaft was rotating at 1730 RPM at maximum load (3 HP). The envelope power spectrum (lower left; power spectrum versus frequency) shows the harmonics of the theoretical BPFO frequency (88.4 Hz) and the shaft rotational frequency (29 Hz). The envelope cepstrum (power cepstrum versus quefrency) shows BPFO rahmonics at 11.31 ms and the shaft rahmonics to be removed in the liftering process.


On each iteration, the velocity and position of a particle are updated (Table 3, nos. 6 and 8) by adding the resultant vector $\Delta$ (Table 3, no. 9) to the velocity vector $v$ and updating the position vector $x$. $\Delta$ is defined as follows (Yuwono et al., 2014):

$$\Delta_{k,m}(t) = \varphi_1 \circ \overbrace{\left( \frac{\sum_{i=1}^{|Y_m|} u_{ik,m}\,\left(y_i - x_{k,m}(t)\right)}{\sum_{i=1}^{|Y_m|} u_{ik,m}} \right)}^{\text{self-organizing}} + \varphi_2 \circ \overbrace{\left( \frac{\sum_{j=1}^{|X^{best}|} q_{jk,m}\,\left(x^{best}_j(t) - x_{k,m}(t)\right)}{\sum_{j=1}^{|X^{best}|} q_{jk,m}} \right)}^{\text{best position}}$$

$$= \varphi_1 \circ \left( E\left[Y_m \mid x_{k,m}\right] - x_{k,m} \right) + \varphi_2 \circ \left( E\left[X^{best} \mid x_{k,m}\right] - x_{k,m} \right), \tag{39}$$

where $\varphi \in [0, 1]^{dim}$ is a uniform random vector; $u_{ik,m}$ is the cluster membership when $Y_m$ is mapped to $X_m$; and $q_{jk,m}$ is the cluster membership when $X^{best}$ is mapped to $X_m$. When the self-organizing term equates to 0, the particle will be directed to $x_{I_{win},m}$ instead (Table 3, no. 5).
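A minimal sketch of the resultant vector of Eq. (39) for a single particle is given below, followed by the velocity and position updates of Table 3 (nos. 8 and 6). The function names are hypothetical, the memberships are passed in rather than computed, and the redirection towards the winning particle when the self-organizing term vanishes is left out for brevity.

```python
import random

def weighted_pull(x, targets, weights):
    """Membership-weighted mean of (target - x); zero vector if all weights vanish."""
    total = sum(weights)
    if total == 0:
        return [0.0] * len(x)
    return [sum(w * (t[d] - x[d]) for t, w in zip(targets, weights)) / total
            for d in range(len(x))]

def resultant(x, memory, u, best, q, phi1=None, phi2=None):
    """Eq. (39) for one particle at position x.

    u[i] and q[j] are this particle's memberships for memory[i] and best[j].
    phi1 and phi2 default to fresh uniform random vectors.
    """
    dim = len(x)
    if phi1 is None:
        phi1 = [random.random() for _ in range(dim)]
    if phi2 is None:
        phi2 = [random.random() for _ in range(dim)]
    so = weighted_pull(x, memory, u)   # self-organizing term
    bp = weighted_pull(x, best, q)     # best-position term
    # Element-wise products with the random vectors, then the sum of both terms.
    return [p1 * s + p2 * b for p1, s, p2, b in zip(phi1, so, phi2, bp)]

x = [0.0, 0.0]
delta = resultant(x, memory=[[2.0, 0.0], [4.0, 0.0]], u=[1.0, 1.0],
                  best=[[0.0, 2.0]], q=[1.0],
                  phi1=[1.0, 1.0], phi2=[1.0, 1.0])
velocity = [0.0 + d for d in delta]                # v <- v + delta, starting from v = 0
position = [a + b for a, b in zip(x, velocity)]    # x <- x + v
```

The random vectors are fixed to ones in the usage lines only to make the result reproducible. With the default random draws, each component of the resultant is scaled by an independent factor in $[0, 1)$.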

The SRCE is equipped with two strategies to reduce the risk of suboptimal convergence, substitution and particle reset, which are summarized in Table 4.

The algorithm pseudocode is shown in Algorithm 1. An illustration of the search trajectory of the swarm on a toy example is shown in Fig. 6.

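Table 2
Feature vector description.

| No. | Feature | Extraction method | Description |
|---|---|---|---|
| 1 | $\mathrm{BPFO}_{DE}$ | $\sum_{k=1}^{3} \left(f^{env}_{DE} + f^{env}_{FE}\right) \mathrm{sinc}\!\left(\frac{\tau - k/\mathrm{BPFO}_{DE}}{\gamma}\right)$ | Drive End BPFO |
| 2 | $\mathrm{BPFI}_{DE}$ | $\sum_{k=1}^{3} \left(f^{env}_{DE} + f^{env}_{FE}\right) \mathrm{sinc}\!\left(\frac{\tau - k/\mathrm{BPFI}_{DE}}{\gamma}\right)$ | Drive End BPFI |
| 3 | $\mathrm{FTF}_{DE}$ | $\sum_{k=1}^{3} \left(f^{env}_{DE} + f^{env}_{FE}\right) \mathrm{sinc}\!\left(\frac{\tau - k/\mathrm{FTF}_{DE}}{\gamma}\right)$ | Drive End FTF |
| 4 | $\mathrm{BFF}_{DE}$ | $\sum_{k=1}^{3} \left(f^{env}_{DE} + f^{env}_{FE}\right) \mathrm{sinc}\!\left(\frac{\tau - k/(2\,\mathrm{BSF}_{DE})}{\gamma}\right)$ | Drive End Ball Fault Frequency (BFF) $= 2 \times \mathrm{BSF}$ |
| 5 | $\mathrm{BPFO}_{FE}$ | $\sum_{k=1}^{3} \left(f^{env}_{DE} + f^{env}_{FE}\right) \mathrm{sinc}\!\left(\frac{\tau - k/\mathrm{BPFO}_{FE}}{\gamma}\right)$ | Fan End BPFO |
| 6 | $\mathrm{BPFI}_{FE}$ | $\sum_{k=1}^{3} \left(f^{env}_{DE} + f^{env}_{FE}\right) \mathrm{sinc}\!\left(\frac{\tau - k/\mathrm{BPFI}_{FE}}{\gamma}\right)$ | Fan End BPFI |
| 7 | $\mathrm{FTF}_{FE}$ | $\sum_{k=1}^{3} \left(f^{env}_{DE} + f^{env}_{FE}\right) \mathrm{sinc}\!\left(\frac{\tau - k/\mathrm{FTF}_{FE}}{\gamma}\right)$ | Fan End FTF |
| 8 | $\mathrm{BFF}_{FE}$ | $\sum_{k=1}^{3} \left(f^{env}_{DE} + f^{env}_{FE}\right) \mathrm{sinc}\!\left(\frac{\tau - k/(2\,\mathrm{BSF}_{FE})}{\gamma}\right)$ | Fan End Ball Fault Frequency (BFF) $= 2 \times \mathrm{BSF}$ |

$\gamma$ denotes a parameter controlling the width of the sinc pulses, which in this paper is set equal to $\beta$ for simplicity.
The features are power spectral density features, extracted from both Drive End (DE) and Fan End (FE) liftered envelope cepstrum as follows:

$$f_{env} = \frac{T_s^2}{N T_s}\,\mathrm{env}\{f(\omega)\}\,\mathrm{env}\{f(\omega)\}^{*}, \tag{37}$$

where $N$ denotes the length of the DFT vector and $T_s = 1/f_s$ is the time between samples.
It is important to note that since the DE and FE bearings are of different types, the cepstrum liftering uses different parameters for DE and FE as follows:
• The DE envelope cepstrum uses $f_{1,2,3,4} = \{\mathrm{BPFO}_{DE}, \mathrm{BPFI}_{DE}, \mathrm{FTF}_{DE}, 2 \times \mathrm{BSF}_{DE}\}$ for the lifter functions $H_1(\tau)$ and $H_2(\tau)$.
• The FE envelope cepstrum uses $f_{1,2,3,4} = \{\mathrm{BPFO}_{FE}, \mathrm{BPFI}_{FE}, \mathrm{FTF}_{FE}, 2 \times \mathrm{BSF}_{FE}\}$ for the lifter functions $H_1(\tau)$ and $H_2(\tau)$.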

Table 3
SRCE basic notations and formulas.

| No. | Notation | Formula | Description |
|---|---|---|---|
| 1 | $\Theta$ | $\Theta = \{\Theta_1, \ldots, \Theta_M\}$ | A swarm: a collection of $M$ subswarms |
| 2 | $\Theta_m$ | $\Theta_m = \{p_{1,m}, \ldots, p_{K,m}, X^{best}_m\}$ | A subswarm: a collection of particles. Each subswarm stores the best position memory matrix. A subswarm encodes a possible solution to the clustering problem |
| 3 | $p_{k,m}$ | $p_{k,m} = \{x_{k,m}, v_{k,m}\}$ | A particle in an SRCE subswarm stores an $\{x, v\}$ tuple |
| 4 | $x_{k,m}$ | $x_{k,m} \in \mathbb{R}^d$ | The position vector of a particle $k$ in subswarm $m$, representing the coordinate of a centroid vector in $\mathbb{R}^d$ |
| 5 | $x_{I_{win},m}$ | | The winning particle: a particle in the $m$th subswarm whose cluster has the largest cardinality |
| 6 | $\hat{x}_{k,m}$ | $\hat{x}_{k,m} \leftarrow x_{k,m} + v_{k,m}$ | The position vector of $p_{k,m}$ in the subsequent iteration |
| 7 | $v_{k,m}$ | $v_{k,m} \in \mathbb{R}^d$ | The velocity vector of a particle in $\mathbb{R}^d$ |
| 8 | $\hat{v}_{k,m}$ | $\hat{v}_{k,m} \leftarrow v_{k,m} + \Delta_{k,m}$ | The velocity vector of $p_{k,m}$ in the subsequent iteration |
| 9 | $\Delta^{\dagger}$ | $\Delta^{\dagger} = \varphi_1 \circ \left(E[Y_m \mid x_{k,m}] - x_{k,m}\right) + \varphi_2 \circ \left(E[X^{best} \mid x_{k,m}] - x_{k,m}\right)$ | The resultant vector (a) |
| 10 | $Y_m$ | $Y_m = \mathrm{randsample}(Y, \eta\%)$ | The self-organizing memory: an array of randomly sampled pointers to the data $Y$. $\eta\% \in [0, 1]$ denotes the rate of random sampling |
| 11 | $X^{best}$ | $X^{best} = \bigcup_{m=1}^{M} X^{best}_m$ | The swarm best position matrix: the union of all $X^{best}_m$ |
| 12 | $X^{best}_m$ | $X^{best}_m = \{x_1, \ldots, x_{K_m}\}$ | The subswarm's best position memory, which stores the position vectors that minimize a given objective function throughout the search |
| 13 | $f(Y, X)$ | | The objective function (b) |
| 14 | $d(\cdot, \cdot)$ | | A distance function |

(a) The resultant vector consists mainly of the self-organizing term and the best position term, as noted in Eq. (39).
(b) A possible objective function is the average distortion, as noted in Eq. (38).


Algorithm 1. Swarm RCEr+.

Input: Data points $Y = \{y_1, \ldots, y_N\} \in \mathbb{R}^{dim}$, number of clusters $K$.
Output: Swarm centroid vectors $X^{best} = \{X^{best}_1, X^{best}_2, \ldots, X^{best}_M\} \in \mathbb{R}^{dim}$.

1: Initialize the swarm (randomize($X_{1,\ldots,M}$), $V_{1,\ldots,M} = 0$).
2: For each subswarm $m$, randomly sample $Y$ and store it in the memory $Y_m = \mathrm{randsample}(Y, \eta\%)$.
3: repeat
4:   for all $m \in \{1, \ldots, M\}$ do
5:     Calculate $U_m = \{u_{1,m}, \ldots, u_{|Y_m|,m}\}$ from the pairwise distance between $X_m$ and $Y_m$,
6:     Calculate $Q_m$ from the pairwise distance between $X_m$ and $X^{best}$,
7:     Store $X^{best}_m$ which minimizes $f(Y_m, X_m)$ throughout the search,
8:     $V_m \leftarrow V_m + \Delta_m$,
9:     $X_m \leftarrow X_m + V_m$,
10:    Redirect particles with zero cardinality towards the particle whose cluster has the largest cardinality.
11:    Apply substitution with rate of $\varepsilon$
12:    if $f(Y_m, X^{best}_m)$ does not improve after $\zeta_{reset}$ iterations then
13:      Reinitialize subswarm (randomize($X_m$), $V_m = 0$)
14:    end if
15:  end for
16: until Convergence or maximum iteration reached
17: return $X^{best} = \{X^{best}_1, X^{best}_2, \ldots, X^{best}_M\} \in \mathbb{R}^{dim}$.

3.3.2. Choice of distance matrix

The feature used in the paper is the power spectral density, which is inherently a probability distribution. The Jensen–Shannon (JS) distance (square root of the JS divergence) is a symmetric measure of dissimilarity between probability distributions (Fuglede and Topsoe, 2004) and is appropriate given the nature of the feature vector.

Given two probability density functions $P = p(x)$ and $Q = q(x)$, the JS-divergence $JS$ and the JS-distance $JS_d$ are calculated as follows:

$$JS(P\,\|\,Q) = \frac{KL(P\,\|\,M) + KL(Q\,\|\,M)}{2}, \tag{40}$$

$$JS_d(P, Q) = \sqrt{JS(P\,\|\,Q)}, \tag{41}$$

where $M$ denotes a mixture distribution of $Q$ and $P$

$$M = \tfrac{1}{2}(P + Q), \tag{42}$$

and $KL(\cdot\,\|\,\cdot)$ denotes the Kullback–Leibler divergence (Kullback and Leibler, 1951)

$$KL(P\,\|\,Q) = \overbrace{-\sum_{x} p(x) \log q(x)}^{H(P,Q)} + \overbrace{\sum_{x} p(x) \log p(x)}^{-H(P)} = \sum_{x} p(x) \log \frac{p(x)}{q(x)}, \tag{43}$$

where $H$ in this equation denotes Shannon's Information Entropy (Shannon, 2001).
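Eqs. (40) to (43) translate directly into a short function for discrete distributions. The sketch below assumes the feature vectors have already been normalized to sum to one (the hypothetical helper `normalize` does this) and uses the natural logarithm, so the distance is bounded above by $\sqrt{\log 2}$.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(P||Q), Eq. (43); terms with p(x) = 0 add 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_distance(p, q):
    """Jensen-Shannon distance, Eqs. (40)-(42)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]   # mixture distribution M
    js = (kl(p, m) + kl(q, m)) / 2
    return math.sqrt(max(js, 0.0))                # clamp tiny negative round-off

def normalize(v):
    """Scale a non-negative vector so that it sums to 1."""
    s = sum(v)
    return [x / s for x in v]

p = normalize([4.0, 3.0, 2.0, 1.0])
q = normalize([1.0, 2.0, 3.0, 4.0])
d_pq = js_distance(p, q)
d_disjoint = js_distance([1.0, 0.0], [0.0, 1.0])   # no overlap: upper bound
```

Because the mixture $M$ is positive wherever $P$ or $Q$ is, the divergence stays finite even when one distribution has empty bins, which is one practical reason to prefer it over the plain KL divergence for sparse spectra.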

Table 4
SRCE strategies for reducing the risk of suboptimal convergence.

| Strategy | Formula | Explanation | Description/effects |
|---|---|---|---|
| Substitution | $\{\hat{x}_i, \hat{v}_i\} = \begin{cases} \{\mathcal{N}(x_{I_{win}}, \sigma), 0\} & \text{if } \varphi < \varepsilon \\ \{\hat{x}_i, \hat{v}_i\} & \text{otherwise} \end{cases}$ | $\varphi$: a uniform random number, $\varphi \in [0, 1]$. $\mathcal{N}(x_{I_{win}}, \sigma)$: a Gaussian random vector centered at $x_{I_{win}}$ with a standard deviation of $\sigma \in \mathbb{R}^d$. $\varepsilon$: substitution probability threshold. Optimal $\varepsilon$ values lie in $0.01 \le \varepsilon \le 0.05$ (Yuwono et al., 2013b, 2012c) | Forces particles in a search space to reach alternate equilibrium positions by introducing position instability. This strategy is applied after each position update episode for a particle |
| Particle reset | $\zeta \leftarrow \begin{cases} \zeta + 1 & \text{if } f(Y_m, \hat{X}_m) \ge f(Y_m, X^{best}_m(t)) \\ 0 & \text{otherwise} \end{cases}$ | $\zeta$: stagnation counter, an integer | A large $\zeta$ indicates that the corresponding subswarm needs to be re-initialized. Convergence can potentially be detected when $f(Y_m, X^{best})$ does not improve after numerous resets |
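The two strategies can be sketched as small helper functions. The names `substitute`, `update_stagnation` and `needs_reset` are hypothetical, and triggering a reset once the counter reaches $\zeta_{reset}$ is an assumed reading of the reset condition.

```python
import random

def substitute(x, v, x_win, sigma, eps, rng=random):
    """Substitution: with probability eps, re-seed the particle near the winner.

    Returns the (possibly replaced) position and velocity.
    """
    if rng.random() < eps:
        new_x = [rng.gauss(c, s) for c, s in zip(x_win, sigma)]
        return new_x, [0.0] * len(v)
    return x, v

def update_stagnation(zeta, f_new, f_best):
    """Stagnation counter: grows while the objective fails to improve, else resets."""
    return zeta + 1 if f_new >= f_best else 0

def needs_reset(zeta, zeta_reset=15):
    """Assumed trigger: reinitialize once the counter reaches the threshold."""
    return zeta >= zeta_reset

# eps = 1 always substitutes; sigma = 0 places the particle exactly on the winner.
forced = substitute([1.0, 2.0], [0.5, 0.5], x_win=[9.0, 9.0],
                    sigma=[0.0, 0.0], eps=1.0)
# eps = 0 never substitutes.
kept = substitute([1.0, 2.0], [0.5, 0.5], x_win=[9.0, 9.0],
                  sigma=[0.0, 0.0], eps=0.0)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing subswarms.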

Fig. 6. Trajectory of the SRCE particles recorded after 30 iterations on a toy dataset with numerous random seedings, showing SRCE robustness and insensitivity to initialization. $M = 6$, $t_{max} = 30$, $\varepsilon = 0.05$, $\zeta_{reset} = 15$.


3.3.3. Hidden Markov Model

The Hidden Markov Model (HMM) is a probabilistic tool for representing a probability distribution $Y$ (e.g. a class of bearing defect) over sequences of observations/hidden states $S$ (e.g. membership to a Gaussian mixture) as follows:

$$P(S_{1:T}, Y_{1:T}) = P(S_1)\,P(Y_1 \mid S_1) \prod_{t=2}^{T} P(S_t \mid S_{t-1})\,P(Y_t \mid S_t), \tag{44}$$

where $P(S_t \mid S_{t-1})$ is the state transition matrix, which can be optimized using the Baum–Welch algorithm.

The HMM and the Baum–Welch algorithm are not discussed in further detail in this paper. Readers are encouraged to refer to Ghahramani's paper for more information (Zoubin, 2001).
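The factorization in Eq. (44) can be evaluated directly for a discrete HMM. The sketch below is illustrative only: the function name, the table layout (`transition[i][j]` for moving from state `i` to state `j`, `emission[s][y]` for emitting symbol `y` in state `s`) and the toy probabilities are assumptions.

```python
def joint_probability(states, observations, initial, transition, emission):
    """Eq. (44): P(S_1) P(Y_1|S_1) * prod over t >= 2 of P(S_t|S_{t-1}) P(Y_t|S_t)."""
    prob = initial[states[0]] * emission[states[0]][observations[0]]
    for t in range(1, len(states)):
        prob *= transition[states[t - 1]][states[t]]   # state transition
        prob *= emission[states[t]][observations[t]]   # emission at time t
    return prob

# Toy model with two hidden states and two observation symbols.
initial = [0.6, 0.4]
transition = [[0.7, 0.3],
              [0.4, 0.6]]
emission = [[0.9, 0.1],
            [0.2, 0.8]]

p_path = joint_probability([0, 0, 1], [0, 0, 1], initial, transition, emission)
# 0.6 * 0.9 * (0.7 * 0.9) * (0.3 * 0.8) = 0.081648
```

Summing this joint probability over every possible state sequence gives the likelihood of the observations. The forward recursion used inside Baum–Welch computes that sum without enumerating the sequences.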

4. Experimental results

4.1. Data description

The method was tested using the vibration test data provided by the Case School of Engineering of the Case Western Reserve University (CSE) (Case Western Reserve, 2014). The data collection process was described as follows. The testing apparatus consists of a 2 HP motor, a torque transducer/encoder, a dynamometer and control electronics. The test bearings support the motor shaft. Single point faults were introduced to the test bearings using electro-discharge machining with defect diameters of 7 mils, 14 mils, 21 mils, 28 mils, and 40 mils (1 mil = 0.001 in). In this experiment we mainly focus on the 7 mils, 14 mils and 21 mils defects, whose vibration signals are subtler than those of the larger diameter defects.

CSE collected the vibration data using accelerometers attached to the housing with magnetic bases. Two accelerometers were placed at the 12 o'clock position at the drive end (DE) and fan end (FE) of the motor housing. The data were sampled at 12,000 samples per second. Speed and horsepower data were collected using the torque transducer/encoder and were recorded by hand, and are thus subject to human error. The fundamental shaft frequency $f_r$ was calculated using Eq. (33). The specification of the bearings used at both ends is shown in Table 5.

For outer raceway faults, experiments were conducted for both fan and drive end bearings with defects located at 3 o'clock (directly in the load zone), at 6 o'clock (orthogonal to the load zone), and at 12 o'clock.

CSE recorded approximately 1164 s of vibration signals, which were grouped into 97 separate Matlab data files according to the defect location and severity. Each of these data files was subdivided into six sequential segments, from which six observations were extracted using the method described in Section 3.2.5, generating a set of 582 sequential observations. In order to avoid over-fitting, only 14% of these sequences (more precisely, 80 observations) were randomly selected for the clustering and training process.
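The bookkeeping behind these counts is easy to reproduce. The sketch below splits a record into six consecutive segments and checks the observation counts quoted above. The helper name `split_segments` and the choice to drop any leftover samples at the end of a record are assumptions.

```python
def split_segments(signal, n_segments=6):
    """Cut a record into equal, consecutive, non-overlapping segments.

    Samples left over after the last full segment are dropped (assumed).
    """
    seg_len = len(signal) // n_segments
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]

n_files, n_segments, n_train = 97, 6, 80
n_observations = n_files * n_segments        # 97 files x 6 segments = 582
train_fraction = n_train / n_observations    # about 0.137, i.e. roughly 14%

segments = split_segments(list(range(13)))   # toy record of 13 samples
```

Here 80 of the 582 observations is about 13.7%, which rounds to the 14% stated above.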

4.2. Results and discussion

The parameters for the experiment were set as follows. The number of discrete samples ($N$) taken at each extraction was set to 12,000 samples. The tolerance parameter $\alpha$ was set to 1 ms.

The parameters for the wavelet transform were as follows: the standard deviation of the Morlet wavelet was set to $\sigma = \{1, 2, 4, 8\}$, and the dilation factor was linearly spaced from $a_1 = 1$ to $a_{60} = 24$.

The quefrency width of the cepstral lifters $\beta$ was set to 5 ms.

The parameters for the SRCE were set as follows. The number of subswarms was set to 4, the distance function was set to the JS distance, the fuzzifier $\lambda$ was set to 1.2, the substitution probability $\varepsilon$ was set to 3%, the maximum number of iterations was set to 1000, the resampling rate $\eta\%$ was set to 90%, and the particle reset threshold $\zeta_{reset}$ was set to 15. Convergence was declared when the average distortion of the swarm did not improve after 3 successive unsuccessful particle resets.

An example run of SRCE using four particles is presented in Fig. 7. In this figure, both the FE and DE features were averaged for ease of interpretation. Defects were shown to be distributed in three hyperellipsoids, with a few exceptions in cluster 3 showing a relatively higher degree of FTF harmonics.

Fig. 7. Visualizing the SRCE clustering result on the training data using four subswarms with four particles each. The fuzzy partitions are projected to BPFI vs. BPFO axes. Bearing faults can be generally clustered into four main clusters based on the power spectral density features. Panel axes: the BPFO, BPFI, BFF and FTF features, each averaged as (DE + FE)/2; clusters are labelled 1 to 4.

Table 5
Bearing specification.

| Parameter | Drive End | Fan End |
|---|---|---|
| Bearing type | 6205-2RS JEM SKF | 6203-2RS JEM SKF |
| Ball diameter (in) | 0.3126 | 0.2656 |
| Pitch diameter (in) | 1.537 | 1.122 |
| BPFO | $f_r \times 3.5848$ | $f_r \times 3.0530$ |
| BPFI | $f_r \times 5.4152$ | $f_r \times 4.9469$ |
| FTF | $f_r \times 0.39828$ | $f_r \times 0.3817$ |
| BSF | $f_r \times 4.7135$ | $f_r \times 3.9874$ |
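The defect frequencies in Table 5 are multiples of the shaft frequency, so the four lifter frequencies $\{\mathrm{BPFO}, \mathrm{BPFI}, \mathrm{FTF}, 2 \times \mathrm{BSF}\}$ follow from the shaft speed alone. The sketch below uses the multipliers from Table 5. The function and dictionary names are hypothetical, and the shaft speed of 1730 RPM is the one quoted in the caption of Fig. 5.

```python
# Multipliers of the shaft frequency f_r, taken from Table 5.
MULTIPLIERS = {
    "DE": {"BPFO": 3.5848, "BPFI": 5.4152, "FTF": 0.39828, "BSF": 4.7135},
    "FE": {"BPFO": 3.0530, "BPFI": 4.9469, "FTF": 0.3817, "BSF": 3.9874},
}

def defect_frequencies(rpm, end):
    """Return the four lifter frequencies in Hz for the given bearing end."""
    f_r = rpm / 60.0                      # shaft frequency in Hz
    m = MULTIPLIERS[end]
    return {
        "BPFO": m["BPFO"] * f_r,
        "BPFI": m["BPFI"] * f_r,
        "FTF": m["FTF"] * f_r,
        "BFF": 2.0 * m["BSF"] * f_r,      # ball fault frequency = 2 x BSF
    }

fe = defect_frequencies(1730, "FE")
bpfo_period_ms = 1000.0 / fe["BPFO"]      # spacing of the BPFO rahmonics
```

At 1730 RPM the fan end BPFO comes out at about 88 Hz, a rahmonic spacing of roughly 11.4 ms, which is of the same order as the values annotated in Fig. 5.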

The hidden states for the HMM were optimized using SRCE by incrementing the number of particles and investigating whether there was any improvement in terms of predictive power. The result of the defect classification experiment using the HMM is presented in Table 6. The proposed defect detection algorithm achieved on average 98.02% sensitivity and 96.03% specificity in distinguishing the source of defects. 100% of fan end defects were properly distinguished. The detection algorithm achieved lower performance in distinguishing drive end defects, with the highest error rate on the 6 o'clock outer race defect.

An example run on the test set is shown in Fig. 8. The reliability of the proposed method in distinguishing bearing defects is shown in this figure. The optimum number of hidden states was found to be 7. It can be seen that further increments did not improve the overall classification performance.

We also observed some limitations in the proposed method, as the 6 o'clock outer race defects on the drive end were misclassified as ball defects. Data IDs 197–200 record the vibration signal of an outer race defect on the drive end bearing (14 mils) located at 6 o'clock of the load zone. Fig. 9 shows a comparison between the signal and power spectrum of data ID 197 and another drive end outer race defect from data ID 144. This figure reveals a high degree of noise content masking the BPFO frequency component. FTF harmonics were especially seen to prevail over the BPFO harmonics, which gives a possible explanation as to why misclassification occurred.

Table 6
HMM classification summary on the test set. Statistics were obtained from 80 episodes of re-training and re-testing. The HMM uses seven hidden states optimized using SRCE (four subswarms, seven particles).

| Defect location | Sensitivity μ (%) | Sensitivity σ (%) | Specificity μ (%) | Specificity σ (%) | Error rate μ (%) | Error rate σ (%) |
|---|---|---|---|---|---|---|
| Drive End | | | | | | |
| Inner race | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Ball defect | 97.95 | 0.47 | 95.90 | 0.95 | 2.73 | 0.63 |
| Outer race, 6 o'clock (centered) | 85.83 | 0.00 | 71.67 | 0.00 | 18.89 | 0.00 |
| Outer race, 3 o'clock (orthogonal) | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Outer race, 12 o'clock (opposite) | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Summary (Drive End) | 96.31 | 0.19 | 92.61 | 0.38 | 4.92 | 0.25 |
| Fan End | | | | | | |
| Inner race | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Ball defect | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Outer race, 6 o'clock (centered) | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Outer race, 3 o'clock (orthogonal) | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Outer race, 12 o'clock (opposite) | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Summary (Fan End) | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Overall | 98.02 | 0.10 | 96.03 | 0.20 | 2.65 | 0.13 |

Fig. 8. Top: the method used for determining the optimum number of hidden states for the HMM (sensitivity, specificity and error rate versus number of hidden states). It can be observed that no improvement in specificity/sensitivity is obtained when the number of hidden states is larger than 7. Bottom: HMM prediction results (seven hidden states, four subswarms) on the testing data, plotted as predicted class (Outer Race, Ball, Inner Race) versus sample number for the ground truth and subswarms 1 to 4. Outer race defects on the drive end bearing (14 mils) located at 6 o'clock (data ID: 197–200) were often confused with ball defects.

5. Conclusion

This paper proposes an automatic bearing fault diagnosis system based on Swarm Rapid Centroid Estimation (SRCE) and Hidden Markov Model (HMM). Using the features extracted with the wavelet kurtogram and cepstrum liftering, the proposed SRCE+HMM method was tested using the bearing fault vibration data provided by the Case School of Engineering of the Case Western Reserve University (CSE) (Case Western Reserve, 2014) with promising results. The method achieved on average a sensitivity, specificity, and error rate of 98.02%, 96.03%, and 2.65%, respectively. 100% of fan end defects were properly distinguished. A rather significant error rate (18.89%, compared to the overall error rate of 2.65% ± 0.13%) was observed on the 6 o'clock outer race defect vibration, which requires further investigation.

Acknowledgments

This research, “Fault Detection and Identification of Key Equipment in Rail Transit Based on Multiple Data Fusion,” was sponsored by Beijing Jiaotong University (NJTU) and was part of a research partnership on the Rail Traffic Control and Safety (RTCS). The bearing fault vibration data were provided by the courtesy of the Case School of Engineering of the Case Western Reserve University (CSE) (Case Western Reserve, 2014).

References

Csegroups.case.edu, 2014. Case Western Reserve University Bearing Data Center Website. URL ⟨http://csegroups.case.edu/bearingdatacenter/pages/welcome-case-western-reserve-university-bearing-data-center-website⟩.

Fang, S., Zijie, W., 2007. Rolling bearing fault diagnosis based on wavelet packet and rbf neural network. In: Control Conference, 2007. CCC 2007. Chinese, 2007, pp. 451–455. http://dx.doi.org/10.1109/CHICC.2006.4346979.

Feldman, M., 2011. Hilbert transform in vibration analysis. Mech. Syst. Signal Process. 25 (3), 735–802. http://dx.doi.org/10.1016/j.ymssp.2010.07.018, URL http://www.sciencedirect.com/science/article/pii/S0888327010002542.

Fuglede, B., Topsoe, F., 2004. Jensen–Shannon divergence and Hilbert space embedding. In: Proceedings. International Symposium on Information Theory (ISIT), 2004, 2004, p. 30. http://dx.doi.org/10.1109/ISIT.2004.1365067.

Goupillaud, P., Grossmann, A., Morlet, J., 1984. Cycle-octave and related transforms in seismic signal analysis. Geoexploration 23 (1), 85–102. http://dx.doi.org/10.1016/0016-7142(84)90025-5.

Guo, Y., Dehestani, D., Li, J., Wall, J., West, S., Su, S., 2012. Intelligent outlier detection for hvac system fault detection. In: Proceedings of the 10th International Healthy Buildings Conference, Brisbane, Queensland, Australia.

Guo, Y., Wall, J., Li, J., West, S., 2013. Intelligent model based fault detection and diagnosis for hvac system using statistical machine learning methods. In: Proceedings of the ASHRAE 2013 Winter Conference, Dallas, USA.

Huang, Y.-T., Cheng, F.-T., Hung, M.-H., 2009. Developing a product quality fault detection scheme. In: IEEE International Conference on Robotics and Automation, 2009. ICRA '09, pp. 927–932. http://dx.doi.org/10.1109/ROBOT.2009.5152474.

Kulkarni, P.G., Sahasrabudhe, A.D., 2013. Application of wavelet transform for fault diagnosis of rolling element bearings. Int. J. Sci. Technol. Res. 2 (4), 138–148.

Kullback, S., Leibler, R.A., 1951. On information and sufficiency. Ann. Math. Stat. 22 (1), 79–86. http://dx.doi.org/10.1214/aoms/1177729694.

Lei, Y., Lin, J., He, Z., Zi, Y., 2011. Application of an improved kurtogram method for fault diagnosis of rolling element bearings. Mech. Syst. Signal Process. 25 (5), 1738–1749. http://dx.doi.org/10.1016/j.ymssp.2010.12.011, URL http://www.sciencedirect.com/science/article/pii/S0888327011000033.

Li, S., Wen, J., 2014. A model-based fault detection and diagnostic methodology based on pca method and wavelet transform. Energy Build. 68, 63–71.

Li, B., Chow, M.-Y., Tipsuwan, Y., Hung, J., 2000. Neural-network-based motor rolling bearing fault diagnosis. IEEE Trans. Ind. Electron. 47 (5), 1060–1069. http://dx.doi.org/10.1109/41.873214.

Publications, B.U., 2007. Bearing Failure: Causes and Cures. URL ⟨http://www.schaeffler.com/remotemedien/media/_shared_media/08_media_library/01_publications/barden/brochure_2/downloads_24/barden_bearing_failures_us_en.pdf⟩.

Randall, R.B., Antoni, J., 2011. Rolling element bearing diagnostics – a tutorial. Mech. Syst. Signal Process. 25 (2), 485–520. http://dx.doi.org/10.1016/j.ymssp.2010.07.017.

Randall, R.B., Hee, J., 1981. Cepstrum analysis. Tech. Rev. Adv. Tech. Acoust. Electr. Mech. Meas. 3–40 ⟨Http://www.bksv.com/doc/TechnicalReview1981-3.pdf⟩.

Sawalhi, N., Randall, R.B., 2005. Spectral kurtosis optimization for rolling element bearings. In: Proceedings of the ISSPA Conference, Sydney, Australia, pp. 839–842.

Shannon, C.E., 2001. A mathematical theory of communication. SIGMOBILE Mob. Comput. Commun. Rev. 5 (1), 3–55. http://dx.doi.org/10.1145/584091.584093.

Slocum, A., 2008. Fundamentals of Design – Topic 10 Bearings. URL ⟨http://web.mit.edu/2.75/fundamentals/FUNdaMENTALs%20Book%20pdf/FUNdaMENTALs%20Topic%2010.PDF⟩.

Valeriu Vrabie, C.S., Pierre Granjon, 2003. Spectral kurtosis: from definition to application. In: Sixth IEEE International Workshop on Nonlinear Signal and Image Processing (NSIP 2003).

van der Merwe, D.W., Engelbrecht, A.P., 2003. Data clustering using particle swarm optimization. In: Proceedings of the 2003 IEEE Congress on Evolutionary Computation, 2003, vol. 1, pp. 215–220.

Venkatasubramanian, V., Rengaswamy, R., Yin, K., Kavuri, S.N., 2003. A review of process fault detection and diagnosis: Part i: quantitative model-based methods. Comput. Chem. Eng. 27 (3), 293–311. http://dx.doi.org/10.1016/S0098-1354(02)00160-6, URL http://www.sciencedirect.com/science/article/pii/S0098135402001606.

Fig. 9. Comparison between outer race defect time signals and power spectral density of the 6 o'clock reading (data ID: 197, top) and the 3 o'clock reading (data ID: 144, bottom). Data ID 197 was significantly noisier and often misclassified as a ball defect. Compared to 144, the BPFO harmonics in 197 were buried under FTF harmonics, which gives a possible explanation as to why misclassification occurred. Panel axes: signal versus time (s), and power spectral density versus frequency (Hz).

Wall, J., Guo, Y., Li, J., West, S., 2011. A dynamic machine learning-based technique for automated fault detection in hvac systems. In: Proceedings of the ASHRAE Annual Conference, Montreal, Quebec, Canada, pp. 449–456.

Yuwono, M., Su, S.W., Moulton, B.D., Nguyen, H.T., 2012a. Fast unsupervised learning method for rapid estimation of cluster centroids. In: Proceedings of the 2012 IEEE Congress on Evolutionary Computation, pp. 889–896.

Yuwono, M., Su, S.W., Moulton, B.D., Nguyen, H.T., 2012b. Method for increasing the computation speed of an unsupervised learning approach for data clustering. In: Proceedings of the 2012 IEEE Congress on Evolutionary Computation, pp. 2957–2964.

Yuwono, M., Su, S.W., Moulton, B.D., Nguyen, H.T., 2012c. Optimization strategies for rapid centroid estimation. In: Proceedings of the 34th Annual International Conference of the IEEE EMBS, San Diego, pp. 6212–6215.

Yuwono, M., Su, S.W., Guo, Y., Li, J., West, S., Wall, J., 2013a. Automatic feature selection using multiobjective cluster optimization for fault detection in a heating ventilation and air conditioning system. In: Proceedings of the 2013 First International Conference on Artificial Intelligence, Modelling and Simulation, AIMS '13, IEEE Computer Society, Washington, DC, USA, pp. 171–176. http://dx.doi.org/10.1109/AIMS.2013.34.

Yuwono, M., Su, S., Moulton, B., Nguyen, H., 2013b. Data clustering using variants of rapid centroid estimation. IEEE Trans. Evol. Comput. 18 (3), 366–377.

Yuwono, M., Su, S., Moulton, B., Nguyen, H., 2014. An algorithm for scalable clustering: ensemble rapid centroid estimation. In: Proceedings of the 2014 IEEE Congress on Evolutionary Computation, pp. 1250–1257.

Zoubin, G., 2001. An introduction to hidden Markov model and Bayesian network. Int. J. Pattern Recognit. Artif. Intell. 15, 9–42. http://dx.doi.org/10.1142/s0218001401000836.
