University of Calgary
PRISM: University of Calgary's Digital Repository
Graduate Studies Legacy Theses
2001
Bearing condition monitoring and fault diagnosis
Chen, Ping
Chen, P. (2001). Bearing condition monitoring and fault diagnosis (Unpublished master's thesis).
University of Calgary, Calgary, AB. doi:10.11575/PRISM/23398
http://hdl.handle.net/1880/40657
master thesis
University of Calgary graduate students retain copyright ownership and moral rights for their
thesis. You may use this material in any way that is permitted by the Copyright Act or through
licensing that has been assigned to the document. For uses that are not allowable under
copyright legislation or licensing, you are required to seek permission.
Downloaded from PRISM: https://prism.ucalgary.ca
THE UNIVERSITY OF CALGARY
Bearing Condition Monitoring and Fault Diagnosis
by
Ping Chen
A THESIS
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF MECHANICAL AND MANUFACTURING ENGINEERING
CALGARY, ALBERTA
DECEMBER, 2000
© Ping Chen 2000
National Library of Canada / Bibliothèque nationale du Canada
Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques
395 Wellington Street, Ottawa ON K1A 0N4, Canada
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
ABSTRACT
Bearing condition monitoring and fault diagnosis have been studied for many years.
Popular techniques include those through advanced signal processing and pattern
recognition technologies. Recently, some interesting results were published using pattern
recognition for bearing diagnosis by means of features extracted from vibration signals
through time domain and frequency domain analyses [Sun, et al., 1998]. In this work,
segmentation parameters are proposed to further improve the sensitivity and reliability of
the technique. Parameters extracted from the segmentation analysis reflect the variation
of vibration signals associated with the bearing dynamics. A three-layered artificial
neural network is applied to accomplish the non-linear mapping from the feature space to
the two dimensional classification space. The mapping is conducted to create the best
cluster effect for training samples belonging to the same class. Successful non-linear
mapping through the neural network eliminates intra-class transformations as used in
[Sun, et al., 1998]. Numerical experiments are performed to illustrate the effectiveness of
the method.
ACKNOWLEDGEMENTS
I am deeply indebted to Dr. Q. Sun, my supervisor, who has been a strong source of
inspiration throughout my project work. I have benefited greatly from her invaluable
guidance and motivation. Her guidance has been very supportive in helping me complete
this project.
I would also like to give special thanks to the National Research Council of
Canada for its financial support and the Association of American Railroads for providing
the bearing testing data.
TABLE OF CONTENTS
APPROVAL PAGE
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER ONE: INTRODUCTION
1.1 Machine Condition Monitoring and Diagnosis
1.2 Bearing Failure Modes
1.2.1 Fatigue
1.2.2 Wear
1.2.3 Corrosion
1.2.4 Brinelling
1.2.5 Lubrication Starvation
1.3 Dynamic Response due to Localized Fatigue Spalls
1.4 Vibration Analysis
1.5 Vibration Measurement
1.6 Review of Vibration Analysis Techniques
1.6.1 Time Domain Techniques
1.6.2 Frequency Domain Techniques
1.6.3 Time-Frequency Analysis
1.7 Objective of the Thesis
1.8 Organization of the Thesis
CHAPTER TWO: BEARING KINEMATICS
2.1 Bearing Structure
2.2 Bearing Kinematics
2.3 Vibration Models of Localized Fatigue Spalling
CHAPTER THREE: FEATURE EXTRACTION FOR PATTERN RECOGNITION
3.1 Feature Extraction
3.2 Time Domain Parameters
3.2.1 Probability Density Function
3.2.2 Root Mean Square and Peak Value
3.2.3 Statistical Parameters
3.3 Frequency Domain Parameters
3.4 Segmentation Analysis and Parameters
3.4.1 Segmentation Analysis
3.4.2 Feature Extraction Using Segmentation Parameters
CHAPTER FOUR: NEURAL NETWORKS FOR NONLINEAR MAPPING
4.1 Introduction
4.2 Artificial Neural Networks
4.2.1 Multilayer Feedforward Artificial Neural Networks
4.2.2 Error Back-Propagation Training Algorithm
4.2.3 Convergence
4.2.4 Stopping Criteria
4.2.5 Initial Weights and Cumulative Weight Adjustment
4.3 Experimental Determination of Optimal Neural Network
4.3.1 Network Architectures with Optimal Hidden Layer
4.3.2 Accelerated Convergence through Learning-Rate Adaptation
CHAPTER FIVE: BEARING DEFECT DIAGNOSIS
5.1 Experimental Studies
5.2 Feature Selection
5.3 Result of the Artificial Neural Network
5.4 Classification
5.5 Diagnosis
CHAPTER SIX: CONCLUSION AND FUTURE WORK
6.1 Summary of Results Obtained
6.2 Limitations of the Present Method and Directions for Future Work
LIST OF TABLES
Table 2.1. Bearing defect characteristic frequencies
Table 3.1a. Comparison of time domain parameters for good bearing and defective bearing
Table 3.1b. Time domain parameters for bearings at different speeds
Table 3.2a. Frequency index for bearing in good condition
Table 3.2b. Frequency index for bearing with cup spalls
Table 3.3a. Time domain parameters of six segments for bearing with inner race defect
Table 3.3b. Segmentation parameters (shaft frequency) for bearing with inner race defect
Table 3.4a. Time domain parameters of six segments for bearing in good condition
Table 3.4b. Segmentation parameters (shaft frequency) for bearing in good condition
Table 3.5a. Time domain parameters of six segments for bearing with roller defect
Table 3.5b. Segmentation parameters (cage frequency) for bearing with roller defect
Table 3.6a. Time domain parameters of six segments for bearing with outer race defect
Table 3.6b. Segmentation parameters for bearing with outer race defect
Table 4.1. Performance comparison of hidden layer with different size
Table 5.1. Bearing conditions represented with class numbers
Table 5.2. Bearing component dimensions
Table 5.3. Calculated 19 parameters for 115 samples
Table 5.4. Normalized training sets
Table 5.5. Network outputs compared with target outputs
Table 5.6. Calculated parameters of 31 test data
Table 5.7. Normalized parameters of 31 test data
LIST OF FIGURES
Figure 1.1. Vibration signal from a bearing with inner race defects
Figure 1.2. Comparison of two signals with the same peak values
Figure 1.3. Power spectrum of a bearing with cone spalls
Figure 1.4. Spectrum comparison
Figure 1.5a. An impulse/time spectrum display
Figure 1.5b. An envelope/time spectrum display
Figure 1.6. Short Time Fourier Transform
Figure 1.7. Wavelet Transform computed from bearing vibration signals
Figure 2.1. Angular-contact ball bearing
Figure 2.2. Tapered roller bearing
Figure 2.3. Standard freight car rolling bearing
Figure 2.4. Schematic of a rolling element angular contact bearing
Figure 3.1a, b. Bearing vibration signals
Figure 3.2. Probability density function
Figure 3.3a, b. Frequency spectra of the vibration signals
Figure 3.4a. Vibration signal for inner race defect in one second
Figure 3.4b. Vibration signal for inner race defect in one shaft revolution
Figure 3.5. Bearing under load
Figure 3.6a. Vibration signal from bearing with roller defect
Figure 3.6b. Signal from bearing with roller defect over one cage revolution
Figure 3.7a. Vibration signal for outer race defect over one second
Figure 3.7b. Vibration signal for outer race defect over one cage revolution
Figure 4.1. Architecture graph of a multilayer neural network with two hidden layers
Figure 4.2. A neuron model
Figure 4.3. Sigmoid activation function
Figure 4.4. Illustration of the directions of two passes
Figure 4.5. A three-layered neural network
Figure 4.6. Scheme 1: incremental updating flowchart
Figure 4.7. Scheme 2: batch updating flowchart
Figure 4.8. The neural network used for nonlinear mapping
Figure 5.1. Multiple cup spalls
Figure 5.2. Multiple cone spalls
Figure 5.3. Broken roller
Figure 5.4. Roller bearing test rig at TTC
Figure 5.5. Roller bearing mounted in test rig
Figure 5.6. Frequency spectrum versus bearing defect characteristic frequencies
Figure 5.7a. A time/frequency display of the signal in one revolution
Figure 5.7b. Frequency spectra of each segment
Figure 5.8. Result of nonlinear mapping using neural networks
Figure 5.9. Cluster centers evenly spaced on a unit circle
Figure 5.10. Cluster centers arrayed on a unit square
Figure 5.11. Classification and diagnosis results
CHAPTER ONE
INTRODUCTION
1.1 Machine Condition Monitoring and Diagnosis
Nowadays, manufacturing companies are making great efforts to reduce costs and
improve quality in order to maintain their competitiveness in the global marketplace. It is
recognized that significant cost savings and profitability can be achieved by higher
equipment availability, reliability, and maintainability. In order to accomplish this goal, it
is necessary to implement an effective machinery maintenance program [Huang, et al.,
1996].
The most important and expensive task in terms of labor time and cost in machinery
maintenance is fault detection and diagnostics. Without accurate identification of
machine faults, maintenance and production scheduling cannot be effectively planned;
the necessary repair work cannot be optimally scheduled. In addition, accurate fault
detection and diagnosis is essential for reducing troubleshooting and repair time. As a
result of correct and fast fault diagnosis, machine availability may be improved
significantly.
Bearings are essential components of most rotating machinery. The majority of the
problems in rotating machines are caused by faulty bearings [Li, et al., 1989]. Over the
last 30 years freight cars have been equipped with tapered roller bearings. The railroad
industry suffers damages to equipment, wayside structures, and lading every year due to
derailments caused by catastrophic wheel-bearing failure. Several wayside inspection
techniques are employed by railroads to identify defective bearings prior to failure.
Improving the reliability of bearing fault detection and diagnostics will reduce the
potential for derailment due to catastrophic bearing failure and enhance railroad safety.
The American Federal Railroad Administration (FRA) has focused its research
efforts on improving railroad safety. The current research is motivated by the interest of
the FRA, Transport Canada and the National Research Council of Canada in the
development of a technique to achieve the following objectives:
1. Reliably detect spalled race defects.
2. Reliably detect broken roller defects.
3. Reliably determine and indicate defect severity.
4. Significantly reduce system component maintenance requirements.
The bearing inspection systems currently used in the railroad industry often fail to detect
overheated roller bearings. Other techniques based on processing the vibration signals
generated from bearings, including time domain analysis, frequency domain analysis and
time-frequency analysis have also been studied for bearing fault detection and diagnosis.
However, none of the existing techniques can achieve the above objectives consistently,
which prompts the need for further investigation and development of the wayside bearing
defect diagnosis system.
1.2 Bearing Failure Modes
The normal service life of a rolling element bearing rotating under load is
determined by material fatigue and wear at the running surfaces. Premature bearing
failures can be caused by a large number of factors, the most common of which are
fatigue, wear, corrosion, brinelling and poor lubrication [Howard, 1994]. The following
sections discuss the common modes of bearing failure.
1.2.1 Fatigue
A bearing subject to alternating normal loads could fail due to material fatigue after
a certain operation time. Fatigue damage begins with the formation of minute cracks
below the bearing surface. As loading continues, the cracks progress to the surface where
they cause material to break loose in the contact areas. The actual failure can manifest
itself as pitting, spalling or flaking of the bearing races or rolling elements. If the bearing
continues in service, the damage will spread in the vicinity of the defect due to stress
concentration. The surface damage severely disturbs the nominal motion of the rolling
elements by introducing short time impacts repeated at the characteristic rolling element
defect frequencies. As the damage continues to spread the repetitive nature of the impacts
will diminish as the motion of the rolling element becomes so irregular and disturbed that
it is impossible to distinguish between individual impacts. If the bearing were to continue
in service, the damage may spread to other raceways or rolling elements and eventually
lead to increased friction and temperature followed by complete seizure.
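As an illustration of the repetitive impacts described above, the following minimal sketch synthesizes a signal in which each pass of a rolling element over a defect excites a decaying structural resonance. The function name and all numerical values (sampling rate, defect frequency, resonance frequency and decay constant) are assumptions chosen for illustration only; they are not taken from the experiments in this thesis.

```python
import numpy as np

def simulate_defect_impacts(fs=20000.0, duration=0.5, defect_freq=100.0,
                            resonance_freq=3000.0, decay=800.0):
    """Synthesize impacts repeated at defect_freq, each exciting a decaying resonance.

    All default values are hypothetical and serve only as an illustration.
    """
    n = int(round(fs * duration))
    t = np.arange(n) / fs

    # Impulse train: one unit impulse per defect period.
    period = int(round(fs / defect_freq))
    impulses = np.zeros(n)
    impulses[::period] = 1.0

    # Impulse response of a single lightly damped resonance, 10 ms long.
    tr = np.arange(int(round(0.01 * fs))) / fs
    response = np.exp(-decay * tr) * np.sin(2.0 * np.pi * resonance_freq * tr)

    # Each impact rings the resonance; truncate to the original length.
    signal = np.convolve(impulses, response)[:n]
    return t, signal, impulses

t, signal, impulses = simulate_defect_impacts()
impact_times = t[impulses > 0]
print("number of impacts:", len(impact_times))
print("spacing between impacts (s):", np.diff(impact_times)[:3])
```

In this idealized signal the impacts are perfectly periodic. Randomly perturbing the impact times would mimic the loss of periodicity that the text associates with spreading damage.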
1.2.2 Wear
Wear is another common cause of bearing failure. It is caused mainly by dirt and
foreign particles entering the bearing through inadequate sealing or due to contaminated
lubricant. The abrasive foreign particles roughen the contacting surfaces giving a dull
appearance. Severe wear changes the raceway profile, alters the rolling element profile
and diameter, and increases the bearing clearance. The rolling friction increases
considerably and can lead to high level of slip and skidding. The end result of this is
complete breakdown. Increasing wear will gradually introduce geometric errors in the
bearing. Non-uniform diameters of worn rolling elements will cause cage frequency
vibration and harmonics to be produced m e , 19891 as the sequence of balls rotating
through the load zone is periodic with the cage rotation frequency. Geometric errors of
the raceways will produce multiple harmonics of the shaft speed.
1.2.3 Corrosion
Corrosion damage occurs when water, acids or other contaminants in the oil enter the
bearing assembly. This can be caused by damaged seals, acidic lubricants or
condensation which occurs when bearings are suddenly cooled from a higher operating
temperature in very humid air. The result is rust on the running surfaces which produces
uneven and noisy operations as the rust particles interfere with the lubrication. The rust
particles also have an abrasive effect which generates wear. The rust pits also form the
initiation sites for subsequent flaking and spalling.
1.2.4 Brinelling
Brinelling manifests itself as regularly spaced indentations distributed over the entire
raceway circumference, corresponding approximately in shape to the Hertzian contact
area. Three possible scenarios causing brinelling are (1) when a bearing is subjected to
static overloading which leads to plastic deformation of the raceways, (2) when a
stationary rolling bearing is subjected to vibration and shock loads and (3) when a
bearing forms a loop for the passage of electric current. In all cases, the result will be
repetitive indentations of the raceways. In some instances, a large number of indentations
may occur as the bearing may occasionally be turned slightly. The bearing operation will
be noisy and uneven in the presence of brinelling with each indentation acting like a
small fatigue site producing sharp impacts with the passage of the rolling elements.
Continued operation will lead to the development of spalling at the indentation sites.
1.2.5 Lubrication Starvation
Inadequate lubrication, either in terms of quantity or quality, is one of the common
causes of premature bearing failure as it leads to skidding, slippage and bearing seizure.
At the highly stressed region of Hertzian contact, when there is insufficient lubricant, the
contacting surfaces will weld together, only to be torn apart as the rolling element moves
on. The three critical points of bearing lubrication occur at the cage-roller interface, the
roller-race interface and the cage-race interface. Lubricant starvation or improper
lubricant selection can have severe consequences as high temperatures can anneal the
bearing elements and reduce hardness and fatigue life. Eventually, bearing elements will
experience excessive wear which could cause catastrophic failure.
1.3 Dynamic Response due to Localized Fatigue Spalls
Rolling element bearings often have a tendency to fail by fatigue rather than wear-out
due to the low wear rate and high roller-race contact load [Braun, et al., 1979]. Since the
primary mode of bearing failure is due to localized fatigue spalling of bearing elements
[Li, 1989; McFadden, 1990], this work focuses on dealing with fatigue spalls.
An undamaged bearing under load is subjected to complex forces and moments.
These include static forces such as shaft loads and preloads, dynamic forces due to
centrifugal loads, fluid pressure, traction and friction. For a good bearing operating at a
constant shaft speed and load, all forces are in quasi-equilibrium.
When a rolling element encounters a defect on the bearing surface, a rapid localized
change in the elastic deformation of the elements takes place, and a transient force
imbalance occurs. The transient forces will then result in rapid accelerations of the
bearing components. Complex motions can occur such as oscillatory contact and impacts
between the roller and raceway, roller and cage, and cage and raceway as well as
skidding or slipping of the roller and cage.
Construction of dynamic models describing the bearing motion caused by defects has
been attempted. Gupta developed models that incorporate localized changes to the motion
of the raceways and rolling elements [Gupta, 1975; 1979a; 1979b; 1979c; 1981].
However, experimental verification of the model was only performed with limited
examples [Gupta, et al., 1985]. Measuring the motion of bearing components, such as
cage angular velocity, roller linear and angular velocity, etc., is an extremely difficult
task and prone to errors due to the inaccessibility of the bearing components.
For most rotating machinery, detecting the presence of a damaged bearing is not
sufficient. It is more important to determine the extent of the damage and its effect upon
bearing life. Inspection of bearings removed from service with the existing wayside
inspection techniques showed that, in some instances, the defects present in the bearings
were not condemnable under current Association of American Railroads guidelines for
reconditioning roller bearings. Furthermore, it is a common belief in the railroad industry
that such "minor" defects could survive for the remaining life of the adjacent wheels and
that the removal of bearings with such defects is considered to be economically
disadvantageous with little or no net safety improvement [Florom, 1994].
Fatigue in rolling element bearings is caused by the application of repeated stresses
on a finite volume of material. Because bearing materials are not homogeneous or equally
resistant to failure at all points, fatigue failure initiates at the weakest point of the material. Therefore, a
group of supposedly identical specimens will exhibit wide variations in failure times
when operated under the same conditions. However, improvement in bearing materials,
lubrication and manufacturing technology has led to a large increase in bearing life and
reliability.
1.4 Vibration Analysis
Currently, two kinds of bearing inspection systems are used in the railroad
industry: the Hot Box Detector (HBD) and the Acoustic Based Detector (ABD). The
HBD system uses wayside rail-mounted infrared (IR) transducers to monitor bearing
temperature as the train passes by the detector. The system issues an alarm if the bearing
temperature exceeds a preset limit. Such a system was originally designed for monitoring
friction bearings. However, over the last 30 years, freight cars have been equipped with
tapered roller bearings. When the catastrophic failure of roller bearings happens, bearing
temperature increases within a short period of time, followed by axle journal burn-off.
Consequently, the HBD often misses overheated roller bearings [Choe, et al., 1997]. This
has a detrimental effect on the safety and efficiency of railroad operations.
In the late 70's, Acoustic Based Detectors were commercialized and applied for
wayside bearing inspections. Since then, there has been an increasing interest and
demand for the development of efficient and reliable devices based on acoustic sensory
signals. Existing ABD techniques have been shown to be too sensitive to incipient bearing
damage and are therefore often overly conservative. It has become evident that advanced signal
processing techniques are desirable.
Machine vibrations are due to cyclic excitations to the machine. The excitation loads
exist during normal machine operation or could be due to changes in the dynamic
properties of the machine, such as certain component failure. These excitation forces are
transmitted to adjacent components or adjoining structure, causing parts of the machine
to vibrate at different resonance frequencies. A change in the vibration signal not only
indicates a change in machine conditions, but also often points to the problem. When a
machine is operating properly, vibration is small and constant. Faulty components usually
cause significant changes in machine dynamics leading to much higher vibration energy
levels with different patterns. The amount of information contained in the measured
vibration signals is immense.
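The difference in vibration energy level between a healthy and a faulty machine can be illustrated with the root mean square (RMS) value of the signal. The sketch below is only a toy comparison under stated assumptions: both signals are synthetic, and the amplitudes, frequencies and impact spacing are hypothetical values, not measurements.

```python
import numpy as np

def rms(x):
    """Root mean square of a sampled signal, a simple measure of its energy level."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.mean(x ** 2))

fs = 10000.0                      # sampling rate in Hz (assumed)
t = np.arange(int(fs)) / fs       # one second of samples

# Hypothetical healthy machine: small, steady 50 Hz vibration.
healthy = 0.1 * np.sin(2.0 * np.pi * 50.0 * t)

# Hypothetical faulty machine: the same vibration plus a sharp impact every 10 ms.
faulty = healthy.copy()
faulty[::100] += 2.0

print("RMS healthy:", rms(healthy))
print("RMS faulty: ", rms(faulty))
```

A single energy measure such as RMS indicates that something has changed but not what has changed, which is why the pattern of the vibration is examined as well.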
The use of vibration measurement as a diagnostic tool is well established in various
engineering disciplines [Liu, et al., 1992]. This non-intrusive technique can be easily
applied to monitoring machinery conditions without interfering with machine operation.
It may be used to gain information about subsystems which are otherwise inaccessible.
The ability of vibration based techniques to detect and diagnose a broad range of faults in
a wide array of machine elements is one reason that it is often chosen as a preferred
method. The technique is non-intrusive and cost-effective, which makes it more attractive
for condition monitoring [Mechefske, et al., 1991].
Vibration monitoring of rolling element bearings has consistently produced good
results because of developments in signal processing techniques. Pattern recognition
techniques have been investigated [Batchelor, 1978; Sun, et al., 1997, 1998; Wang, et al.,
1998] and shown to have the ability to deal with various machine operating conditions. In
this thesis, we further pursue pattern recognition analysis with the objective of increasing
reliability and sensitivity of the method.
1.5 Vibration Measurement
The success of any monitoring program largely depends on the accuracy of the
measurement. Given that the instrumentation is properly calibrated, measurements are
accurate when the sensor mounting does not limit the frequency and dynamic ranges of
the sensor and when measurements are always collected at the same locations
[Alguindigue, et al., 1993].
The measurement of machine vibration can be made using a wide array of
transducers. Microphones measuring the acoustic response of the machine have been
shown to provide useful diagnostic information [Smith, et al., 1988; Jammu, et al., 1997].
They can be used for non-contact vibration measurement and are inexpensive. The
parabolic microphone in particular has been shown to be effective as a remote acoustic
monitor of rolling element bearings and has been used effectively on railcar bearing
detection and diagnosis [Smith, 1988; Smith, 1992]. By locating the microphone
statically at a distance of approximately 20 feet from the train, the parabolic microphone
is capable of eliminating off axis sound and concentrating on the direct sound. The main
drawbacks with microphonic recording systems are that their frequency response is
limited to the audible range, and that they are relatively insensitive to very low frequency
signal components.
The piezo-electric accelerometer, which measures the acceleration of vibrations, is
probably the most popular measurement transducer for vibration analysis in use today.
Such accelerometers are light and offer good temperature resistance, wide frequency response and
wide dynamic range [Mathew, 1989]. The frequency response is limited by the natural
frequency of the system, and operation is usually limited to about 20%-30% of the
natural frequency of the transducer. The acceleration signals obtained from these
transducers are sometimes integrated to produce velocity or even displacement for
different applications. These signals are then processed in diverse ways to highlight
various aspects of the signal which can then be used in the detection and diagnosis of the
machine condition.
Velocity transducers measure the velocity of the machine casing to which they are
attached. They are capable of measuring down to almost DC. They have not found wide
acceptance for bearing fault detection as the frequency range available with
accelerometers is wider. A number of laser velocity measurement systems are also
available where the surface velocity of the machine is measured by the laser using the
Doppler shifting principle [Smith, 1992]. The data used in this work were collected using
microphones and accelerometers.
1.6 Review of Vibration Analysis Techniques
The vibration signal obtained from operating machines contains information relating
to machine condition as well as noise. Further processing of the signal is necessary to
elicit information particularly relevant to bearing faults. Many techniques have been
employed to process the vibration signals in bearing fault detection and diagnosis. Three
common families of techniques (time domain, frequency domain and
time-frequency analysis) will be briefly reviewed.
1.6.1 Time Domain Techniques
Time series of the signal, if understood properly, can yield enormous amounts of
information. Further analysis is usually carried out so that important characteristics not
readily observed can be highlighted. Several techniques used in machine monitoring are
explained in the following paragraphs.
The most straightforward technique is simply to visually inspect portions of the time
domain waveform. Figure 1.1 shows the vibration signal waveform for a one second duration
obtained from a bearing containing inner race defects. The signal is digitized with a
sampling rate of 27 kHz. Repetitive impacts can be observed at the time period
corresponding to the time interval Tbpfi (the period of ball passing on the inner race) when
rolling elements pass the race damage, as indicated in Figure 1.1. It can also be observed
that these repetitive impulses are modulated with the inner race rotation and therefore
similar patterns are repeated in every revolution of the inner race, Ti. If we zoom in to
look at the signal over 0.1 second, we can see three impacts which are spaced at Tbpfi in
every period of inner race revolution Ti. Each impact excites the resonance of the
structure, which rapidly decays due to the system damping. It should be pointed out that
bearing vibration signals do not always present the impacts so clearly. The total vibration
signal produced by a large machine containing many components may be very
complicated when viewed in the time domain, making it unlikely that a spalling defect in
a bearing may be detected by a simple visual inspection of the vibration waveform
[McFadden, 1990]. Impulses are often masked by vibrations from other components and
background noise. More sophisticated time domain techniques, such as trending certain
characteristic parameters, are therefore desirable.
The vibration signals generated from bearings mounted on railway freight cars are
normally non-deterministic and non-stationary. Commonly used time domain parameters
are determined through the probability density distribution. These are Peak (Pk), Root
Mean Square value (RMS), Crest factor (Cf), Kurtosis value (Kv), Clearance factor (Clf),
and Impulse factor (If). Peak and RMS can directly reflect the energy level of the
vibration. Cf and Kv can be used to indicate the spikiness of the signal associated with
the defect-induced impulses.
Suppose we have signal1 and signal2 with the same Peak value. If signal1 has higher
energy but is less spiky and signal2 is spikier with a lower energy level, then we should
observe that signal1 has a greater RMS value and signal2 has larger Cf and Kv. Figure 1.2
shows the waveforms of signal1 and signal2, and the values of these parameters are shown in
the caption.
Crest factor (Cf), Kurtosis value (Kv), Clearance factor (Clf) and Impulse factor (If)
are non-dimensional statistical parameters. They are very effective in indicating incipient
fatigue spalling. However, these parameters sometimes fail to indicate defects as the
failure develops. For example, if the defect becomes severe, Cf and Kv will
fall back to normal values. Therefore they are not very reliable and cannot be used in
isolation. Moreover, they cannot be used to directly indicate the location of the defect.
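
As a minimal sketch of how these parameters might be computed, the following Python fragment uses commonly quoted definitions (Crest factor as Peak over RMS, Kurtosis as the normalized fourth central moment, Clearance and Impulse factors as Peak over the squared mean root amplitude and the mean absolute amplitude respectively). These definitions, the function name and the two synthetic signals are assumptions made for illustration and may differ in detail from the definitions adopted later in the thesis.

```python
import math


def time_domain_parameters(x):
    """Return common time domain condition indicators for a sampled signal."""
    n = len(x)
    mean = sum(x) / n
    peak = max(abs(v) for v in x)                        # Peak (Pk)
    rms = math.sqrt(sum(v * v for v in x) / n)           # Root Mean Square
    variance = sum((v - mean) ** 2 for v in x) / n
    # Kurtosis taken as the normalized fourth central moment (assumed definition).
    kurtosis = sum((v - mean) ** 4 for v in x) / n / variance ** 2
    mean_abs = sum(abs(v) for v in x) / n
    mean_sqrt = sum(math.sqrt(abs(v)) for v in x) / n
    return {
        "peak": peak,
        "rms": rms,
        "crest": peak / rms,                 # Crest factor (Cf)
        "kurtosis": kurtosis,                # Kurtosis value (Kv)
        "clearance": peak / mean_sqrt ** 2,  # Clearance factor (Clf)
        "impulse": peak / mean_abs,          # Impulse factor (If)
    }


# Hypothetical signals sharing the same Peak value of 40.
signal1 = [40.0 if (k // 10) % 2 == 0 else -40.0 for k in range(1000)]  # high energy
signal2 = [10.0 * math.sin(2 * math.pi * k / 50) for k in range(1000)]  # low energy
for k in range(0, 1000, 100):
    signal2[k] = 40.0                                    # sparse spikes

p1 = time_domain_parameters(signal1)
p2 = time_domain_parameters(signal2)
```

With these synthetic inputs, the high-energy signal has the larger RMS while the spiky signal has the larger Crest factor and Kurtosis, which is the qualitative behaviour described above.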
1.6.2 Frequency Domain Techniques
Discrete Fast Fourier analysis of the time waveform has become the most popular
method of deriving the frequency domain signal. The signature spectrum so obtained can
provide valuable information with regard to machine conditions [Mathew, 1989].
Spectral analysis and spectrum comparison are commonly used frequency domain
techniques. Envelope analysis or demodulating the time waveform prior to performing
the fast Fourier transform is also gaining popularity.
Spectral analysis is a very common technique when analyzing the vibration signal in
the frequency domain [Eshleman, 1980; Taylor, 1980; McFadden, et al., 1984a;
Alfredson, et al., 1985a; Bannister, 1985; Igarashi, 1985; McFadden, et al., 1985;
Mechefske, et al., 1992; Mechefske, 1993]. The power spectrum can be used to identify the
location of the defect by relating the defect characteristic frequencies to the major
frequency components in the spectrum. For bearing fault detection and diagnosis, a
detailed knowledge of the bearing defect characteristic frequencies is required, which will
be discussed in chapter 2. Figure 1.3 shows the power spectrum of a bearing with an
inner race defect. We can see a dominant peak at a frequency of about 102 Hz, which is
very close to the roller passing inner race frequency (99 Hz). It indicates the defect is on
the inner race.
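
The matching step described above can be sketched as follows. This is an illustrative fragment only: the synthetic signal, the sampling rate, the 5% matching tolerance and the outer race value of 75 Hz are assumptions, and a direct DFT is used in place of an FFT purely to keep the example self-contained. Only the 99 Hz inner race frequency and the 102 Hz peak are taken from the text.

```python
import cmath
import math


def power_spectrum(x, fs):
    """One-sided power spectrum by a direct DFT (adequate for short records)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(s) ** 2 / n)
    return freqs, power


def dominant_frequency(freqs, power):
    """Frequency of the largest spectral line, ignoring the DC line."""
    k = max(range(1, len(power)), key=lambda i: power[i])
    return freqs[k]


def match_defect(peak_hz, candidates, rel_tol=0.05):
    """Return the candidate whose frequency lies within rel_tol of the peak."""
    name = min(candidates, key=lambda c: abs(candidates[c] - peak_hz))
    if abs(candidates[name] - peak_hz) <= rel_tol * candidates[name]:
        return name
    return None


fs = 500.0  # Hz, hypothetical sampling rate
x = [math.sin(2 * math.pi * 102 * t / fs) + 0.3 * math.sin(2 * math.pi * 30 * t / fs)
     for t in range(500)]
freqs, power = power_spectrum(x, fs)
peak_hz = dominant_frequency(freqs, power)
# 99 Hz is the inner race value quoted in the text; 75 Hz is a made-up outer race value.
candidates = {"inner race": 99.0, "outer race": 75.0}
location = match_defect(peak_hz, candidates)
```

A relative tolerance is used because measured peaks rarely coincide exactly with the calculated frequencies; the appropriate tolerance in practice would depend on slip, speed variation and spectral resolution.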
Automatic detection of impulses at bearing characteristic frequencies is not a simple
task, for it involves searching the spectrum for specific frequencies related to all
relevant components including harmonics and sidebands of any of the defect frequencies
[Shi, et al., 1988]. Difficulties lie in the fact that the energy of the bearing vibration is
spread across a wide frequency band and can be easily buried by the noise. The result is
that various resonances of the bearing and the surrounding structure will be excited by the
defect-induced impact.
Spectrum comparison has also been investigated for the purpose of signature analysis
[Randall, 1985; Serridge, 1991; Succi, 1991]. A baseline spectrum is taken when the
bearing is in good condition. The difference between the baseline and subsequent signal
spectrum is used to highlight changes in mechanical condition. The comparison is used to
locate those frequencies in which significant increases in magnitude have occurred.
Figure 1.4 shows how spectral comparisons are performed. The reference spectrum is
first established as baseline for the bearing in good condition. When a new signal has
been recorded, the frequency spectrum can be calculated and compared with the reference.
By subtracting the two at identical frequency lines, a 'difference' spectrum is obtained.
Decision-making can be done based on the difference spectrum.
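
The comparison just described amounts to a line-by-line subtraction followed by a threshold test. A minimal sketch is given below; the spectral levels, the frequency lines and the 6 dB alarm threshold are hypothetical values chosen only to illustrate the procedure.

```python
def difference_spectrum(baseline, current):
    """Subtract the baseline from the current spectrum at identical frequency lines."""
    if len(baseline) != len(current):
        raise ValueError("spectra must share the same frequency lines")
    return [c - b for b, c in zip(baseline, current)]


def flag_lines(freqs, diff, threshold):
    """Return the frequency lines whose level rose by more than the threshold."""
    return [f for f, d in zip(freqs, diff) if d > threshold]


# Hypothetical levels in dB at four frequency lines.
freqs = [50.0, 100.0, 150.0, 200.0]
baseline_db = [60.0, 55.0, 52.0, 50.0]
current_db = [61.0, 67.0, 52.5, 49.0]

diff_db = difference_spectrum(baseline_db, current_db)
alarms = flag_lines(freqs, diff_db, threshold=6.0)
```

Here only the 100 Hz line exceeds the assumed threshold, so it would be the line singled out for further inspection.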
Often incipient damage in rolling element bearings cannot be detected using spectrum
comparisons as the energy contributions of fault related impulses are relatively
insignificant compared to that of overall machine component vibration and noise.
Spectrum comparison is not suitable for our case since no baseline information would be
available for a passing train. Furthermore, railway bearings operate in a highly non-
stationary environment. Spectra of bearing vibrations are dependent upon the bearing
loads and speed, which may vary over large ranges.
Envelope analysis is another popular method used in detecting incipient failure of
rolling element bearings. It was developed in the early 1970's by Mechanical Technology
Inc. and was originally called the high frequency resonance technique [McFadden, et al.,
1984a, 1984b, 1985; Prashad, 1985; Howard, et al., 1989; McFadden, 1990; Su, et al.,
1992]. It has also been known by a number of other names including amplitude
demodulation [White, 1991], demodulated resonance analysis and narrow band envelope
analysis [McMahon, 1991; Mundin, et al., 1992; Azovtsev, et al., 1994].
Fundamental to envelope analysis is the concept that each time a localized defect in a
rolling element bearing makes contact under load with another component in the bearing,
an impulse vibration is generated. The impulse will have an extremely short duration
compared to the interval between impulses, and so its energy will be distributed across a
very wide frequency range. The result is that various resonances of the bearing and the
surrounding structure will be excited by the impacts. The excitation is repetitive because
contact between the defect and the mating surfaces in the bearing is essentially periodic.
The frequencies of occurrence of the impulses are referred to as the characteristic bearing
defect frequencies. Structural resonance of the bearing and its housing components can
be considered as being amplitude modulated at the characteristic defect frequency, which
makes it possible not only to detect the presence of the defect but also to identify the
location of the defect. Envelope analysis provides a mechanism for extracting the
periodic excitation or amplitude modulation of the resonance [Howard, 1994].
The envelope analysis involves passing the band-pass filtered signal through a half-
wave rectifier and then through a low-pass filter. The half-wave rectifier removes either
the positive or negative excursions of the signal to leave a succession of unipolar pulses,
still at the resonant frequency. Low-pass filtering removes the resonant frequency and
smoothes these pulses. Figure 1.5a [SKF, 1993] shows the relationship between a time
domain repetitive impulse signal and its FFT spectrum. It is clear from the
frequency spectrum that there is a dominant impulse component at a frequency of about
50 Hz with exponentially decaying magnitude. Such impulses are also modulated by a
signal with a lower frequency (about 5 Hz). However, when inspecting the frequency spectrum
in Figure 1.5a, the 50 Hz component is indicated by the peak but the 5 Hz component
is buried in the sidebands. In the application of machine condition monitoring,
detection of both components is important, and detecting the lower frequency component
may become more critical. This can be achieved by tracing the envelope of the signal,
as shown in Figure 1.5b; the subsequent spectrum shows a dominant
component at the frequency of 5 Hz.
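
The rectify-and-smooth procedure can be illustrated with a short sketch. The fragment below is an assumption-laden simplification: the test signal is a 50 Hz sinusoid amplitude modulated at 5 Hz rather than a measured impulse train, the band-pass stage is omitted because the synthetic signal contains a single carrier, and the low-pass filter is a plain moving average spanning one carrier period. All names and values are illustrative.

```python
import cmath
import math


def dft_magnitude(x, fs):
    """One-sided magnitude spectrum by a direct DFT."""
    n = len(x)
    freqs = [k * fs / n for k in range(n // 2)]
    mags = [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / n
            for k in range(n // 2)]
    return freqs, mags


def dominant_frequency(freqs, mags):
    """Frequency of the largest spectral line, ignoring the DC line."""
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return freqs[k]


def envelope(x, window):
    """Half-wave rectify, then low-pass with a moving average of the given length."""
    rectified = [max(v, 0.0) for v in x]
    return [sum(rectified[i:i + window]) / window
            for i in range(len(rectified) - window + 1)]


fs = 500.0  # Hz, hypothetical sampling rate
signal = [(1.0 + 0.8 * math.cos(2 * math.pi * 5 * t / fs))
          * math.sin(2 * math.pi * 50 * t / fs) for t in range(500)]

raw_peak = dominant_frequency(*dft_magnitude(signal, fs))   # carrier dominates
env = envelope(signal, window=10)                           # 10 samples = one 50 Hz period
env_peak = dominant_frequency(*dft_magnitude(env, fs))      # modulation dominates
```

In this sketch the raw spectrum peaks at the 50 Hz carrier, whereas the envelope spectrum peaks close to the 5 Hz modulating frequency, mirroring the relationship between the two panels of Figure 1.5.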
Successful application of envelope analysis requires knowledge and experience in
locating the carrier frequencies before band pass filtering can proceed.
Kurtosis value has been suggested as an aid to identify a suitable carrier frequency
[Bannister, 1985], but it should also be recognized that the machine being monitored
might not even contain carrier frequencies. Moreover, envelope analysis is not suitable
for detecting extensively damaged bearings. A more severely damaged bearing presents a
more random dynamic response to impacts and thus makes it difficult to identify the
carrier frequency. It may be reasonable to consider extensively spread spalls as the sum
of smaller ones, each of which will produce an envelope spectrum. When many spectra
are summed together, some components may cancel each other while others may be
reinforced due to the difference in phases. Consequently, modulation sidebands may
dominate the envelope spectrum instead of the fundamental impact frequencies and their
harmonics.
1.6.3 Time-Frequency Analysis
A number of time-frequency techniques have been developed for analyzing
non-stationary signals. Among those, the Short Time Fourier Transform (STFT) and Wavelet
Transform (WT) are widely used. The STFT uses sliding windows in time to capture the
frequency characteristics as functions of time. Therefore, spectra are generated at discrete
time instants. Three-dimensional display is required to describe frequency, magnitude,
and time, as shown in Figure 1.6. In the figure, magnitude is represented with different
gray scales. The greater the magnitude, the darker the image.
We can observe from Figure 1.6 that impacts occur at various times with different
frequency spectra. Two transient impacts can be seen corresponding to two instants when
balls pass over a defect on the inner race. The time period between impacts can be used
for diagnosis by relating it to the various bearing characteristic frequencies. An inherent
drawback with the STFT is the trade-off between time and frequency resolutions. A
finer frequency resolution can only be achieved at the expense of time resolution and
vice-versa. Furthermore, this technique requires large amounts of computation and
storage for display.
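
The sliding-window idea and the resolution trade-off can be sketched in a few lines. The following fragment is illustrative only: the Hann window, the non-overlapping hop, the sampling rate and the 200 Hz test burst are assumptions, and a direct DFT is used for self-containment rather than efficiency.

```python
import cmath
import math


def stft(x, fs, win_len, hop):
    """Return (start_time, magnitudes) per frame using a Hann window and a direct DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (win_len - 1)) for n in range(win_len)]
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        seg = [x[start + n] * window[n] for n in range(win_len)]
        mags = [abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                        for n in range(win_len)))
                for k in range(win_len // 2)]
        frames.append((start / fs, mags))
    return frames


def resolution(fs, win_len):
    """Frequency line spacing (Hz) and window duration (s) for a given window length."""
    return fs / win_len, win_len / fs


fs = 1000.0  # Hz, hypothetical sampling rate
x = [0.0] * 400
for n in range(200, 264):
    x[n] = math.sin(2 * math.pi * 200 * n / fs)  # a short 200 Hz burst

frames = stft(x, fs, win_len=50, hop=50)
t_burst, mags = max(frames, key=lambda f: sum(m * m for m in f[1]))
f_burst = max(range(len(mags)), key=lambda k: mags[k]) * fs / 50
```

With a 50-sample window the line spacing is 20 Hz and each frame spans 0.05 s; lengthening the window to 200 samples would refine the spacing to 5 Hz but stretch each frame to 0.2 s, so the product of the two stays fixed.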
The Wavelet Transform (WT), on the other hand, is similar to the STFT in that it also
provides a time-frequency map of the signal being analyzed. The improvement that the
WT makes over the STFT is that it can achieve high frequency resolutions with sharper
time resolutions. Figure 1.7 shows the continuous wavelet transform of 10.2 milliseconds
of vibration signal measured from the test rig. Figure 1.7a relates to bearings in good
condition and Figure 1.7b shows the WT for the vibration signal of bearings with the
inner race defect.
As shown in Figure 1.7a, the wavelet transform of signals from a bearing in good
condition is dominated by the low frequency resonances below approximately 8 kHz
which are excited by the gear mesh harmonics. No periodic structure is apparent over the
short time segment which is being considered. For the bearing with the inner race defect,
the high frequency region of the WT, as depicted in Figure 1.7b, is dominated by the
excitation of the structural resonances as each rolling element in the load zone encounters
the inner race defect. The wavelet transform provides a clear indication of the leading
edge of each impulse and then the subsequent damped oscillation across a wide range of
high frequencies. Although the WT provides a sharp indication of the time response at
higher frequencies, the corresponding frequency response is not very clear and it
becomes difficult to detect the individual structural resonances excited by the defect-
induced impact.
1.7 Objective of the Thesis
For the development of a reliable wayside bearing inspection system, it is desirable
that the system be sensitive to bearing signature but robust to the fluctuation in operating
and environmental conditions. Pattern recognition techniques have been investigated in
recent years as an effort to improve the reliability of fault detection and diagnosis
[Batchelor, 1978; Sun, et al., 1997, 1998; Wang, et al., 1998]. In particular, Sun et al.
[Sun, et al., 1997, 1998] proposed to extract feature parameters through time domain and
frequency domain analysis of bearing vibration signals. Pattern recognition is then
achieved based on the extracted features. It is shown that their method has the ability to
address the following issues:
1) Severity of the bearing damage: This is realized by using feature parameters
representing the vibration energy levels due to increased bearing damage;
2) Robustness to bearing loads and rotating speeds: Such time domain parameters as
Peak and RMS values are normalized by the baseline RMS values representing
good bearings. Through in-line measurement of bearing vibrations, the baseline
RMS value is obtained by averaging the vibration signals over all the bearings
mounted on the freight train. It counteracts the effect of fluctuating bearing loads
and rotating speeds, as well as environment conditions;
3) Location of bearing fatigue spalls: Frequency index was proposed to capture the
dominant vibration components in the spectrum that may be associated with
particular bearing characteristic frequencies. The latter can be used to identify the
location of bearing defects.
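
The normalization in item 2) can be sketched as below. The fragment assumes that the baseline is the mean of the individual RMS values over all bearings on the train, which is one possible reading of the averaging described above; the function names and the four short signals are hypothetical.

```python
import math


def rms(x):
    """Root Mean Square value of a sampled signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def normalized_features(signals):
    """Normalize each bearing's Peak and RMS by the train-wide mean RMS (assumed baseline)."""
    rms_values = [rms(x) for x in signals]
    baseline = sum(rms_values) / len(rms_values)
    return [{"peak": max(abs(v) for v in x) / baseline, "rms": r / baseline}
            for x, r in zip(signals, rms_values)]


# Hypothetical train with three quiet bearings and one noisy bearing.
healthy = [1.0, -1.0, 1.0, -1.0]
damaged = [3.0, -3.0, 3.0, -3.0]
train = [healthy, healthy, healthy, damaged]

features = normalized_features(train)
# Scaling every signal by the same factor (e.g. a heavier load) leaves the result unchanged.
scaled = normalized_features([[2.5 * v for v in x] for x in train])
```

Because every signal is divided by the same train-wide quantity, a uniform change in vibration level cancels out, while a bearing that is noisier than its neighbours still stands out. Note that in this simple form a damaged bearing also raises the baseline slightly.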
Above all, their method is shown to be simple to implement and does not require human
operator's knowledge of signal analysis. This is an important feature when it comes to the
development of automatic diagnosis systems. In this thesis, we intend to pursue further
improvement on the pattern recognition technique proposed by Sun et al. Two endeavors
will be attempted along the same line. Firstly, more feature parameters will be
investigated to highlight the time and frequency relations according to the bearing
dynamics. The purpose is to further improve the sensitivity and reliability of the
technique. This will be achieved through segmentation analysis. Secondly, a nonlinear
mapping between the feature space and the classification space will be explored for the
purpose of dimension reduction to facilitate piecewise linear classifications. In [Sun, et
al., 1998], a linear projection based on the least squares principle is applied. This must be
followed by intra-class transformation due to the poor performance of the linear mapping.
In order for the linear classification to be applicable, successful mapping is considered
when data points belonging to the same class are clustered together without overlapping
other classes in the classification space. A multi-layered artificial neural network will be
investigated for use in performing the high fidelity dimension reduction nonlinear
mapping. The intra-class transformation can be eliminated thereby.
The same experimental data as used by Sun et al. will be used. Studies will also be
conducted to compare the results and show the improvements achieved by this work.
1.8 Organization of the Thesis
The thesis is organized into six chapters. In chapter 2, bearing structure, kinematics
and vibration models are described. An understanding of bearing geometry and
kinematics is essential for bearing fault detection as it determines the bearing defect
characteristic frequencies. Vibration models of localized fatigue spalling are also
presented in this chapter.
Chapter 3 focuses on feature extraction to construct the feature space for the pattern
recognition of bearing conditions. Vibration signals obtained from bearings are digitized
and processed through time domain, frequency domain and segmentation analyses. Time
domain analysis can be conducted by calculating the statistical parameters and the energy
level of the vibration signals. Frequency domain parameters are also introduced. The
newly developed parameters based on segmentation analysis are presented as well.
In chapter 4, artificial neural networks are discussed. They are used to perform the
nonlinear mapping from the feature space to the classification space. The basic
architecture of artificial neural networks is discussed, with a focus on the multi-layered
feedforward ANNs. The error back-propagation training algorithm is explained in detail.
The convergence speed, stopping criteria and adjustment of initial and cumulative
weights of the neural network are also discussed. The optimal architecture employed for
the ANN used in this work is presented. The learning-rate adaptation theory and its
application to convergence improvement are also discussed.
Chapter 5 presents experimental studies. A total number of 115 samples provided by
AAR were used for training. A total number of 31 test data (not used in training) from
bearings with defects of different types were used to test the effectiveness of the
developed method. Comparisons were made with and without the segmentation
parameters. It is shown that segmentation parameters increase the sensitivity and
therefore the reliability and efficiency of the pattern recognition technique.
Chapter 6 concludes the thesis with a summary of the results obtained, a discussion of
the limitations of the proposed methods, and some directions for future work.
Figure 1.2. Comparison of two signals with the same Peak value. Signal1 has higher energy and signal2 is spikier.
Signal1: Peak value = 40, RMS = 39.5326, Crest factor = 1.0118, Kurtosis value = 1.0024. Signal2: Peak value = 40, RMS = 17.3845, Crest factor = 2.3009, Kurtosis value = 3.8278.
Figure 1.7. Wavelet Transform computed from bearing vibration signals. a) Bearing in good condition. b) Bearing with inner race defect.
CHAPTER TWO
BEARING KINEMATICS
2.1 Bearing Structure
Rolling element bearings can be grouped into two main types: (a) the ball bearing,
which has point contact; and (b) the roller bearing, which provides line contact on both
the raceways. In general, a ball bearing comprises four principal parts - an inner ring or
race, an outer ring or race, a ball complement, and a ball separator or cage. The inner race
is fastened to the shaft and is grooved on its outer diameter to provide a circular ball
raceway. The outer ring is mounted in a housing and contains a similar grooved circular
ball raceway on its inner diameter. The balls serve to space the inner and outer raceways
apart and provide for smooth relative motion between them. The cage serves to keep the
balls uniformly spaced in the bearing, preventing them from rubbing on each other or
bunching on one side of the bearing. Normally, the inner race carries the rotating
element, but in some applications the inner race may be stationary and the outer race may
carry the rotating element. Figure 2.1 shows a cutaway view of a typical angular-contact
ball bearing. The angular-contact ball bearing is specially designed to carry a heavy thrust
load in one direction. This ability is obtained by including the largest possible number of
balls, by providing high shoulders on one side of the raceway, and by designing the
bearing so that a large angle of contact exists between the balls and the races [Wilcock, et
al., 1957].
Roller bearings are chosen when the load-carrying capacity of similar-sized ball
bearings is inadequate, as they (roller bearings) have greater resistance to fatigue and
suffer less from deflection for a given load. When heavy loads are to be supported, the
multi-row type of unit is chosen, and the rows per set of bearings may be two or four
[Houghton, 1965]. Similarly, a roller bearing consists of four principal elements - an
inner race, an outer race, a complement of rollers, and a separator or cage for the rollers.
In some cases, the inner race is made an integral part of the shaft instead of a separate
member which is mounted on the shaft. The outer race is normally mounted in a
stationary housing, although occasionally the inner race may be stationary while the outer
race rotates. The rollers and the cage function similarly to the balls and the cage of ball
bearings.
The bearing studied in this work is a tapered roller bearing. A typical tapered roller
bearing is illustrated in Figure 2.2. In tapered roller bearings, the rollers are in the shape
of truncated cones. They are mounted in the bearing on an angle as shown in Figure 2.2
in such a way that the axes of all the rollers meet at a point on the center line of the
bearing and the shaft. This type of bearing can carry heavy loads in both radial and axial
directions but must be mounted in very careful alignment [Wilcock, et al., 1957]. Typical
applications include automobile and other heavy-duty wheel bearings.
The tapered roller bearing was introduced into freight cars in the United States in 1954.
The most common design found in service on today's U.S. railroads is the double row
tapered roller bearing which is shown in Figure 2.3. The stationary raceways are located
in the outer ring, which is commonly referred to as the cup. The rotating raceways are
located in the roller assemblies, which are commonly referred to as the cones. The
raceways of the cone and cup form a conical section where the extended lines of contact
of the rolling elements and the track surfaces intersect on the axis of bearing rotation. The
roller elements ride on the rotating raceways, and each roller is separated from adjacent
rollers by the cage assembly. The cone bore diameter is manufactured to be 0.0025 inch
to 0.0045 inch smaller than the axle journal, which results in an interference fit between
the cones and the journal when the bearing is mounted. The two cones are separated by a
spacer ring which sets the amount of bearing endplay. Two grease seals, which press into
the cup and ride on the wear rings, act to retain the bearing lubricant and prevent
lubricant contamination. The bearing is held in place on the axle journal by an end cap
assembly which includes three cap screws.
2.2 Bearing Kinematics
Because the contacts between defects and the mating surfaces in the bearing are
essentially periodic, impulses will recur at regular intervals. The frequency of occurrence
of impulses is referred to as the characteristic defect frequency.
Resonance is usually considered as being amplitude modulated at the characteristic
defect frequency. This is not sinusoidal modulation, as the leading edge comprises a very
sharp rise corresponding to the impact of the defect, while the decay is approximately
exponential, as the energy is dissipated by internal damping. The end result consists of
periodic bursts of exponentially decaying sinusoidal vibration. The frequency of the
vibration is the natural frequency of the resonance, while the decay rate is determined by
the damping [McFadden, et al., 1984b].
An understanding of bearing geometry and kinematics is essential for bearing fault
detection as it determines the rotational speeds of the bearing elements with respect to
each other and the theoretical bearing defect characteristic frequencies.
A number of articles deal with bearing geometry and kinematics [Gustafsson, et al.,
1962; Howard, 1994]. Figure 2.4 shows a schematic of a typical angular contact rolling
element bearing in the general case with rotating inner and outer races.
From the geometry, assuming a constant operating contact angle a, the pitch circle
diameter of the bearing D can be approximated by,
where Di is the diameter of the inner ring and Do is the diameter of the outer ring. The
race diameters can be expressed in terms of the pitch circle diameter, contact angle and
ball diameter d to give,
The circumferential velocity of the bearing components can be derived in terms of the
angular velocity (rad/sec) and radius (m), assuming pure rolling conditions. The inner
race circumferential velocity is given by,
the outer race velocity is given by,
The circumferential velocity of the cage, Vc, is the average of the velocity of the inner
and outer races assuming no slip occurs,
Combined with Eqs. (2.2) - (2.5), the above equation becomes,
The cage frequency in Hz rather than velocity can therefore be determined by,
Eq. (2.8) is also referred to as the fundamental train frequency (FTF) for rolling
element bearings. In the case of the outer race being stationary, Eq. (2.8) can be further
simplified to:
The rotation frequency of the rolling elements with respect to the inner race is
calculated as:
With Z rolling elements, the expression for the ball pass frequency on the inner race
can be found using Eq. (2.11) to give,
Similarly with the outer race being stationary, this leads to the familiar expression for the
ball pass frequency on the inner race,
The frequency of rotation of the rolling elements with respect to the outer race can
likewise be derived by,
With Z rolling elements, the expression for the ball pass frequency on the outer race
becomes,
and when the outer race is stationary, this leads to the familiar expression for the ball pass
frequency on the outer race,
The frequency of the rolling elements spinning about their own axes can also be
derived. The frequency of spinning assuming no slip is given by:
Combined with Eqs. (2.2) and (2.11), the above equation becomes,
which is the general form of the ball spin frequency.
Eqs. (2.8), (2.11), (2.14) and (2.17) are the general forms of the bearing defect
characteristic frequency equations presented in the literature assuming no slip and with
both races rotating. Slip only has a strong effect at high speeds and light loads; it is
relatively unimportant under normal conditions.
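
For reference, the relations described in this section can be written out in the standard form that follows from the no-slip assumption. This is a reconstruction from the verbal derivation rather than a reproduction of the numbered equations: the symbols f_i, f_o and f_c (inner race, outer race and cage rotation frequencies), f_bpfi, f_bpfo and f_b are notation assumed here, the abbreviation gamma is introduced only for brevity, and no equation numbers are assigned.

```latex
% Reconstructed under the no-slip assumption; notation is assumed, not the author's.
\begin{align*}
D &\approx \tfrac{1}{2}\,(D_i + D_o), \qquad
D_i = D - d\cos\alpha, \qquad
D_o = D + d\cos\alpha \\
V_i &= \tfrac{1}{2}\,\omega_i\,(D - d\cos\alpha), \qquad
V_o = \tfrac{1}{2}\,\omega_o\,(D + d\cos\alpha), \qquad
V_c = \tfrac{1}{2}\,(V_i + V_o) \\
f_c &= \frac{V_c}{\pi D}
     = \tfrac{1}{2}\left[f_i\,(1-\gamma) + f_o\,(1+\gamma)\right],
     \qquad \gamma = \frac{d}{D}\cos\alpha \\
f_c &= \tfrac{1}{2}\,f_i\,(1-\gamma)
     \qquad \text{(outer race stationary)} \\
f_c - f_i &= \tfrac{1}{2}\,(f_o - f_i)\,(1+\gamma), \qquad
f_{bpfi} = \tfrac{Z}{2}\,(f_o - f_i)\,(1+\gamma) \\
f_o - f_c &= \tfrac{1}{2}\,(f_o - f_i)\,(1-\gamma), \qquad
f_{bpfo} = \tfrac{Z}{2}\,(f_o - f_i)\,(1-\gamma) \\
f_b &= (f_i - f_c)\,\frac{D_i}{d}
     = \frac{D}{2d}\,(f_i - f_o)\,\bigl(1-\gamma^{2}\bigr)
\end{align*}
```

With the outer race stationary (f_o = 0), the magnitudes of the ball pass frequencies reduce to (Z/2) f_i (1 + gamma) on the inner race and (Z/2) f_i (1 - gamma) on the outer race, so that their sum equals Z f_i.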
The derivation as illustrated in Figure 2.4 has assumed positive rotations to be
clockwise and negative rotations to be anti-clockwise. Therefore, as given in Eqs. (2.8),
(2.11), (2.14) and (2.17), a final negative value will denote anti-clockwise rotation of the
bearing components. These derived equations are used for calculating the defect
characteristic frequencies of roller bearings in this work as listed in Table 2.1
(dimensions of the bearing component are listed in Table 5.2). Although roller bearings
have a somewhat different structure from ball bearings, the basic principles can still be
applied. Moreover, we apply a pattern recognition technique to the bearing defect diagnosis
which has the ability to recognize bearing conditions even when the calculations are not
perfectly precise, as long as all the calculations are based on the same approximation.
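
As an illustration of how such frequencies might be evaluated numerically, the sketch below implements the standard no-slip expressions and returns magnitudes only, so the sign convention for the direction of rotation is not carried through. The function name, argument names and the example geometry are hypothetical; they are not the dimensions of the bearing studied in this work.

```python
import math


def defect_frequencies(f_inner, f_outer, n_elements, d, pitch_d, contact_deg):
    """Magnitudes of the characteristic frequencies (Hz) under the no-slip assumption.

    f_inner, f_outer: rotation frequencies of the inner and outer races (Hz)
    n_elements: number of rolling elements Z
    d, pitch_d: rolling element diameter and pitch circle diameter (same units)
    contact_deg: contact angle in degrees
    """
    g = d / pitch_d * math.cos(math.radians(contact_deg))
    f_cage = 0.5 * (f_inner * (1 - g) + f_outer * (1 + g))
    rel = abs(f_outer - f_inner)  # relative rotation frequency of the two races
    return {
        "ftf": abs(f_cage),                              # fundamental train (cage) frequency
        "bpfi": 0.5 * n_elements * rel * (1 + g),        # ball pass frequency, inner race
        "bpfo": 0.5 * n_elements * rel * (1 - g),        # ball pass frequency, outer race
        "bsf": 0.5 * pitch_d / d * rel * (1 - g * g),    # ball spin frequency
    }


# Hypothetical geometry: 8 elements, d/D = 0.2, zero contact angle, 10 Hz shaft.
freqs = defect_frequencies(f_inner=10.0, f_outer=0.0, n_elements=8,
                           d=10.0, pitch_d=50.0, contact_deg=0.0)
```

For this made-up geometry the function returns 4 Hz, 48 Hz, 32 Hz and 24 Hz for the cage, inner race, outer race and spin frequencies respectively; a useful sanity check is that the inner and outer race values sum to the number of elements times the shaft frequency.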
2.3 Vibration Models of Localized Fatigue Spalling
The bearing frequency equations provide a theoretical estimate of the frequencies to
be expected when various defects occur on the bearing elements, based upon the
assumption that an ideal impulse will be generated whenever a bearing element
encounters the defect. Impulses are generated when localized bearing defects such as
fatigue spalls occur on the bearing components. The initial model of the vibration
generation mechanism was developed by McFadden [McFadden, et al., 1984a]. It
considers the vibration produced as the rolling elements encounter the defect to consist of
a series of impulses representing the transient force imbalance to the machine structure.
As the shaft rotates, the impulses occur periodically with the characteristic frequencies
depending on the location of the defect. Defects on the inner race of a bearing with a
stationary outer race were considered assuming the bearing is operating under radial
loads. The resulting modulation of the impulses with shaft rotation as the defect rotates in
and out of the load zone was considered. By considering the response of a typical
structural resonance, the vibration measured from each impulse was assumed to take the
form of an exponentially decaying sinusoid. The resulting vibration as measured by the
transducer was shown to be a combination of the periodic impulses, modulation due to
rotation through the load zone and the exponential decay of the impulses due to internal
structural damping. The complete model was experimentally verified for an inner race
defect. Vibrations measured from a bearing test rig confirmed that for inner race localized
defects the predominant features consist of shaft frequency harmonics, the ball pass
frequency on the inner race f_bpfi modulated by shaft frequencies, and multiple harmonics
thereof.
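The three ingredients of this model (periodic impulses, load-zone modulation and a decaying resonance response) can be sketched numerically. The following is a minimal illustration only, not McFadden's formulation: the sampling rate, the frequencies, the decay rate and the cosine-shaped load weighting are all assumed values chosen for the example.

```python
import numpy as np

def simulate_inner_race_signal(fs=20000.0, duration=1.0, f_shaft=10.0,
                               f_bpfi=87.0, f_res=3000.0, decay=800.0,
                               load_zone_deg=120.0):
    """Impulse train at f_bpfi, gated by the load zone, ringing one resonance."""
    n = int(fs * duration)
    t = np.arange(n) / fs
    # One impulse each time a rolling element passes the inner race defect.
    impulses = np.zeros(n)
    hit_idx = np.round(np.arange(0.0, duration, 1.0 / f_bpfi) * fs).astype(int)
    hit_idx = hit_idx[hit_idx < n]
    # Defect angle relative to the load-zone centre, wrapped to (-pi, pi].
    angle = np.angle(np.exp(1j * 2.0 * np.pi * f_shaft * t[hit_idx]))
    half = np.deg2rad(load_zone_deg) / 2.0
    # Assumed load weighting: cosine-shaped inside the zone, zero outside.
    weight = np.where(np.abs(angle) < half,
                      np.cos(angle * np.pi / (2.0 * half)), 0.0)
    impulses[hit_idx] = weight
    # Response of a single structural resonance: exponentially decaying sinusoid.
    t_h = np.arange(int(0.01 * fs)) / fs
    h = np.exp(-decay * t_h) * np.sin(2.0 * np.pi * f_res * t_h)
    return t, np.convolve(impulses, h)[:n]
```

With these assumed values the defect spends about one third of each shaft revolution inside the load zone, so the simulated signal shows bursts of impulses repeating at the shaft frequency.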
Su, et al. [Su, et al., 1992] extended the original work by McFadden to characterize
the vibrations measured from bearings subjected to various loading conditions and with
defects located on various bearing components. The main development of the work was
the determination of the periodic characteristics of various loading and its influence on
the vibration. The effect is generally associated with the misalignment or dynamic
unbalance of the shaft, the axial or radial loading, the preload and manufacturing
imperfections.
Su's work presented the main causes of periodicities and the resulting effect of
defects on the various bearing components. For a roller defect, the vibration pattern
produced is in some respects similar to that produced by a bearing with an inner race
defect as discussed above. The defective roller revolves with the cage frequency and the
defect contacts the inner and outer race alternately. The relative angular frequency
between the defective roller and the load will be the cage frequency f_c. The contact point
for the defect will move alternately from the inner race to the outer race at twice the
ball spin frequency, 2 x f_bsf. Thus the predominant features in the vibration consist of cage
frequency harmonics, the roller defect frequencies, modulated by cage frequencies, and
multiple harmonics thereof.
For an outer race spall, with fixed outer race, the damage site remains in a fixed
position relative to distribution of load around the bearing. The resulting vibration will
not be modulated with either the shaft frequency f_s or the cage frequency f_c. The impulses
occur periodically with the ball pass frequency on the outer race f_bpfo. However, if the
shaft has unbalance or the rollers have diameter errors, the periodic variation will occur
with the shaft frequency f_s due to unbalance rotating at the shaft frequency or the cage
frequency f_c due to a non-uniform load distribution revolving with the cage assembly.
Having obtained a model to predict the possible bearing frequencies and harmonics
for the various types of localized fatigue damage, the pattern of expected frequencies can
be searched for as part of routine bearing condition monitoring. Further work has shown
that the analysis of the magnitude of the defect frequencies relative to each other
improves reliability [Su, et al., 1993].
The modelling of bearing defects other than localized spalling has received little
attention. The relevant frequencies which can occur are not readily apparent or
necessarily static in time. This makes detection and diagnosis of bearing damage using
frequency analysis difficult for all but the straightforward cases of fatigue spalling. In this
work, we focus on the localized bearing spalling caused by fatigue. These widely
accepted vibration models prompt the development of some distinctive features which
will be detailed in chapter 3.
Table 2.1 Bearing defect characteristic frequencies
(Dimensions of bearing components are listed in Table 5.2).
CHAPTER THREE
FEATURE EXTRACTION FOR PATTERN RECOGNITION
3.1 Feature Extraction
One of the greatest problems encountered when applying pattern recognition
techniques to the analysis of vibration signals is deciding on the method of feature
extraction to be used. Extracting feature parameters from the measured data is most
critical for effective fault detection and diagnosis pnal, 19941. Diagnostics based on
pattern recognition become more efficient and precise if correct feature parameters are
employed. Therefore, feature extraction becomes a very crucial component. In feature
extraction, the knowledge of the real system dictates the number of the feature space
dimensions. In other words, the better the system is known, the easier its monitoring and
diagnostics will be.
Ideally, features are selected so that they uniquely represent certain characteristics
of the system. However, the challenge lies in the fact that it is not always straightforward
how to select the feature parameters. It also depends on the system we are dealing with.
It is also desirable that the selected features are robust to noise and operating
conditions. In dealing with vibration signals, features can be extracted using various
signal processing techniques, such as time domain and frequency domain analyses.
3.2 Time Domain Parameters
When a fault occurs in a bearing, abnormal behavior can be seen from the vibration
signals, e.g. sharp impulses for incipient damage and a higher energy level for more
developed defects. Figure 3.1(a, b) shows vibration signals taken from a bearing in good
condition and with defects, respectively. It can be seen that the vibration amplitude of the
defective bearing is much higher than the bearing in good condition. Time domain feature
extraction can be conducted by calculating the statistical parameters, which provides
information about probability density distribution that can indicate the spikiness of the
signal associated with the defect induced impulses. Peak and Root Mean Square values
are also included to indicate the severity of bearing defects [Sun, et al., 19991. These
parameters prove to be simple and effective in identifying bearing faults caused by fatigue
spalls [Sun, et al., 1998; Sun, et al., 19991.
3.2.1 Probability Density Function
Local discontinuity of the material on the surface of bearing raceways or rolling
elements produces a series of impulses in vibration signals which can be modulated with
the bearing rotation and superimposed onto a random background vibration. Due to
damping in the bearing material and fluids, impulse signals quickly decay in time until
next impulse is generated. Patterns exist that can be associated with the location and
severity of the fatigue spalls. For instance, on-set defects tend to generate clean and
spikier impulses. Frequencies of these impulses could help identify the location using
characteristic frequency calculations introduced in chapter 2.
The amplitude characteristics of a vibration signal X(t) (assumed to be a stationary
random process) can be expressed in terms of a probability density function (PDF) [Dyer,
et al., 1978; Alfredson, et al., 1985b; Bannister, 1985; Mathew, 1989]. This is estimated
by determining the time duration for which a signal remains in a set of amplitude
windows.
\[ P\left(x \le X(t) \le x + \Delta x\right) = \sum_{i} \frac{\Delta t_i}{T} \]
where Δt_i is the time duration of the vibration signal X(t) falling into the amplitude
window Δx, and T is the total time duration of the vibration signal.
The above equation, evaluated for all x with Δx small, results in an estimate of the probability
density function for X(t) (at selected life times) shown in Figure 3.2. The PDF of a good bearing
and a defective bearing are represented by the solid line and the dashed line respectively.
It can be observed that a good bearing with random vibrations has a Gaussian
distribution, while changes in the distribution curve, particularly at the lower values of
the PDF, indicate early stages of bearing failure. Note that a logarithmic scale was used
for the vertical axis to highlight the behavior at the extreme limits of the distributions,
such as the changes at low probability which have been found important in detection of
bearing damage. The horizontal axis is the acceleration of the vibration signal normalized
to the standard deviation.
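For a uniformly sampled record, the fraction of time spent in each amplitude window equals the fraction of samples falling in it, so the estimate above reduces to a normalized histogram. The sketch below assumes a digitized signal and arbitrary choices for the number of windows and the amplitude span (in standard deviations); the function name is hypothetical.

```python
import numpy as np

def amplitude_pdf(x, n_bins=61, span=6.0):
    """Estimate the PDF of x on amplitude windows normalized by the standard deviation."""
    x = np.asarray(x, dtype=float)
    z = (x - np.mean(x)) / np.std(x)
    edges = np.linspace(-span, span, n_bins + 1)
    # Fraction of samples (i.e. of the total time T) spent in each window.
    counts, _ = np.histogram(z, bins=edges)
    prob = counts / z.size
    centres = 0.5 * (edges[:-1] + edges[1:])
    # Dividing by the window width turns probabilities into a density.
    density = prob / np.diff(edges)
    return centres, prob, density
```

Plotting `density` against `centres` on a logarithmic vertical axis gives a curve of the kind described for Figure 3.2, where changes in the tails are easiest to see.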
Probability density curve derived from machinery vibration signals can be used in
monitoring machine conditions. It has been shown that the normalized PDF of the
vibration signal does not vary with load and speed but changes as the condition of a
bearing deteriorates [Li, et al., 19921. With advancing damage the tails of the PDF
initially broaden. The high levels of probability density at the median and the large
spread at low probabilities, are characteristics of highly impulsive time domain
waveforms [Mathew, 1989]. It is possible to quantify the variations in the skirt of the
probability distribution by taking statistical moments which will be discussed in section
3.2.3. However, when the pitting and subsequent spalling has spread over most of the
working surfaces of the rolling element bearings, the probability density returns to the
basic Gaussian form once again [Li, et al., 1992].
3.2.2 Root Mean Square and Peak Value
Root Mean Square (RMS) is often used to indicate the energy level of vibrations.
Peak designates the maximum amplitude of vibrations. They are defined as:
\[ \mathrm{Peak} = E\left(\max |x(t)|\right) \]
where x(t) is the random vibration signal, p(x) is the amplitude probability density
function of x(t) and E represents the expected value.
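For a digitized record the two quantities can be computed directly from the samples. This is a minimal sketch assuming the standard definitions: RMS as the square root of the mean squared amplitude, and Peak as the largest absolute amplitude in a single record (the expectation over repeated records is not taken here).

```python
import numpy as np

def rms(x):
    """Root mean square of the sampled signal."""
    return float(np.sqrt(np.mean(np.square(x))))

def peak(x):
    """Largest absolute amplitude in the record."""
    return float(np.max(np.abs(x)))
```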
RMS is a simple measure of the effective energy or power content of the vibration.
It can be used to indicate deterioration of the bearing conditions. The incipient damage
can be detected by changes in peak values [Gustafsson, et al., 1962; Tandon, 1994].
Gustafsson et al. assessed bearing condition by a comparison of peak counts for the
measured signal and for a signal with a Gaussian amplitude distribution.
At the early stage of bearing damage when the impact signals are just evident,
discrete signals occur but leave the total vibrational energy relatively unchanged.
Therefore the RMS of the signal remains virtually unchanged while an increase occurs in
the peak value [Dyer, et al., 1978; Bannister, 1985]. The RMS value increases due to the
presence of more peaks from more severe damage, but without necessarily increasing
the level of the peak value. Eventually as the damage becomes more advanced, both the
RMS and the Peak values increase steadily. Therefore, a combination of RMS and Peak
value could well indicate the severity of bearing defects.
Although RMS and Peak values can be applied to reflect the energy level of the
vibration, they cannot be used for single snapshot detection of bearing damage as the
expected values generally exhibit wide range depending on the operating conditions such
as load and speed and the testing environment. Unless the measured values of RMS and
Peak can be compared with the baseline values for a system under the same operating
conditions, they cannot be used effectively. For railway applications it is particularly
difficult to determine the baseline of the bearing from a single measurement since no
particular information about the operating conditions of the passing freight car is
available.
Sun, et al, proposed to use normalized values of Peak and RMS to take into
account the operational condition and non-defect induced vibrations:
where RMSo is considered as the reference value for an undamaged bearing. There are
different ways of obtaining this value depending on the specific application. For bearings
used in fixed machinery, RMSo could be the value taken when the bearings are in good
condition and under ordinary operating conditions. For railway bearing condition
monitoring, RMSo could be the average of the RMS values of all the signals taken from
all the bearings passing by the sensor.
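The normalization can be sketched as follows. It is assumed here that both RMS and Peak are divided by the reference RMSo; this is consistent with Table 3.1b, where a good bearing has a normalized RMS of 1 and a normalized Peak equal to its Crest factor, but it remains an assumption. The function names are hypothetical.

```python
import numpy as np

def rms(x):
    """Root mean square of the sampled signal."""
    return float(np.sqrt(np.mean(np.square(x))))

def normalised_rms_peak(x, rms_ref):
    """Return (Rv, Pk): RMS and Peak of x, each divided by the reference RMSo."""
    return rms(x) / rms_ref, float(np.max(np.abs(x))) / rms_ref

def fleet_reference_rms(signals):
    """Railway case: RMSo taken as the mean RMS over all bearings passing the sensor."""
    return float(np.mean([rms(s) for s in signals]))
```

Using the fleet average as the reference relies on most passing bearings being healthy, so that a defective bearing stands out against the common operating condition.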
3.2.3 Statistical Parameters
Time domain statistical parameters have been used as one-off and trend parameters
in an attempt to detect the presence of incipient bearing damage. The commonly used
non-dimensional vibration amplitude parameters are the Crest factor (Cf), Kurtosis
value (Kv), Impulse factor (If) and Clearance factor (Clf) [Howard, 1994]. These
parameters are derived from the amplitude probability density function of vibration
signals from the test object [Li, et al., 19921. They are defined as:
where x(t) is the amplitude of the vibration and p(x) is the PDF of x(t).
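As a hedged illustration, the four non-dimensional parameters can be computed from sampled data using definitions that are common in the condition monitoring literature: Crest factor as Peak over RMS, Kurtosis as the normalized fourth central moment, Impulse factor as Peak over the mean absolute amplitude, and Clearance factor as Peak over the squared mean of the square-rooted absolute amplitude. These forms are assumed here and may differ in detail from the definitions used in this work.

```python
import numpy as np

def shape_factors(x):
    """Crest, Kurtosis, Impulse and Clearance factors under the assumed definitions."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    pk = abs_x.max()
    rms_val = np.sqrt(np.mean(x ** 2))
    centred = x - x.mean()
    return {
        "Cf": pk / rms_val,
        "Kv": np.mean(centred ** 4) / np.mean(centred ** 2) ** 2,
        "If": pk / abs_x.mean(),
        "Clf": pk / np.mean(np.sqrt(abs_x)) ** 2,
    }
```

All four are ratios, so scaling the signal by a constant leaves them unchanged, which is why they respond to the shape of the waveform rather than to its overall level.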
The Crest factor, which is the ratio of Peak and RMS values, is reported to be
effective in indicating the spikiness of the vibration amplitude. It is relatively insensitive
to changes in bearing speed and load [Alfredson, et al., 1985]. It permits a direct
assessment of bearing conditions with minimal knowledge of its history. Crest factor is
partially effective in indicating bearing on-set defects as they tend to cause sharp
impulses in the vibration signals. Therefore abrupt increase in Crest factor value can be
observed. As the number of impulses per cycle increases with more extensive damage,
the RMS value increases while the Peak value remains unchanged. The net effect is that
the Crest factor will decrease.
A series of statistical moments can be used to indicate the shape of the probability
density distribution [Mathew, 1987]. These can be defined by the following integral
[Dyer, et al., 1978]:
where n represents the order of the statistical moment, m is the maximum order under
consideration. The first and second moments are known as the mean value and the
standard deviation respectively. These are analogous to the first and second area
moments of inertia with the area shape defined by the probability density function. The
third moment is Skewness and the fourth moment is Kurtosis. For practical signals the
odd order moments are usually close to zero, indicating a symmetrical amplitude
distribution, whereas the higher even order moments are sensitive to the impulsiveness in
the signal.
The fourth moment Kurtosis value has been selected as a feature parameter in this
work since it is a compromise measure between the insensitive lower moments and the
over-sensitive higher moments. It has a value of 3.0 for an undamaged bearing, indicating
a Gaussian probability density function. A value as high as 6.0 can be used to signify
incipient bearing damage, indicating that the skirt of the PDF has changed appreciably
and is no longer Gaussian.
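The behavior of the moments can be checked numerically. The sketch below assumes moments taken about the mean and normalized by the standard deviation, so that the fourth moment is the Kurtosis value; the test signals (Gaussian noise with a few added spikes) are invented for illustration.

```python
import numpy as np

def normalised_moment(x, n):
    """n-th moment of x about its mean, normalized by the n-th power of the standard deviation."""
    x = np.asarray(x, dtype=float)
    z = (x - np.mean(x)) / np.std(x)
    return float(np.mean(z ** n))

rng = np.random.default_rng(1)
healthy = rng.normal(size=200000)      # random, Gaussian-like vibration
damaged = healthy.copy()
damaged[::2000] += 15.0                # sparse impulses from an assumed defect

kv_healthy = normalised_moment(healthy, 4)
kv_damaged = normalised_moment(damaged, 4)
```

For the Gaussian record the third moment stays near zero and the fourth near 3.0, while the sparse impulses raise the fourth moment well above 3.0 even though they contribute little to the overall energy.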
Similar to the Crest factor, the advantage of employing Kurtosis is that it is robust
to bearing operating conditions. This parameter is reported to be sensitive to failure in
rolling element bearings [Dyer, et al., 19781. In this case, cracked raceways and rolling
elements can cause large impulses in the time domain waveform.
Both Crest factor and Kurtosis value are independent of the actual magnitude of the
vibration but respond more to the spikiness of the vibration signal. These two parameters
produce values of approximately 3.0 which indicate that the waveform is generally
random in nature when the bearing is in good condition. They will increase dramatically
when a fatigue spall is introduced. However, progression of the damage does not make
them increase continuously.
For instance, in the case of inner race defect, the circumferential extent of the crack
is much less than the distance between two rollers at the early stage when a fatigue spall has
just developed. The system is excited by discrete shock loads. As damage propagates and
spreads around the periphery of the loading region of the bearing, the vibration pattern
becomes more random. The impulsive content in the waveform gradually decreases and
the vibration signal appears to be continuous. Crest factor and Kurtosis value will then
reduce to normal in all frequency ranges. Therefore, these two parameters alone cannot
provide direct information on the severity of the bearing damage. Noting that significant
increase of overall vibration energy often accompanies more severe damage, a
combination of these two parameters and RMS value should be applied simultaneously.
This motivates us to develop the pattern recognition technique.
Impulse and Clearance factors have similar effects as Crest factor and Kurtosis
value [Alguindigue, et al., 1993]. Li and Pickering [Li, et al., 1992] showed that the
Impulse factor (If), Crest factor (Cf), Kurtosis value (Kv) and Clearance factor (Clf) are
all sensitive to early fatigue spalling. Kurtosis value is the most sensitive parameter yet
the least robust, while Clf is most robust to the operating conditions. It shows that these
parameters are sensitive to early spalling but may lead to inconsistent results if used in
isolation.
Two approaches can be used to calculate the time domain parameters. The first is to
calculate them for the entire frequency range of the digitized signal, and the second is to
break the signal into various frequency bands and obtain the parameters for each band. A
number of frequency bands can be chosen for computation and trending of the statistical
parameters. Khan [Khan, 1991] recommended the use of at least two frequency bands.
One is in the base band dominated by the defect frequencies, low order harmonics and
sidebands (below 5 kHz). The other is in the pass band where it will be dominated by
structural resonance of the system (5 - 40 kHz).
In this work, the aforementioned six time domain parameters, RMS, Peak value,
Crest factor, Kurtosis value, Impulse factor and Clearance factor, are calculated using
two frequency bands: 1) the base band (0-1500 Hz) which contains all the bearing
characteristic frequencies at various running speeds; and 2) the pass band (2-10 kHz)
where vibration signals will be dominated by the structural resonance. Only one of the
two sets of the six parameters will be retained for constructing the feature space. Either of
the two sets can be used if they have similar values, otherwise the set which deviates
from the normal values is used. The time domain parameters computed for the vibration
signals shown in Figure 3.1 are listed in Table 3.1a. Significant changes can be observed
when fatigue spall occurs on the outer raceway. Table 3.1b shows the time domain
parameters calculated for the same bearing running at different speeds. We can see that
their values also change dramatically at different speeds due to the occurrence of the
defect. It can also be noted that the RMS values (Rv) of the good bearing at different
speeds are all normalized to 1 to provide the references for defective bearings.
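One simple way to obtain the band-limited signals before computing the time domain parameters is to zero the spectral components outside the band of interest. The sketch below uses an ideal FFT mask, which is an assumption made for brevity; the filtering method actually used in this work is not specified here, and the function name is hypothetical.

```python
import numpy as np

def band_limit(x, fs, f_lo, f_hi):
    """Keep only spectral content between f_lo and f_hi (Hz) using an ideal FFT mask."""
    x = np.asarray(x, dtype=float)
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spec[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))

# Example with an assumed 20 kHz sampling rate and two test tones.
fs = 20000.0
t = np.arange(20000) / fs
low_tone = np.sin(2.0 * np.pi * 100.0 * t)
high_tone = np.sin(2.0 * np.pi * 5000.0 * t)
base_band = band_limit(low_tone + high_tone, fs, 0.0, 1500.0)
pass_band = band_limit(low_tone + high_tone, fs, 2000.0, 10000.0)
```

The six parameters would then be evaluated once on `base_band` and once on `pass_band`, giving the two candidate sets described above.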
3.3 Frequency Domain Parameters
Time domain parameters are useful in detecting defects but they cannot be directly
used to indicate the defect's location, e.g. inner race, outer race or the roller.
Conventionally, diagnosis of bearing defects and detecting the presence of periodical
components in a signal have been done by analyzing the spectrum of the signal. Figure
3.3(a, b) shows the frequency spectra of the vibration signals shown in Figure 3.1(a, b).
Distinctive peaks can be observed in the frequency spectrum of the defective bearing
compared to that of the good bearing due to defect-induced impulses.
Contact stresses at the interface between the rollers and the raceways are relatively
high. Abrupt changes in the stress caused by the passage of defects result in impulsive
excitations to the structure. This impulsive force may excite resonance in the bearing and
the housing structure. The excitation decays quickly due to damping of the structure.
Passage on the fatigue spall produces a series of damped oscillations with the time
intervals between the two consecutive peaks corresponding to the time between the
passing of the fatigue spall [McFadden, 1990]. The cage frequency f_c, and ball passing
frequencies for defects located on the inner race, outer race and rolling elements, denoted
as f_bpfi, f_bpfo and f_bsf respectively, can be determined from the bearing kinematics. As discussed
in chapter 2, they are functions of bearing geometry and shaft rotating speed. It is now a
well-established fact that impulsive vibrations can be observed on bearings with fatigue
spall. Therefore, the onset of failure can be detected by noting a predominance at one of
the passage frequencies [Bannister, 1985].
Although the defect characteristic frequencies could be used to help determine the
location of the defect, automatic detection of impulses at these frequencies is not a simple
task. This is because frequency spectra often show much stronger peaks at much higher
frequencies representing high order structural resonance compared to those at the
characteristic frequencies. Vibration energy of the bearing spreads across a wide
frequency band and can be easily buried in the noise.
A frequency index (Fi) is proposed to highlight significant frequency contents that
may be associated with the bearing defect characteristic frequencies [Sun, et al., 1998].
It is computed as the ratio of the frequency in the base band with the maximum magnitude,
f_max, to the bearing rotation speed f_s:
The base band here is defined to contain all the bearing characteristic frequencies at
various running speeds. In the case of random vibration, the frequency index varies over a
wide range as shown in Table 3.2a. This implies that Fi gives no consistent indication for
non-defect related vibrations. In contrast, a defect-induced vibration signal will give a
consistent index as shown in Table 3.2b.
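Read literally, the index is the frequency of the strongest base band spectral line divided by the shaft rotation frequency. A minimal sketch under that reading follows; the FFT-based peak search, the 0-1500 Hz default band and the function name are assumptions made for illustration.

```python
import numpy as np

def frequency_index(x, fs, f_shaft, base_band=(0.0, 1500.0)):
    """Frequency of the largest base band spectral peak divided by the shaft frequency."""
    x = np.asarray(x, dtype=float)
    spec = np.abs(np.fft.rfft(x - np.mean(x)))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= base_band[0]) & (freqs <= base_band[1])
    f_max = freqs[in_band][np.argmax(spec[in_band])]
    return f_max / f_shaft

# Example: a dominant 85 Hz component with the shaft turning at 10 Hz.
fs = 20000.0
t = np.arange(20000) / fs
x = np.sin(2.0 * np.pi * 85.0 * t) + 0.2 * np.sin(2.0 * np.pi * 3000.0 * t)
fi = frequency_index(x, fs, f_shaft=10.0)
```

Because the defect frequencies scale with shaft speed, a defect-related peak yields the same index at every running speed, whereas for a good bearing the strongest line is set by noise and the index wanders.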
3.4 Segmentation Analysis and Parameters
3.4.1 Segmentation Analysis
Earlier work on pattern recognition for bearing diagnostics using time domain and
frequency domain parameters showed promising results [Sun, et al., 1997, 19981. We
intend to further improve the sensitivity and reliability of the technique by introducing
segmentation parameters. Since the vibration signal of a bearing with defects is generally
non-stationary, segmentation analysis can be applied to characterize such a signal through
segmenting the signal into quasi-stationary components based on the understanding of
bearing dynamics.
Theoretically, due to the clearance between the mating surfaces in bearing assembly,
impulses can only be generated from the passage of defects when they are inside the load
zone. In the load zone, contact stresses at the interface between the rolling elements and
the raceways are relatively high. Abrupt changes in the stress caused by the passage of
defects generate impulses in the vibration signal. Therefore, impulses modulated with the
shaft or cage frequency can be detected.
Figure 3.4a shows a typical vibration signal recorded from a bearing with inner
race defects. It is obvious that impulses are grouped at the frequency of the shaft rotation,
and therefore the inner race rotation. If we further inspect details of the signal within one
inner race rotation (Figure 3.4b), sharp impacts occurring at the ball passing inner race
frequency can be observed. Furthermore, these impulses only exist in approximately one
third of a complete inner race rotation, which corresponds to the period when defects pass
through the load zone of the bearing. This observation confirmed our explanation of the
bearing dynamics response.
A schematic of a bearing under radial loads is shown in Figure 3.5, where
raceways, cage and rolling elements are identified. The figure also shows the load
zone with distributed contact stresses over an angular range of about 120°. Defects
located on different bearing components will generate impacts with different frequencies
and modulation patterns when passing through the load zone. Correlation exists between
the location of the defects and the impulse patterns observed in the vibration signal.
Parametric descriptions of various impulse patterns are possible through segmentation
analysis.
Segmentation of a non-stationary signal can be performed in two ways: fixed
segmentation and adaptive segmentation. A fixed-length segmentation scheme is used in
this work in order to reduce the computational expense of the process. An important
consideration in the fixed segmentation process is the selection of the segment length,
which should reflect both the accuracy and the efficiency. In adaptive segmentation, the
segment length can be decided according to the statistical measure of the signal dynamics
at different time periods. Adaptive segmentation may lead to a minimum number of
segments, however, at the expense of increased computational burden. The
autocorrelation function method [Bodenstein, et al., 1977; Michael, et al., 1979] uses
values of the short-time autocorrelation function to determine the boundaries between
different segments.
To investigate and quantify the impulse patterns of the vibration signal with defects
located at different bearing components, the following cases are studied:
Case 1: Inner Race Defect
For a bearing with inner race defects, the distribution of load around the bearing is
often non-uniform as the inner race of the bearing rotates. This is typified by a bearing
under radial load, in which the load around some part of the bearing may be small or, in
the case of a bearing with clearance, even zero. Such a load distribution usually covers a
range of about 120 degrees. The magnitude of the impact produced when a rolling
element strikes a spall will clearly depend on the location of the spall inside the load
zone. As inner race rotates, the magnitude of the defect-induced impacts will vary
periodically with the shaft rotation frequency.
The vibration signal from a bearing with inner race defects is shown in Figure 3.4a.
Obvious variations of the vibration signals related to defects inside or outside of the
loading zone can be seen in one revolution corresponding to the rotation of the shaft. If
we divide the signal in one shaft revolution into six segments, at least one segment will
be completely inside the load zone and one completely outside of the load zone (Figure
3.4b). Descriptive features of these segments can be calculated through time domain
parameters, as indicated in Table 3.3a. Compared to the values of the parameters for the
bearing in good condition (Table 3.4a), obvious variations of the time domain parameters
among the six segments can be noted, which indicate the existence of defect-induced
impulses. In Table 3.3a, most of the values in the last two columns representing the last
two segments in Figure 3.4b are much larger than those in other columns. To capture the
variations among the segments, we decide to calculate the standard deviation of these
time domain parameters among six segments and use them as the segmentation
parameters. Table 3.3b shows the values of segmentation parameters for the same bearing
running at different train speeds. Consistent indication can be observed. The low values
of segmentation parameters for the bearing in good condition (Table 3.4b) further
confirm that segmentation analysis is effective in characterizing defect-induced vibration
patterns.
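The procedure for Case 1 can be sketched as follows: take one revolution of the reference rotation, cut it into six equal segments, compute the time domain parameters in each segment, and use their standard deviation across segments as the segmentation parameters. For brevity only RMS, Peak and Crest factor are shown; the function names are hypothetical, and the use of the population standard deviation is an assumption.

```python
import numpy as np

def segment_parameters(x, fs, f_ref, n_segments=6):
    """RMS, Peak and Crest factor for each segment of one revolution at f_ref."""
    n_rev = int(round(fs / f_ref))          # samples in one revolution
    rows = []
    for seg in np.array_split(np.asarray(x[:n_rev], dtype=float), n_segments):
        rms_val = np.sqrt(np.mean(seg ** 2))
        pk = np.max(np.abs(seg))
        rows.append([rms_val, pk, pk / rms_val])
    return np.array(rows)

def segmentation_std(x, fs, f_ref, n_segments=6):
    """Standard deviation of each parameter across the segments."""
    return np.std(segment_parameters(x, fs, f_ref, n_segments), axis=0)
```

Passing the shaft frequency as `f_ref` gives the shaft-referenced parameters, and passing the cage frequency gives the cage-referenced ones. A signal whose level is uniform over the revolution yields values near zero, while bursts confined to the load zone yield large values.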
Case 2: Roller Defect
The vibration pattern produced for a roller defect is in some respects similar to that
produced by a bearing with inner race defects. Rollers rotate around the center of the
shaft with the cage at the same time as they spin around their own center axles. When
there is a defect on the roller, it encounters the inner and outer raceways alternately.
Therefore, the relative angular frequency between the defective roller and the loading is
the cage frequency. The impact generated from defects on the roller should be at a
frequency of twice the ball spin frequency. Figure 3.6a shows the vibration signal
obtained from a bearing with roller defects. Impulses are modulated with the cage
rotation. Figure 3.6b shows the segmentation referenced in one cage revolution. Time
domain parameters are subsequently computed and listed in Table 3.5a. It can be
observed that the Peak value and RMS of segments 2 and 3 are higher than those of the
other segments. Table 3.5b shows the segmentation parameters obtained from a bearing
with roller defect.
Case 3: Outer Race Defect
Figure 3.7 shows the vibration signal taken from a bearing with defects on the
outer race. Since the outer race is normally fixed with the bearing housing, defects should
always be initiated and remain inside the load zone. Therefore, defect-induced impulses
should not be modulated with either the shaft or the cage rotations. No obvious variation
associated with the shaft or cage rotations can be observed. Still, impacts occurring
periodically with the ball passing outer race characteristic frequencies should be seen
from the signal. Such a feature should be captured by the Frequency index as introduced in
our previous work [Sun, et al., 1998]. Signals with outer race defects are shown in Figure
3.7a and 3.7b.
Quantitatively, Table 3.6a shows the time domain parameters calculated for the
six signal segments in a single cage revolution. Although no obvious variations of these
parameters exist among different segments, energy level indicated by Rv and Pk for all
segments must be higher for defective bearings. As well, since standard deviation
indicates the absolute difference between the real and the mean values of the signal, it
leads to higher values of the Rv and Pk segmentation parameters, even though the relative difference is much
less than that in Case 1. Segmentation parameters referenced in the cage frequency of a
bearing with outer race defect at different train speeds are illustrated in Table 3.6b.
3.4.2 Feature Extraction Using Segmentation Parameters
For training and diagnosis purpose, we need to apply the segmentation parameters
mentioned above referenced in both shaft and cage rotations. This is because in
diagnosis, location of the defect is to be determined and therefore prediction on possible
shaft or cage modulations could not be made in advance.
In summary, the feature space is now composed of 19 dimensions. Among them,
seven parameters are described from our previous work [Sun, et al., 1998, 19991 and six
are from each of the two segmentation schemes: one in cage rotation and one in shaft
rotation.
A point x in the feature space can be denoted as:
Each dimension plays its distinct role in representing characteristics of bearing defects of
different kinds and at different severity levels.
Figure 3.1. Bearing vibration signals over time (seconds): (a) bearing in good condition; (b) bearing with defects
Figure 3.3. Frequency spectra of the vibration signals: (a) bearing in good condition; (b) bearing with defects
Figure 3.4a. Vibration signal for inner race defect in one second
Figure 3.4b. Vibration signal for inner race defect in one shaft revolution
Figure 3.6a. Vibration signal from bearing with roller defect
Figure 3.6b. Signal from bearing with roller defect over one cage revolution
Figure 3.7a. Vibration signal for outer race defect over one second
Figure 3.7b. Vibration signal for outer race defect over one cage revolution
Table 3.1a. Comparison of time domain parameters for good bearing and defective
bearing
Table 3.1b. Time domain parameters for bearings at different speeds
t Time Domain Parameten : Bmhginpood#ndbn 1
i ~ c a i n p w t h ~ f e
RJYIS 1.1236 Jl.8259
! Train meed i (mph)
Rv i f Pk
Cf Kv
Peak 4 . S
249.6221
Defective bearing
Clf
Good beating 70
17.4763 122.7036 7.0212 10.1712
30 9.9814
1 10.6424 11.0869 M.9225
'kbr 5.6455
. 11.3086
C l s r t W 4.4114 8.0978
70 1
3.9096 3.- 2.9932
30 1
4.447 4.447 3.275
SO 13378 105.0635 7.- 12.7381
50 7
1 3.9743 3.9743 3.0757
K u W s 3.4401 6.76
0.1127 4.995
0.1287 5.652
CbmmsWr 0.2W 0.5389
0.1129 4.9948
0.2762 11.1683
0.4399 18.1752
0.3274 13.006
Table 3.2a. Frequency index for bearing in good condition
Table 3.2b. Frequency index for bearing with cup spalls
Table 3.3a. Time domain parameters of six segments for bearing with inner race defect

Parameter   Segment index
            1         2         3         4         5         6
Rv          7.8491    4.9825    6.0294    10.3168   20.9734   29.7816
Pk          22.9907   12.5721   15.125    31.183    82.7235   106.134
Cf          2.9301    2.5233    2.        3.0225    3.9442    3.5637
Kv          3.4406    3.1313    2.5077    3.5191    6.0414    5.7815
Clf         6.s       5.6337    5.3118    6.9424    10.7427   8.4341
If          3.7442    3.2916    3.1451    3.9892    5.8321    4.8886

Table 3.3b. Segmentation parameters (shaft frequency) for bearing with inner race defect

Std. dev. of   Train running speed (mph)
               30        40        50        60        70        80
Rv             8.8932    11.8269   43.3719   50.9753   95.7304   80.5279
Pk             38.2407   38.5805   154.0913  173.0131  156.1674  168.7093
Cf             1.2548    0.7543    0.6159    0.7551    0.6564    0.4965
Kv             4.9913    1.7959    1.119     2.4848    1.6738    1.5312
Clf            3.3334    2.3811    1.9795    4.3528    4.2932    4.1686
If             2.3333    1.264     0.7626    1.4364    1.2240    0.945
Table 3.4a. Time domain parameters of six segments for bearing in good condition

Segment index   Rv       Pk       Cf       Kv       Clf      If
1               1.2348   3.8438   3.1129   3.6318   7.814    4.24
2               0.8953   2.6677   2.9797   3.2602   6.7993   3.8483
3               0.9193   2.8486   3.0967   3.4334   7.258    4.[illegible]
4               1.069    3.5156   3.2887   3.4150   8.5502   4.5622
5               1.4095   3.2048   2.9832   3.5043   7.0656   3.9461
6               1.1504   3.1873   2.7705   3.0433   6.2527   3.5487

Table 3.4b. Segmentation parameters (shaft frequency) for bearing in good condition

Train running speed (mph)   σRv-s    σPk-s    σCf-s    σKv-s         σClf-s        σIf-s
30                          0.0815   0.4533   0.394    0.3326        0.298         0.5025
40                          0.1129   0.4612   0.2658   [illegible]   0.3335        0.397
50                          0.1845   0.3457   0.3022   0.2864        0.5872        0.4742
60                          0.3689   1.0771   0.2522   0.3156        0.3268        [illegible]
70                          0.1698   1.0661   0.253    0.3481        [illegible]   [illegible]
80                          0.3229   1.3224   0.1507   0.2309        0.4343        0.2342
Table 3.5a. Time domain parameters of six segments for bearing with roller defect

Segment index   Rv        Pk        Cf       Kv       Clf      If
1               15.4603   52.7938   3.4148   3.4407   2.5953   4.4[illegible]
2               21.138    80.3135   3.7995   3.5667   2.8607   4.8963
3               20.0997   67.957    3.381    3.297    2.5248   4.3404
4               16.884    58.8185   3.4837   3.2428   2.5376   4.4146
5               18.1045   54.9398   3.0346   3.076    2.2125   3.8412
6               14.7843   43.6795   2.9544   2.7636   2.1091   3.6936

Table 3.5b. Segmentation parameters (cage frequency) for bearing with roller defect

Train running speed (mph)   σRv-c    σPk-c     σCf-c    σKv-c    σClf-c           σIf-c
30                          2.5264   12.813    0.3101   0.2841   0.2729           0.438
40                          1.7198   6.6802    [row remainder detached, see below]
50                          1.8943   0.0384    [row remainder detached, see below]
60                          3.1617   8.9394    [row remainder detached, see below]
70                          4.148    13.0024   0.129    0.1679   0.3152           0.206
80                          4.6758   12.7737   0.2257   0.3624   0.65[illegible]2 0.3535

Values detached from their speed labels in the source (40, 50 and 60 mph rows), in source order:
0.2983   0.1[illegible]24
0.2921   0.3784
0.2534   0.2015   0.3684   0.3553
0.1129   0.2299   0.286    0.194
Table 3.6a. Time domain parameters of six segments for bearing with outer race defect

Segment index   Rv         Pk                  Cf       Kv              Clf      If
1               86.5941    [illegible].2959    2.7288   3.[illegible]   2.6387   3.4454
2               88.7159    249.6868            2.8145   2.5748          2.5762   3.428
3               93.7776    276.2982            2.9463   3.1394          2.6562   3.7322
4               92.0558    201.2072            3.0547   3.2082          2.9412   3.8514
5               105.5642   270.3251            2.5608   2.5432          2.4246   3.1616
6               103.6473   282.8041            2.7285   2.8146          2.6287   3.4277

Table 3.6b. Segmentation parameters (cage frequency) for bearing with outer race defect

Train running speed (mph)   σRv-c
30                          6.4294
40                          3.6519
50                          8.7423
60                          14.5054
70                          41.[illegible]1
80                          17.2157

Values for the remaining five columns (σPk-c, σCf-c, σKv-c, σClf-c, σIf-c), detached from
their speed labels in the source and listed in source order:
22.7424    0.1149   0.2668   0.1152   0.1843
112.3491   0.1529   0.1761   0.3913   0.2366
22.9486    0.2786   0.5665   0.3587   0.4306
37.4694    0.1232   0.1546   0.3358   0.195
76.6123    0.3423   0.4154   0.5006   0.4872
44.1272    0.1275   0.1734   0.2173   0.1[illegible]38
CHAPTER FOUR:
NEURAL NETWORKS FOR NONLINEAR MAPPING
4.1 Introduction
After constructing the feature space, we propose to map the data from the 19-dimensional
feature space to a 2-dimensional classification space since an image in a 2D
space can be easily visualized and analyzed by a human observer, which greatly assists
the design of an effective pattern classifier.
In the earlier work [Sun, et al., 1998], linear mapping was performed to project
samples in the feature space to the classification space. In other words, elements in the
classification space are assumed to be a weighted linear combination of those in the feature
space. The weights are determined through the data sets with known bearing conditions.
The least-squares criterion is used to create cluster effects on data belonging to the same
class. However, the linear mapping, although simple in computation, does not necessarily
guarantee that classes are separable by linear boundaries in the classification space.
Patterns belonging to different classes may overlap in the classification space. The results
suggest that a nonlinear mapping between the two spaces may be desirable.
In this research, a three-layered artificial neural network is introduced to accomplish
the nonlinear mapping from the feature space to the classification space. Three-layered
networks are sufficient for representing the non-linear relations between the input and
output and they have a relatively simple architecture. One advantage of the artificial neural
network approach is that it allows us to construct complicated non-linear relations
between input and output when only a deficient analytical description is available. Although
neural networks are computationally intensive, most of the intense computation takes
place during the training process, which can be conducted off-line. Once the network is
trained for a particular task, operation is relatively fast and unknown samples can be
rapidly identified in the field. Neural networks have the ability to recognize relations between
the two sets of data even when the information comprising these data is noisy or incomplete
[Alguindigue, et al., 1993; Unal, 1994; Subrahmanyam, et al., 1997].
Assume a total number of N samples are used for training. Each sample belongs to
one class of a total of K classes with K cluster centers $u_k$, k = 1, ..., K. The nonlinear
mapping between two spaces can be expressed as:
$$ y = F(x) \qquad (4.1) $$
where $x \in R^{19}$ is a vector in the feature space denoted as
$[R_v\ P_k\ C_f\ K_v\ C_{lf}\ I_f\ F_i\ \sigma_{Rv-c}\ \sigma_{Pk-c}\ \sigma_{Cf-c}\ \sigma_{Kv-c}\ \sigma_{Clf-c}\ \sigma_{If-c}\ \sigma_{Rv-s}\ \sigma_{Pk-s}\ \sigma_{Cf-s}\ \sigma_{Kv-s}\ \sigma_{Clf-s}\ \sigma_{If-s}]^T$,
while $y \in R^2$ represents a vector containing the corresponding coordinates in the
classification space. When artificial neural networks are to be trained to learn the
non-linear relations, known values of x and y are used as input and output respectively.
The purpose of mapping between the feature and classification space is dimension
reduction while creating the best clustering effect for the N samples, so that samples belonging
to the same class gather around their own specified cluster center $u_k$. Successful mapping allows
application of the simple piecewise linear boundaries that will be discussed in Chapter 5.
4.2 Artificial Neural Networks
Artificial neural networks (ANN) can be implemented as computer algorithms that
can be used to describe a system in terms of relations between input and output. They
represent an alternative method of describing systems when it is very difficult or
impossible to use analytical approaches. They have been used in a wide variety of
applications related to manufacturing. These applications include process control, quality
control, industrial inspection, optimization, and modeling [Naumann, 1990; Keller, et al.,
1994; Peck, et al., 1994; Zhu, et al., 1995].
An artificial neural network in its basic form is composed of several layers of
neurons: an input layer, one or more hidden layers and an output layer (Figure 4.1).
The output of each layer becomes the input to the next layer. The first layer is an input layer
that distributes the inputs to the hidden layer. Figure 4.1 shows the architecture of a
multilayer neural network with two hidden layers. To set the stage in its general form,
the network shown here is fully connected, which means that a neuron in any layer of the
network is connected to all the neurons or nodes in the previous layer. Signals flow
through the network in a forward direction on a layer-by-layer basis.
A neuron is an information processing unit that is fundamental to the operation of a
neural network. Figure 4.2 shows the model for a neuron. Three basic elements of the
neuron model are described as follows:
1) A set of synapses or connecting links, each of which is characterized by a
weight or strength of its own. Specifically, a signal $x_i$ at the input of synapse
i connected to neuron j is multiplied by the synaptic weight $w_{ji}$. It is
noteworthy that the first subscript of $w_{ji}$ refers to the neuron in question and
the second subscript refers to the input end of the synapse.
2) An adder for summing the input signals, weighted by the respective
synapses of the neuron. The operations involved constitute a linear
combiner.
3) An activation function for limiting the amplitude of the output of a neuron.
Except at the input layer, every neuron has an activation value that is a
function of the weighted sum of input signals. The activation function limits
the permissible amplitude range of the output signal to some finite value.
Typically, the normalized amplitude range of the output of a neuron is
written as the closed unit interval [0, 1] or alternatively [-1, 1].
In mathematical terms, we may describe a neuron j by writing the following pair of
equations:
$$ u_j = \sum_{i=1}^{I} w_{ji} x_i \qquad (4.2) $$
and
$$ y_j = f(u_j - \theta_j) \qquad (4.3) $$
where $x_1, x_2, \ldots, x_I$ are the input signals; $w_{j1}, w_{j2}, \ldots, w_{jI}$ are the synaptic weights
of neuron j; $u_j$ is the linear combiner output; $\theta_j$ is the threshold; $f(\cdot)$ is the activation
function; and $y_j$ is the output signal of the neuron. The internal activity level $v_j$ is the
linear combiner output $u_j$ modified with the threshold $\theta_j$:
$$ v_j = u_j - \theta_j $$
The threshold $\theta_j$ is considered to be zero in the network used in this work. Therefore, we
may formulate the combination of Eq. (4.2) and (4.3) as follows:
$$ v_j = \sum_{i=1}^{I} w_{ji} x_i $$
and
$$ y_j = f(v_j) $$
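
As an illustration of the neuron model described above, the following minimal sketch computes the output of a single neuron from its inputs and synaptic weights. The function name `neuron_output`, the `threshold` argument (zero by default, as in this work) and the example numbers are illustrative assumptions, not taken from the thesis.

```python
import math


def neuron_output(inputs, weights, activation, threshold=0.0):
    """Return the output y_j = f(v_j) of one neuron.

    u_j is the linear combiner output (weighted sum of the inputs) and
    v_j = u_j - threshold is the internal activity level.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    u = sum(w * x for w, x in zip(weights, inputs))
    v = u - threshold
    return activation(v)


def logistic(v):
    """Logistic activation, used here only as an example of f."""
    return 1.0 / (1.0 + math.exp(-v))


# Hypothetical three-input neuron: v = 0.2*1.0 - 0.4*0.5 + 0.6*(-0.5) = -0.3
y = neuron_output([1.0, 0.5, -0.5], [0.2, -0.4, 0.6], logistic)
```

Passing the activation as an argument keeps the linear combiner separate from the choice of nonlinearity, mirroring the three elements of the neuron model.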
The activation function, denoted by $f(\cdot)$, defines the output of a neuron in terms of
the internal activity level at its input. Three basic types of activation functions may be
identified:
1) Threshold Function
In this model, the output of a neuron takes on the value of 1 if the total internal
activity is nonnegative and 0 otherwise.
2) Piecewise-Linear Function
This form of activation function may be viewed as an approximation to a non-linear
amplifier. The amplification factor inside the linear region of operation is assumed to be
unity.
3) Sigmoid Function
The sigmoid function is by far the most common form of activation function used in
the construction of neural networks. It is defined as a strictly increasing function that
exhibits smoothness and asymptotic properties. An example of the sigmoid is the logistic
function, defined by
$$ f(v) = \frac{1}{1 + e^{-v}} \qquad (4.7) $$
A sigmoid function assumes a continuous range of values from 0 to 1 or -1 to 1 and
is differentiable as shown in Figure 4.3.
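
The three activation types can be sketched as follows. This is a minimal illustration: the breakpoints of the piecewise-linear function (saturation at v = -0.5 and v = +0.5) are an assumption following a common textbook form, since the text above only states that the gain in the linear region is unity.

```python
import math


def threshold_function(v):
    """Output 1 if the internal activity is nonnegative, 0 otherwise."""
    return 1.0 if v >= 0.0 else 0.0


def piecewise_linear(v):
    """Unity gain in the linear region, saturating at 0 and 1.

    The breakpoints at -0.5 and +0.5 are an assumed convention.
    """
    if v >= 0.5:
        return 1.0
    if v <= -0.5:
        return 0.0
    return v + 0.5


def logistic(v):
    """Logistic sigmoid 1 / (1 + exp(-v)), written to avoid overflow."""
    if v >= 0.0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def logistic_derivative(v):
    """Derivative of the logistic function: f(v) * (1 - f(v))."""
    f = logistic(v)
    return f * (1.0 - f)
```

The derivative is included because the logistic function is differentiable everywhere, which is the property the back-propagation algorithm relies on.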
4.2.1 Multilayer Feedforward Artificial Neural Network
Multilayer feedforward artificial neural networks have been widely adopted for
many ANN applications. They have been applied successfully to solve complicated
problems by training them in a supervised manner with the popular algorithm known as
the error back-propagation algorithm [Haykin, 1994]. The basic idea of back-propagation
was first described by Werbos in his Ph.D. thesis [Werbos, 1974], in the context of
general networks with neural networks representing a special case. The development of
the back-propagation algorithm represents a "landmark" in neural networks in that it
provides a computationally efficient method for the training of multilayer perceptrons.
The trained network based on the error back-propagation algorithm often produces
surprising results in applications where explicit derivation of the input-output relationship is
almost impossible.
Training of feedforward neural networks takes place in an iterative fashion. Each
iteration cycle involves a forward-propagation pass followed by an error backward-propagation
pass to update the connection weights. Figure 4.4 depicts a portion of the
multilayer neural network with the two passes.
The forward-propagation pass starts when the input nodes receive their activation
levels in the form of an input pattern. Then, forward-propagation proceeds through the
hidden layers up to the output layer by computing the activation levels of the nodes in
those layers. Finally, a set of outputs is produced as the actual response of the network.
During the forward pass the synaptic weights of the network are all fixed.
Weight adjustment is accomplished by propagating the error function of the output
back through the net and modifying all the weights. The iterative method propagates the error
function required to adapt the weights back from nodes in the output layer to nodes in the
hidden layers in accordance with the training rule. The weights are adjusted so as to make
the actual response of the network move closer to the desired response.
Training sets are repeatedly presented and weights modified until the error between
the predicted and actual output is less than a specified value (error criterion). Once the
neural network has been trained in this way, it should be possible to relate input patterns
with the appropriate output patterns [Chiou, et al., 1992]. To use the trained ANN, a new
input set is simply presented to the network and the network calculates an output solution.
Properly trained ANNs are able to give reasonable answers when presented with inputs
that they have never seen. Typically, a new input will lead to an output similar to
the correct output for input vectors with similar features used in training.
Therefore, it is possible to train a neural network on a representative set of input/target
pairs and get good results without training it on all possible input/output pairs.
4.2.2 Error Back-Propagation Training Algorithm
Before the network can be used for the non-linear transformation in this
work, we apply the supervised learning technique to train the neural network
using a set of known inputs and corresponding outputs.
Inputs are the features extracted from bearing vibration signals with known bearing
conditions. Desired outputs are the cluster centers arbitrarily chosen for each class.
Assuming there are K classes in total, K cluster centers $u_k$, k = 1, ..., K, are chosen in the first
quadrant of a 2D coordinate frame in order to locate the desired output associated with
each of the K classes. Although their arrangement is somewhat arbitrary, we placed them
evenly on a unit circle in the first quadrant of a 2D space. Non-linear mapping is applied
to cluster all the samples belonging to the same category in the feature space around
their own specified cluster center $u_k$ in the classification space.
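
One possible way to generate such cluster centers is sketched below. The thesis states only that the centers are spread evenly on the unit circle in the first quadrant; the specific choice of angles (the midpoints of K equal sectors of the quarter circle, which keeps every coordinate strictly between 0 and 1 and therefore reachable by a logistic output) and the function name `cluster_centers` are assumptions made for illustration.

```python
import math


def cluster_centers(num_classes):
    """Return num_classes points spread evenly on the first-quadrant unit circle.

    Assumption: center k sits at the mid-angle of the k-th of num_classes
    equal sectors between 0 and pi/2.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    step = (math.pi / 2.0) / num_classes
    return [
        (math.cos((k + 0.5) * step), math.sin((k + 0.5) * step))
        for k in range(num_classes)
    ]


# Example with four hypothetical bearing-condition classes.
centers = cluster_centers(4)
```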
Consider a three-layered ANN with only one hidden layer as shown in Figure 4.5. In
the figure, index i refers to nodes at the input layer, index j refers to nodes at the hidden
layer, and index k refers to nodes at the output layer. $w_{ji}$ denotes the weight of the
connection between node i in the input layer and j in the hidden layer, while $v_{kj}$ denotes
the weight of the connection between node j in the hidden layer and k in the output layer.
Assuming $x_i$, i = 1, ..., I, are input signals, a neuron j in the hidden layer can be described by
writing the following pair of equations:
$$ b_j = \sum_{i=1}^{I} w_{ji} x_i \qquad (4.8) $$
and
$$ y_j = f(b_j) \qquad (4.9) $$
where $b_j$ represents the internal activity level of the neuron j, $y_j$ is the output of the neuron
j and $f(\cdot)$ is the activation function of the hidden layer.
Similarly, a neuron k in the output layer can be described by writing the following
pair of equations:
$$ c_k = \sum_{j} v_{kj} y_j \qquad (4.10) $$
and
$$ o_k = f(c_k) \qquad (4.11) $$
where $c_k$ represents the internal activity level of the neuron k, $o_k$ is the output of the
neuron k and $f(\cdot)$ is the activation function of the output layer, which is assumed to be the
same as that of the hidden layer.
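
The forward pass through the three-layered network can be sketched as below, assuming a logistic activation in both layers and zero thresholds. The nested-list weight layout (`w[j][i]` for input-to-hidden weights and `v[k][j]` for hidden-to-output weights) and the function names are illustrative choices rather than the thesis implementation.

```python
import math


def logistic(v):
    """Logistic sigmoid, written to avoid overflow for large |v|."""
    if v >= 0.0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def forward(x, w, v):
    """Propagate input x through a one-hidden-layer network.

    w[j][i]: weight from input node i to hidden node j
    v[k][j]: weight from hidden node j to output node k
    Returns (y, o): hidden-layer outputs and output-layer outputs.
    """
    y = [logistic(sum(w_ji * x_i for w_ji, x_i in zip(w_j, x))) for w_j in w]
    o = [logistic(sum(v_kj * y_j for v_kj, y_j in zip(v_k, y))) for v_k in v]
    return y, o
```

With all weights set to zero every internal activity level is zero, so every hidden and output node returns 0.5, which is a convenient sanity check.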
Let p be the index representing the training set and P the total number of samples
involved in training the network. At any iteration, the sum of squared errors for the pth
training sample between the target and actual output is defined as:
$$ E(p) = \frac{1}{2} \sum_{k=1}^{K} \left( o_k^d - o_k \right)^2 \qquad (4.12) $$
where:
K is the number of output nodes of the network.
$o_k^d$ represents the target output at node k.
$o_k$ is the actual output at node k.
The average squared error $E_{av}$ among all the training sets can be calculated as
$$ E_{av} = \frac{1}{P} \sum_{p=1}^{P} E(p) \qquad (4.13) $$
Obviously, the value of the error function depends on the weights of the network.
For a given training set, $E_{av}$ represents the performance function as the measure of
training set learning performance. The objective of the learning process is to minimize the
performance function $E_{av}$ through adjusting the weights at every neuron. We consider a
simple method of training in which the weights are updated on a sample-by-sample basis.
The adjustments to the weights are made in accordance with the respective errors
computed for each sample presented to the network. The arithmetic average of these
individual weight changes over the training set is therefore an estimate of the true change
that would result from modifying the weights based on minimizing the performance
function $E_{av}$ over the entire training set. The gradients calculated at each training pattern
are added together to determine the change in the weights. It can be seen that the
performance function depicts the accuracy of the neural network mapping after a number
of training cycles have been implemented.
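
The two error measures can be computed as in the following sketch. The factor 1/2 in the per-sample error is an assumed convention (it is the usual choice because it cancels when the error is differentiated); the function names are illustrative.

```python
def sample_error(target, output):
    """Sum of squared errors for one training sample (with the assumed 1/2 factor)."""
    if len(target) != len(output):
        raise ValueError("target and output must have the same length")
    return 0.5 * sum((d - o) ** 2 for d, o in zip(target, output))


def average_error(targets, outputs):
    """Average of the per-sample errors over the whole training set."""
    if not targets or len(targets) != len(outputs):
        raise ValueError("need equally many, and at least one, targets and outputs")
    errors = [sample_error(t, o) for t, o in zip(targets, outputs)]
    return sum(errors) / len(errors)
```

For example, with targets `[[1, 0], [0, 1]]` and network outputs `[[0, 0], [0, 1]]`, the per-sample errors are 0.5 and 0, giving an average squared error of 0.25.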
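The gradient of the error surface with respect to the weights between the output and
hidden layers, $\partial E / \partial v_{kj}$, is calculated as follows:
$$ \frac{\partial E}{\partial v_{kj}} = \frac{\partial E}{\partial c_k} \frac{\partial c_k}{\partial v_{kj}} \qquad (4.14) $$
Substituting Eq. (4.10) into the above equation leads to:
$$ \frac{\partial E}{\partial v_{kj}} = \frac{\partial E}{\partial c_k} \, y_j \qquad (4.15) $$
In these gradient expressions, the argument p is omitted from E for brevity. The gradient
$\partial E / \partial v_{kj}$ determines the direction of search in weight space for the weight $v_{kj}$.
The change in weights between the output and hidden layers, $\Delta v_{kj}$, is proportional to the
gradient $\partial E / \partial v_{kj}$:
$$ \Delta v_{kj} = -\eta \frac{\partial E}{\partial v_{kj}} \qquad (4.16) $$
where $\eta$ is the learning rate of the back-propagation algorithm. At the nth iteration of the
training process, the weights at every neuron in the output layer are updated using the
increment calculated in Eq. (4.16):
$$ v_{kj}^{(n+1)} = v_{kj}^{(n)} + \Delta v_{kj}^{(n)} \qquad (4.17) $$
Now we consider the weight adjustment from the input layer to the hidden layer. The
gradient of the error surface with respect to the weights between the hidden and input
layers, $\partial E / \partial w_{ji}$, is calculated as follows:
$$ \frac{\partial E}{\partial w_{ji}} = \sum_{k=1}^{K} \frac{\partial E}{\partial c_k} \frac{\partial c_k}{\partial y_j} \frac{\partial y_j}{\partial b_j} \frac{\partial b_j}{\partial w_{ji}} \qquad (4.18) $$
Combining Eqs. (4.8) to (4.10) with the above equation, we obtain:
$$ \frac{\partial E}{\partial w_{ji}} = \left( \sum_{k=1}^{K} \frac{\partial E}{\partial c_k} \, v_{kj} \right) f'(b_j) \, x_i \qquad (4.19) $$
The weight adjustment between the hidden and input layers, $\Delta w_{ji}$, is proportional to
the gradient $\partial E / \partial w_{ji}$:
$$ \Delta w_{ji} = -\eta \frac{\partial E}{\partial w_{ji}} \qquad (4.20) $$
At the nth iteration of the training process, the weights at every neuron in the hidden layer
are updated using the increment calculated in Eq. (4.20):
$$ w_{ji}^{(n+1)} = w_{ji}^{(n)} + \Delta w_{ji}^{(n)} \qquad (4.21) $$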
Backpropagation networks often have one or more hidden layers with a sigmoid
activation function followed by an output layer with a linear or sigmoid function. Multiple
layers of neurons with nonlinear activation functions allow the network to learn nonlinear
and linear relationships between input and output vectors. An example of a continuously
differentiable nonlinear activation function commonly used in multilayer neural networks
is the sigmoidal function, which is smooth (i.e., differentiable everywhere). Based on
numerical experiments conducted in this work and what is available in the literature,
sigmoidal functions work best for supervised neural nets, i.e., those for which the inputs and the
corresponding outputs are known a priori [Karkoub, et al., 1991]. Moreover, the use of
the logistic function is biologically motivated, since it attempts to account for the
refractory phase of real neurons [Pineda, 1988]. The same sigmoidal nonlinearity in the
form of a logistic function is chosen for all the hidden and output neurons of the ANN
used in this work.
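With the application of the sigmoid activation function as shown in Eq. (4.7), the
derivative of f(v) with respect to v can be derived as $f'(v) = f(v)\,[1 - f(v)]$. Combining this
with Eq. (4.11), we have the following expression:
$$ f'(c_k) = o_k (1 - o_k) \qquad (4.22) $$
And combining with Eq. (4.9) results in:
$$ f'(b_j) = y_j (1 - y_j) \qquad (4.23) $$
Combining Eqs. (4.16), (4.17) and (4.22), and noting that
$\partial E / \partial c_k = -(o_k^d - o_k) f'(c_k)$, the weight adjustments at neurons in the output
layer can be expressed as follows:
$$ v_{kj}^{(n+1)} = v_{kj}^{(n)} + \eta \, (o_k^d - o_k) \, o_k (1 - o_k) \, y_j \qquad (4.24) $$
Weight adjustments at neurons in the hidden layer can be expressed after combining
Eqs. (4.20), (4.21), (4.22) and (4.23) as follows:
$$ w_{ji}^{(n+1)} = w_{ji}^{(n)} + \eta \left[ \sum_{k=1}^{K} (o_k^d - o_k) \, o_k (1 - o_k) \, v_{kj} \right] y_j (1 - y_j) \, x_i \qquad (4.25) $$
It is to be noted that if the network has more than one hidden layer, the same
procedure is extended to adjust the weights at all the additional hidden layers.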
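
The complete sample-by-sample update for a network with one hidden layer can be sketched as follows. This is a minimal illustration under stated assumptions: logistic activations in both layers, zero thresholds, a per-sample error with a factor of 1/2, and weights stored as nested lists `w[j][i]` and `v[k][j]`. The function name `backprop_step` and the example values are hypothetical.

```python
import math


def logistic(v):
    """Logistic sigmoid, written to avoid overflow for large |v|."""
    if v >= 0.0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def backprop_step(x, target, w, v, eta):
    """Apply one gradient-descent update in place; return the error before it."""
    # Forward pass.
    y = [logistic(sum(w_ji * x_i for w_ji, x_i in zip(w_j, x))) for w_j in w]
    o = [logistic(sum(v_kj * y_j for v_kj, y_j in zip(v_k, y))) for v_k in v]
    # Output-layer local gradients: (o_k^d - o_k) * o_k * (1 - o_k).
    delta_o = [(d - o_k) * o_k * (1.0 - o_k) for d, o_k in zip(target, o)]
    # Hidden-layer local gradients, computed with the not-yet-updated v.
    delta_h = [
        y_j * (1.0 - y_j) * sum(delta_o[k] * v[k][j] for k in range(len(v)))
        for j, y_j in enumerate(y)
    ]
    # Weight updates: output layer first, then hidden layer.
    for k, v_k in enumerate(v):
        for j in range(len(v_k)):
            v_k[j] += eta * delta_o[k] * y[j]
    for j, w_j in enumerate(w):
        for i in range(len(w_j)):
            w_j[i] += eta * delta_h[j] * x[i]
    return 0.5 * sum((d - o_k) ** 2 for d, o_k in zip(target, o))


# Hypothetical 2-3-2 network trained repeatedly on a single sample.
w = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.2]]
v = [[0.2, -0.1, 0.3], [-0.3, 0.1, 0.2]]
errors = [backprop_step([1.0, 0.5], [0.9, 0.1], w, v, eta=0.5) for _ in range(200)]
```

The hidden-layer gradients are computed before any weight is changed so that both layers are updated from the same forward pass.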
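If we define the weight space to be
$$ W = \{ w_{ji}, \, v_{kj} \} \qquad (4.26) $$
then the weight adjustments in the hidden and output layers can be expressed as:
$$ W^{(n+1)} = W^{(n)} + \Delta W^{(n)} \qquad (4.27) $$
where the change in the weight space $\Delta W^{(n)}$ is defined as:
$$ \Delta W^{(n)} = -\eta \frac{\partial E}{\partial W^{(n)}} \qquad (4.28) $$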
4.2.3 Convergence
The back-propagation algorithm is implemented by the method of gradient descent.
Typically, the effectiveness and convergence of the error back-propagation learning
algorithm depend significantly on the value of the learning rate constant $\eta$. In general,
however, the optimum value of $\eta$ depends on the problem being solved, and there is no
single learning constant value that would be suitable for all cases. This problem seems to
be common to all gradient-based optimization schemes. While gradient descent can be
an efficient method for obtaining the weight values that minimize an error, error surfaces
frequently possess properties that make the procedure slow to converge. The smaller we
make the learning rate parameter $\eta$, the slower but smoother the approach to the
optimal point in the weight space will be.
Although one can speed up the rate of learning by setting $\eta$ to a large value, the
resulting large changes in the weights may result in unstable (i.e., oscillatory) behavior.
Also, a smaller value of $\eta$ may be desirable when close to the target to avoid
overshooting the optimal point. A simple method of increasing the rate of learning and
yet avoiding instability is to modify the updating rule as shown in Eq. (4.27) by including
a momentum term.
Momentum can be added to the error back-propagation learning algorithm by
including the search direction in the weight space at the previous iteration, $\Delta W^{(n-1)}$. This
is usually done according to:
$$ \Delta W^{(n)} = \alpha \, \Delta W^{(n-1)} - (1 - \alpha) \, \eta \frac{\partial E}{\partial W^{(n)}} \qquad (4.29) $$
where 0 < $\alpha$ < 1 is referred to as the momentum constant and the first term in Eq. (4.29)
is called the momentum term. When $\alpha$ = 0, the search direction is the gradient descent
direction and Eq. (4.29) reduces to the gradient descent update of Eqs. (4.27) and (4.28). When
$\alpha$ = 1, the search direction is parallel to that of the previous iteration and the gradient is
simply ignored. The weight adjustment of the output layer according to the generalized updating rule is
$$ \Delta v_{kj}^{(n)} = \alpha \, \Delta v_{kj}^{(n-1)} + (1 - \alpha) \, \eta \, (o_k^d - o_k) \, o_k (1 - o_k) \, y_j \qquad (4.30) $$
For the hidden layer, it is
$$ \Delta w_{ji}^{(n)} = \alpha \, \Delta w_{ji}^{(n-1)} + (1 - \alpha) \, \eta \left[ \sum_{k=1}^{K} (o_k^d - o_k) \, o_k (1 - o_k) \, v_{kj} \right] y_j (1 - y_j) \, x_i \qquad (4.31) $$
The incorporation of momentum in the back-propagation algorithm has highly
beneficial effects on the learning behavior of the algorithm. The momentum term typically
helps to speed up convergence, and to achieve an efficient and more reliable learning
profile. Momentum allows a network to respond not only to the local gradient, but also to
the shape of the error surface. Acting like a low pass filter, momentum allows the
network learning to ignore small sudden changes on the error surface. It also helps to
avoid being trapped by the local minima.
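
A momentum update consistent with the limiting cases described above (plain gradient descent when the momentum constant is 0, and the gradient ignored entirely when it is 1) can be sketched as follows. The weighting of the gradient term by (1 - alpha) is an assumption chosen so that both limiting cases hold; the function name and the flat-list representation of the weight space are illustrative.

```python
def momentum_step(weights, gradients, previous_changes, eta, alpha):
    """Return (new_weights, changes) for one momentum update.

    changes[i] = alpha * previous_changes[i] - (1 - alpha) * eta * gradients[i]
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    changes = [
        alpha * prev - (1.0 - alpha) * eta * grad
        for grad, prev in zip(gradients, previous_changes)
    ]
    new_weights = [w + dw for w, dw in zip(weights, changes)]
    return new_weights, changes
```

The returned `changes` list is meant to be passed back in as `previous_changes` on the next iteration, which is how the previous search direction is carried forward.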
4.2.4 Stopping Criteria
There are some typical termination criteria, each with its own practical merit, which
may be used to terminate the weight adjustments. Two commonly used criteria are
introduced in the following:
i) The average squared error $E_{av}$ is equal to or less than a
sufficiently small threshold, which is chosen as the criterion for convergence as stated
here:
$$ E_{av}(W^*) \le \varepsilon \qquad (4.32) $$
where $W^*$ is the weight vector which denotes a minimum and $\varepsilon$ is a sufficiently small
error threshold. The back-propagation computation iterates by presenting new epochs of
training samples to the network until the parameters of the network stabilize their values
and the average squared error $E_{av}$ computed over the entire training set is at or below the small
threshold. The drawback of this convergence criterion is that, when the shape of the error
surface is flat, the criterion can be reached without finding the minimum $W^*$.
ii) The absolute rate of change in the weight vectors per iteration is sufficiently small
as follows:
$$ \left| W^{(n)} - W^{(n-1)} \right| \le \varepsilon \qquad (4.33) $$
The network learning has converged when the consecutive weight adjustments reach
the small threshold. This convergence criterion has the drawback that the network may be
trapped in a quasi-minimum, that is, it may converge to a value that differs from the
ideal minimum.
The first convergence criterion is utilized for the network learning in this work. The
experimental results show that it has been a very simple and effective stopping criterion.
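
The two stopping criteria can be expressed as simple checks, sketched below. The use of the Euclidean norm to measure the change in the weight vector is an assumption, as are the function names.

```python
import math


def error_converged(average_error, epsilon):
    """Criterion i): the average squared error is at or below the threshold."""
    return average_error <= epsilon


def weights_converged(weights, previous_weights, epsilon):
    """Criterion ii): the change in the weight vector is at or below the threshold.

    The change is measured with the Euclidean norm (an assumed choice).
    """
    if len(weights) != len(previous_weights):
        raise ValueError("weight vectors must have the same length")
    change = math.sqrt(sum((w - p) ** 2 for w, p in zip(weights, previous_weights)))
    return change <= epsilon
```

In a training loop, the first check would be evaluated once per epoch after the average error over the whole training set has been computed.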
4.2.5 Initial Weights and Cumulative Weight Adjustment
The weights of the network to be trained are typically initialized with random
values. If all weights start out with equal values, and if the solution requires unequal
weights, the network may not be trained properly. Also, the network may fail to learn,
with the error increasing as the learning continues. In fact, many empirical studies of the
algorithm point out that continuing training beyond a certain low-error plateau may result
in the undesirable drift of the weights. This causes the error to increase again after having
converged previously. To counteract the drift problem, network learning should be
restarted with other random weights.
There are two schemes of updating the weights in the error back-propagation
learning. Scheme 1 (Figure 4.6) is called incremental updating, which is based on the
single training sample error reduction and makes a small adjustment of weights
following each presentation of a training sample. Scheme 2 (Figure 4.7) is called batch
updating, which implements the minimization of the error function computed over the
complete cycle of P samples with gradient descent searching, provided the learning
constant $\eta$ is sufficiently small.
The advantage of scheme 1 is that the search for the optimal solution is along the
gradient descent direction on the error surface. Moreover, during the computer
simulation, the weight adjustments determined by the algorithm do not need to be stored
and compounded over the learning cycle consisting of P joint error signals. However, the
network trained this way may be skewed toward the most recent training sample in the
cycle. To counteract such a problem, either a small learning constant $\eta$ should be used or
cumulative weight changes be imposed as follows:
$$ \Delta W = \sum_{p=1}^{P} \Delta W^{(p)} $$
for both output and hidden layers, where $\Delta W^{(p)}$ represents the change in the weight
space for the pth training pattern. The weight adjustment in this scheme is implemented at
the conclusion of the complete learning cycle. It takes the average effect of all the
training patterns in the cycle. Provided that the learning rate is small enough, the cumulative weight
adjustment can still implement the algorithm close to the gradient descent minimization.
Although both scheme 1 and scheme 2 can bring satisfactory solutions, attention
should be paid to the fact that the training works best under random conditions. It
would thus seem advisable to use the incremental weight updating after each pattern
presentation, but to choose patterns in a random sequence. This introduces much-needed
noise into the training and alleviates the problems of averaging and skewed weights
which would tend to favor the most recent training patterns.
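
The difference between the two updating schemes can be sketched as follows. To keep the example short, the network is replaced by a hypothetical one-weight linear model with its own gradient function; `incremental_epoch`, `batch_epoch` and `line_gradient` are illustrative names, and the random presentation order follows the recommendation above.

```python
import random


def incremental_epoch(weights, samples, gradient, eta, rng):
    """Scheme 1: update after every sample, presented in random order."""
    order = list(range(len(samples)))
    rng.shuffle(order)
    for p in order:
        grad = gradient(weights, samples[p])
        weights = [w - eta * g for w, g in zip(weights, grad)]
    return weights


def batch_epoch(weights, samples, gradient, eta):
    """Scheme 2: accumulate the changes over all samples, then update once."""
    total_change = [0.0] * len(weights)
    for sample in samples:
        grad = gradient(weights, sample)
        total_change = [t - eta * g for t, g in zip(total_change, grad)]
    return [w + t for w, t in zip(weights, total_change)]


def line_gradient(weights, sample):
    """Gradient of 0.5 * (d - w*x)^2 with respect to the single weight w."""
    x, d = sample
    return [-(d - weights[0] * x) * x]


# Hypothetical data generated by d = 2 * x; both schemes should approach w = 2.
samples = [(1.0, 2.0), (2.0, 4.0)]
rng = random.Random(0)
w_inc, w_bat = [0.0], [0.0]
for _ in range(50):
    w_inc = incremental_epoch(w_inc, samples, line_gradient, 0.1, rng)
    w_bat = batch_epoch(w_bat, samples, line_gradient, 0.1)
```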
4.3 Experimental Determination of Optimal Neural Network
4.3.1. Network Architecture with Optimal Hidden Layer
The multilayered ANN trained with the back-propagation algorithm is applied to
perform the nonlinear input-output mapping. One of the most important attributes of a
multilayered neural network design is choosing the architecture. The number of input
nodes is simply determined by the dimension of the input vector.
In this thesis, the input-output relationship of the network defines a mapping from a
19-dimensional feature space to a 2-dimensional classification space. Thus, the number of
input nodes is chosen to be nineteen and the number of neurons in the output layer is two.
This input-output mapping is assumed to be infinitely continuously differentiable. In
assessing the capability of the neural network, two fundamental questions arise:
1) Determine the number of hidden layers:
It was Cybenko who demonstrated rigorously for the first time that a single hidden
layer is sufficient to uniformly approximate any continuous function with support in a
unit hypercube [Cybenko, 1988]. He introduced the universal approximation theorem,
which states that a single hidden layer is sufficient for a multilayered neural network to
compute a uniform approximation to a given training set represented by the set of inputs
and a desired (target) output. A single hidden layer is chosen for the ANN used in this
work.
2) Determine the size of the hidden layer:
The size of each hidden layer is mostly determined through a trial and error process
depending on the individual problem. The exact analysis of the issue is rather difficult
because of the complexity of the network mapping and the non-deterministic
nature of the training procedures. If there are too few nodes, the neural network will fail to
learn the training data, which leads to underfitting. Too many neurons can contribute
to overfitting, in which all training points are well fit, but the fitting curve takes wild
oscillations between these points. Based on trial and error, a size of 24 is found to be
the optimal compromise between underfitting and overfitting, with faster convergence
compared to 20 and 26 hidden nodes (see Table 4.1). Therefore, the finally obtained
optimal architecture of the network is 19-24-2 as shown in Figure 4.8. The
input layer has 19 nodes, each of which represents a parameter in the feature space
denoted as $[R_v\ P_k\ C_f\ K_v\ C_{lf}\ I_f\ F_i\ \sigma_{Rv-c}\ \sigma_{Pk-c}\ \sigma_{Cf-c}\ \sigma_{Kv-c}\ \sigma_{Clf-c}\ \sigma_{If-c}\ \sigma_{Rv-s}\ \sigma_{Pk-s}\ \sigma_{Cf-s}\ \sigma_{Kv-s}\ \sigma_{Clf-s}\ \sigma_{If-s}]^T$.
The output layer has two neurons, each of which
contains a coordinate value in the 2D space.
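
For a sense of scale, the 19-24-2 architecture can be set up with randomly initialized weights as in the sketch below. The uniform initialization range of plus or minus 0.5 is an assumption (the thesis states only that initial weights are random), and thresholds are omitted because they are taken to be zero in this work.

```python
import random


def init_weights(n_in, n_hidden, n_out, rng, scale=0.5):
    """Return (w, v) with entries drawn uniformly from [-scale, scale].

    w[j][i]: input-to-hidden weights, v[k][j]: hidden-to-output weights.
    """
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_hidden)]
    v = [[rng.uniform(-scale, scale) for _ in range(n_hidden)] for _ in range(n_out)]
    return w, v


w, v = init_weights(19, 24, 2, random.Random(0))
# Adjustable weights: 19 * 24 + 24 * 2 = 504
n_weights = sum(len(row) for row in w) + sum(len(row) for row in v)
```

Restarting training with a different seed gives a different starting point in the weight space, which is the remedy suggested for the weight-drift problem.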
4.3.2. Accelerated Convergence through Learning-Rate Adaptation
Situations arise in which a constant learning rate $\eta$ does not produce satisfactory
performance. For example, on a flat error surface, too many steps may be required to
compensate for the small gradient value. In this work we use a heuristic technique to
determine the variable learning rate in order to accelerate the convergence of back-propagation
learning. Four heuristics are considered as guidelines [Haykin, 1994]:
Heuristic 1. Every adjustable network parameter should have its own adjustable
learning-rate parameter. The back-propagation algorithm may be slow to converge
because of a fixed learning rate that may not suit all portions of the error surface. In other
words, a learning-rate parameter appropriate for the adjustment of one weight is not
necessarily appropriate for the adjustment of other weights in the network. This method
recognizes this fact by assigning a different learning-rate parameter to each adjustable
weight (parameter) in the network.
Heuristic 2. Every learning-rate parameter should be allowed to vary from one
iteration to the next. Typically, the error surface behaves differently in different regions
and different dimensions. In order to match this variation, heuristic 2 allows the learning
rate to vary from iteration to iteration.
Heuristic 3. When the derivative of the performance function with respect to a
weight has the same algebraic sign for several consecutive iterations, the learning-rate
parameter for that particular weight should be increased. The current operating point in
the weight space may lie on a relatively flat portion of the error surface along a particular
weight dimension. As a result, the derivative of the performance function with respect
to that weight keeps the same algebraic sign, that is, the same gradient direction, for
several consecutive iterations. Heuristic 3 states that in such a situation the number of
iterations required to move across the flat portion of the error surface may be reduced by
increasing the learning-rate parameter appropriately.
Heuristic 4. When the algebraic sign of $\partial E / \partial w$ alternates for several consecutive
iterations of the algorithm, the learning-rate parameter for that weight should be decreased. This is the
opposite situation to the above. When the current operating point in weight space lies on a
portion of the error surface along a weight dimension of interest that exhibits peaks and
valleys (i.e., the surface is highly curved), then it is possible for the derivative of the
performance function with respect to that weight to change its algebraic sign from one
iteration to the next. In order to prevent the weight adjustment from oscillating, the
learning-rate parameter for that particular weight should be decreased appropriately.
It should be noted that the use of a non-uniform and time-varying learning rate
modifies the back-propagation algorithm significantly. Specifically, the modified
algorithm no longer performs a gradient descent search. Rather, the adjustments applied
to the weights are based on (1) the partial derivatives of the error surface with respect to
the weights, and (2) estimates of the curvatures of the error surface at the current
operating point in weight space along the various weight dimensions.
Let $\eta_{ij}(n)$ denote the learning rate assigned to the weight $w_{ij}$ at iteration n for both
hidden and output layers. The learning-rate update rule is defined as follows:
where $0 < \gamma < 1$ is a positive constant called the control step-size parameter for the
learning-rate adaptation procedure. The partial derivatives $\partial E^{(n)} / \partial w_{ij}^{(n)}$ and
$\partial E^{(n-1)} / \partial w_{ij}^{(n-1)}$ refer to the derivative of the error surface with respect to the weight
$w_{ij}$ at iterations n and n-1 respectively. It can be observed that when the partial
derivative has the same algebraic sign on two consecutive iterations, the adaptation
procedure increases the learning rate for the weight $w_{ij}$. Correspondingly, the learning
along that direction will be fast. When the derivative alternates in sign on two consecutive
iterations, the adaptation procedure decreases the learning rate for the weight $w_{ij}$.
Consequently, the learning along that direction will be slow.
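
The adaptation behaviour described above can be illustrated with a minimal sketch. It assumes that the learning-rate increment is the product of the two consecutive partial derivatives scaled by the control step-size parameter; this form matches the description but is an assumption, not a formula taken from this text. The function name, the value 0.05 for the step size, and the clipping bounds are hypothetical.

```python
import numpy as np

def adapt_learning_rates(eta, grad_now, grad_prev, gamma=0.05,
                         eta_min=1e-6, eta_max=1.0):
    """Per-weight learning-rate update (assumed product form).

    eta, grad_now and grad_prev are arrays with one entry per weight.
    The increment is positive when the two gradients share a sign and
    negative when the sign alternates. Clipping is an added safeguard.
    """
    delta_eta = gamma * grad_now * grad_prev
    return np.clip(eta + delta_eta, eta_min, eta_max)

# Two weights starting from the same learning rate (hypothetical values).
eta = np.full(2, 0.05)
grad_prev = np.array([0.4, 0.4])    # derivatives at iteration n-1
grad_now = np.array([0.5, -0.5])    # derivatives at iteration n
new_eta = adapt_learning_rates(eta, grad_now, grad_prev)
# First weight: same sign, rate grows. Second weight: sign flips, rate shrinks.
```

In this sketch the first weight keeps its gradient sign, so its rate rises from 0.05 to 0.06, while the second weight changes sign and its rate falls to 0.04.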
Many parameters of the network can be adjusted during training to provide optimal
performance. Unfortunately, a systematic method for selection of the most appropriate
parameters does not exist. Thus construction of neural networks typically requires a trial
and error approach. Based on many trials, we determined the optimal settings of the
control parameters to be:
When the initial learning rate $\eta$ has the value of 0.01, the learning takes twice as
much time, while when $\eta$ is 0.1, the convergence becomes unstable in some regions.
A momentum $\alpha$ of 0.9 accelerates the learning most, but only to the extent that the
network can learn without an increase of the error function. If the control step-size
parameter $\gamma$ has the value of 0.1, the learning rate grows too fast to ensure stable
convergence. On the other hand, the learning rate decreases too slowly when $\gamma$ is 0.02.
Thus the above optimal settings of the control parameters result in a near-optimal
learning rate for the local terrain.
Figure 4.1 Architecture graph of a multilayer neural network with two hidden layers
Figure 4.2 A neuron model (labelled parts: signals, synaptic weights, activation)
Figure 4.3 Sigmoid activation function
Figure 4.4 Illustration of the directions of two passes:
Forward propagation pass and Back-propagation pass
Figure 4.6 Scheme 1: incremental updating flowchart
[Flowchart steps: initialize the weights w and v; start a new training cycle with E = 0; start a new training step; feed input x_i and compute each layer's output; accumulate the error function E; adjust the weights of the output layer; adjust the weights of the hidden layer; if more samples remain, start the next training step.]
Figure 4.7 Scheme 2: batch updating flowchart
[Flowchart steps: initialize the weights w and v; start a new training cycle; start a new training step; feed input x_i and compute each layer's output; accumulate the error function E; repeat for the remaining samples; once no samples remain, adjust the weights of the output layer and then the weights of the hidden layer.]
Figure 4.8 The neural network used for non-linear mapping
Note: The network has 19 input layer nodes, 24 hidden layer nodes, and 2 output layer nodes.
Table 4.1 Performance comparison of hidden layers of different sizes
Number of hidden nodes    20    24    26
CHAPTER FIVE:
BEARING DEFECT DIAGNOSIS
5.1 Experimental Studies
The developed method is applied to diagnose defects in the tapered roller
bearings used in railroad freight cars. To train the neural network for the non-linear
mapping, we used a total of 115 samples with known defect information, operated
under various conditions such as different loads and speeds. Severity of the defects is also
reflected by single vs. multiple spalls. Since the present work focuses on bearing
failure due to fatigue spalls, we decided to use samples representing the following
conditions:
Table 5.1. Bearing conditions represented with class numbers

Class Number    Bearing Condition
1               Good Bearing
2               Single Cup Spall
3               Multiple Cup Spalls (Figure 5.1)
4               Single Cone Spall
5               Multiple Cone Spalls (Figure 5.2)
6               Broken Roller (Figure 5.3)

These data were provided through NRC by the Association of American Railroads
(AAR). A bearing test rig has been set up in the Transportation Technology Center (TTC)
of the AAR. Figure 5.4 illustrates the laboratory roller bearing test rig. The roller bearing
mounted in the test rig is shown in Figure 5.5. The test bearings used in the
laboratory tests include both AP Class E (6 x 11) 70-ton capacity bearings and AP Class
F (6 1/2 x 12) 100-ton capacity bearings. The component dimensions of these two types
of bearings are described in Table 5.2.
Each AP class bearing (E and F) is embedded with defects of different types as listed
in Table 5.1. Experiments are performed with two separate radial loads representing
an empty and a fully loaded freight car:
Type E bearings: 8,000 lb. and 27,500 lb.
Type F bearings: 8,000 lb. and 33,000 lb.
Each test is conducted at different train speeds ranging from 30 to 80 miles per hour
(MPH) in increments of 10 miles per hour.
Test data are collected from acoustic sensors and accelerometers in parallel for all
bearings under test. Analog signals are digitized with a sampling rate of 270 kHz. The
digital signals are stored in files, each of which contains 540,000 points representing 2
seconds of signal collection time. Tachometers are also used to measure the exact shaft
rotation speed to provide a reference for synchronization.
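
As a quick consistency check of the acquisition figures above, the following sketch recomputes the number of points per file and enumerates the load and speed combinations. It assumes that every load was run at every speed, which the text suggests but does not state outright; the variable names are illustrative.

```python
SAMPLING_RATE_HZ = 270_000
RECORD_SECONDS = 2
points_per_file = SAMPLING_RATE_HZ * RECORD_SECONDS

# Train speeds from 30 to 80 MPH in 10 MPH steps.
speeds_mph = list(range(30, 81, 10))

# Radial loads (lb) for an empty and a fully loaded car, per bearing class.
radial_loads_lb = {"E": (8_000, 27_500), "F": (8_000, 33_000)}

# Assumed full factorial test matrix: every load at every speed.
test_matrix = {
    bearing: [(load, speed) for load in loads for speed in speeds_mph]
    for bearing, loads in radial_loads_lb.items()
}

print(points_per_file)         # 540000
print(len(speeds_mph))         # 6
print(len(test_matrix["E"]))   # 12
```

The product of the sampling rate and the record length reproduces the 540,000 points per file quoted above.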
5.2 Feature Selection
The obtained vibration signals are processed and analyzed through time domain,
frequency domain and segmentation analyses. A total of nineteen feature
parameters are calculated for the measured signals. Time domain parameters include Root
Mean Square value (Rv), Peak value (Pk), Crest factor (Cf), Kurtosis value (Kv),
Clearance factor (Clf) and Impulse factor (If). They can be used to indicate either the
severity of the bearing defects or the spikiness of the vibration amplitude associated with
the defect-induced impulses.
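
The six time domain parameters can be sketched as follows. The definitions used here are the common textbook ones (for example, crest factor as peak over RMS) and are an assumption; the exact definitions adopted in this work are given elsewhere in the thesis and may differ in detail. The function name is illustrative.

```python
import numpy as np

def time_domain_features(x):
    """Return Rv, Pk, Cf, Kv, Clf and If for a 1-D signal.

    Assumed definitions:
      Rv  = root mean square value
      Pk  = maximum absolute amplitude
      Cf  = Pk / Rv
      Kv  = fourth central moment / squared variance
      Clf = Pk / (mean of sqrt(|x|))**2
      If  = Pk / mean(|x|)
    An all-zero signal would cause a division by zero and is not handled.
    """
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    rms = np.sqrt(np.mean(x ** 2))
    peak = abs_x.max()
    centered = x - x.mean()
    kurtosis = np.mean(centered ** 4) / np.mean(centered ** 2) ** 2
    return {
        "Rv": rms,
        "Pk": peak,
        "Cf": peak / rms,
        "Kv": kurtosis,
        "Clf": peak / np.mean(np.sqrt(abs_x)) ** 2,
        "If": peak / np.mean(abs_x),
    }

# A smooth sine versus the same sine with a few added impulses.
t = np.arange(1000) / 1000.0
smooth = np.sin(2 * np.pi * 5 * t)
impulsive = smooth.copy()
impulsive[::200] += 5.0          # five synthetic defect-like spikes

f_smooth = time_domain_features(smooth)
f_impulsive = time_domain_features(impulsive)
```

For the pure sine, the crest factor is about 1.41 and the kurtosis about 1.5; adding the spikes raises both, which is the "spikiness" these parameters are meant to capture.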
Frequency index (Fi) is a parameter extracted from the frequency domain, proposed to
highlight significant frequency content that may be associated with the bearing defect
characteristic frequencies [Sun, et al., 1998]. Although the defect characteristic
frequencies could be used to help determine the location of the defect, automatic
detection of impulses at these frequencies is not a simple task. This is because frequency
spectra often show much stronger peaks at much higher frequencies, representing
high-order structural resonances, than at the characteristic frequencies. Vibrational
energy of the bearing spreads across a wide frequency band and can easily be buried in
the noise. Figure 5.6 shows the frequency spectrum of a bearing with an outer race defect.
The dominant frequency can be seen to be around 4300 Hz, which is far beyond the range
of the roller passing outer race frequency as shown in Table 2.1. No explicit relation
between the spectrum and the defect characteristic frequencies can be constructed in this
case. Therefore, it is not advisable to depend solely on the frequency spectrum, which
necessitates pattern recognition analysis for more reliable diagnosis.
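
The masking effect described above can be reproduced with a small synthetic example. The sketch below is not the processing used in this work: the 100 Hz "defect" frequency and both amplitudes are hypothetical, and only the 270 kHz sampling rate and the 4300 Hz resonance are taken from the text.

```python
import numpy as np

def amplitude_spectrum(x, fs):
    """One-sided amplitude spectrum of a real signal sampled at fs (Hz).

    The 2/n scaling gives sine amplitudes directly for all bins except
    the DC and Nyquist bins.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    amplitudes = np.abs(np.fft.rfft(x)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amplitudes

fs = 270_000                      # sampling rate from the text
n = 27_000                        # 0.1 s of signal, 10 Hz resolution
t = np.arange(n) / fs
defect_hz = 100.0                 # hypothetical characteristic frequency
resonance_hz = 4300.0             # dominant peak reported for Figure 5.6
x = 0.1 * np.sin(2 * np.pi * defect_hz * t) + np.sin(2 * np.pi * resonance_hz * t)

freqs, amplitudes = amplitude_spectrum(x, fs)
dominant_hz = freqs[np.argmax(amplitudes)]
defect_amplitude = amplitudes[np.argmin(np.abs(freqs - defect_hz))]
```

Picking the largest spectral peak returns the resonance near 4300 Hz, while the component at the characteristic frequency is ten times weaker and would be harder still to find once noise is added.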
Segmentation analysis is applied to characterize non-stationary signals through
segmenting the signal into quasi-stationary components based on the understanding of
bearing dynamics. Impulses can only be generated from the passage of defects when they
are inside the load zone. Defects located on different bearing components will generate
impacts with different frequencies and modulation patterns when passing through the
load zone. Correlation exists between the location of the defects and the impulse patterns
observed in the vibration signal. We decided to divide the signal in one shaft or cage
revolution into six segments, so that at least one segment will be completely inside the
load zone and one completely outside of the load zone. Segmentation parameters are
determined based on the calculation of the standard deviation of the time domain parameters
in the various segments, using the cage frequency and the shaft frequency respectively. A segmented
vibration signal obtained from a bearing with an inner race defect and the spectra of each
segment are illustrated in Figure 5.7. It is obvious that the peaks in the spectra of the last
two segments are much more dominant than those in the other four segments, which
corresponds to the impulse generating region in the time domain waveform.
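
A segmentation parameter of this kind can be sketched as the standard deviation of one time domain parameter across the six segments of a revolution. The sketch assumes that the samples of one shaft or cage revolution have already been cut out of the record (for example, from the tachometer reference) and uses the population standard deviation; both choices, and the function names, are assumptions.

```python
import numpy as np

def rms(x):
    return np.sqrt(np.mean(x ** 2))

def segmentation_parameter(revolution, feature, n_segments=6):
    """Standard deviation of `feature` over equal-length segments.

    revolution: samples covering one shaft or cage revolution.
    feature:    function mapping a segment to a scalar (e.g. rms).
    """
    segments = np.array_split(np.asarray(revolution, dtype=float), n_segments)
    values = np.array([feature(segment) for segment in segments])
    return values.std()

rng = np.random.default_rng(0)
uniform_rev = rng.normal(0.0, 0.1, 600)   # no load-zone modulation
modulated_rev = uniform_rev.copy()
modulated_rev[400:] *= 5.0                # last two segments are "louder"

sp_uniform = segmentation_parameter(uniform_rev, rms)
sp_modulated = segmentation_parameter(modulated_rev, rms)
```

A signal whose level is the same in every segment gives a small value, whereas a signal that is strong only while the defect is inside the load zone gives a much larger one.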
5.3 Results of the Artificial Neural Network
Nineteen parameters are first calculated for each measured signal of the 115 samples
to form the feature space, as listed in Table 5.3. These parameters are then normalized
and used as input to train the neural network to perform the non-linear mapping as
discussed in chapter 4. Before training, it is often useful to scale the inputs and targets so
that they always fall within a specified range. This preprocessing is helpful for efficient
and stable behavior of the training process. We choose to scale all numbers such that they
fall into the range of a sigmoid function, i.e., between 0 and 1. The minimum and
maximum values of each feature parameter for the total 115 samples, that is, the
minimum and maximum of each column in Table 5.3, are used to normalize the column
into the sigmoid range. These values are also exploited in normalizing the test data for
diagnosis as detailed later. Each training data set consists of the normalized nineteen
input parameters and the specified cluster centers as target outputs of the network as
shown in Table 5.4.
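
The column-wise scaling described above can be sketched as a min-max normalization whose limits are fitted on the training samples and then reused for test data. The function names and the small example matrix are illustrative only.

```python
import numpy as np

def fit_min_max(train_features):
    """Column-wise minimum and maximum of the training feature matrix."""
    train_features = np.asarray(train_features, dtype=float)
    return train_features.min(axis=0), train_features.max(axis=0)

def apply_min_max(features, col_min, col_max):
    """Scale each column with the training limits (no clipping applied).

    A column that is constant over the training set would divide by zero
    and would need separate handling.
    """
    return (np.asarray(features, dtype=float) - col_min) / (col_max - col_min)

# Tiny stand-in for the training table: 3 samples, 2 feature columns.
train = np.array([[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]])
col_min, col_max = fit_min_max(train)
train_scaled = apply_min_max(train, col_min, col_max)

# Test data reuse the training limits, so values may leave [0, 1].
test = np.array([[2.5, 35.0]])
test_scaled = apply_min_max(test, col_min, col_max)
```

Training samples land exactly in the range 0 to 1, while a test sample outside the training range (35 against a maximum of 30 here) maps slightly outside it, to 1.25 in this example.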
An error criterion of 0.01 was chosen through a trial and error approach. The
network training was pursued for 9,000 iterations, at which point the error criterion was reached.
The actual outputs at the end of training are compared with the target outputs and listed in
Table 5.5. The averaged error function $E_{av}$ of the trained network is calculated to be 0.009,
also shown in the table, which further confirms that the learning has converged to the
expected criterion. If the error criterion is chosen to be 0.005, the network converges after
16,000 iterations and this leads to only a 5% reduction of the error function $E_{av}$. An error
criterion of 0.02 was also tested; the network learning converged after 6,000 iterations.
However, with the value of $E_{av}$ being 0.019, the samples belonging to different classes
were not well clustered in the classification space and showed some overlap.
Feature extraction without segmentation parameters from the same bearing was
performed for comparison with the developed method. The same experimental data were
used for the pattern recognition. The learning process took the same network more than
40,000 iterations to converge to the same error criterion.
After the network training was complete, the actual network outputs were plotted on
a 2D space and the mapping result is shown in Figure 5.8, where the Arabic numbers
represent different bearing conditions as listed in Table 5.1. The black dots in the
classification space represent the designated cluster centers. Although their arrangement
was somewhat arbitrary, we placed them evenly on a unit circle in the first quadrant of a
2D space as shown in Figure 5.9. There were three reasons for this configuration. Firstly,
a unit circle was chosen so that the outputs will fall into the range of a sigmoid function,
that is, between 0 and 1. Secondly, a larger circle does not necessarily lead to a better
clustering effect. In fact, although the between-class distance may increase with the
diameter, the within-class distance may also increase. Consequently, class separability
will not be improved. Thirdly, when the cluster centers were arrayed on a unit
square in the first quadrant of a 2D space as shown in Figure 5.10, the convergence of the
network took longer. Also, some regions were left unexplored because the mapped samples
could not distribute evenly in the first quadrant. Finally, the coordinates of the cluster
centers, that is, the desired outputs $[x_c, y_c]$, are chosen to be:
It can be observed from Figure 5.8 that samples belonging to different classes are
separated in different regions and clustered around their own pre-defined cluster centers
in the classification space. The neural network has successfully performed the high
fidelity dimension reducing non-linear mapping. The intra-class transformation [Sun, et
al., 1998] is eliminated thereby. Simple piecewise linear classifications can then be
applied to partition the classification space.
5.4 Classification
Once the sample data have been transformed from the feature space to the
classification space with high fidelity, they are ready to be classified. For the present
study, we used a distribution-free classification method due to the deficient knowledge of
the bearing defect distribution. Discriminant functions are used to partition the
classification space.
Consider K classes $S_1, \ldots, S_k, \ldots, S_K$ with defining prototypes $y_m^{(k)}$ for each class, $m =
1, \ldots, M_k$. The discriminant function is defined such that for any point z belonging to $S_k$,
there exists a function $g_k(z)$ such that
$$g_k(z) > g_j(z) \quad \forall z \in S_k \ \text{and} \ \forall j \neq k \qquad (5.1)$$
In other words, within the region $S_k$, the kth discriminant function will have the
largest value. For linearly separable patterns, it is convenient to use piecewise linear
discriminant functions. If we define the distance of a point z to a class $S_k$ to be the
distance from the closest prototype point in $S_k$, that is,
We could define the above to be the discriminant function. Therefore, the decision will be
made based on the smallest distance from a point in the classification space to any
class.
Mathematically, this can be written as:
Accordingly, the discriminant function can be defined as:
Apparently, in a 2D space, boundaries are defined where two functions $g_k(z)$ and
$g_j(z)$ both become maximum and
$$g_k(z) = g_j(z) \qquad (5.5)$$
Figure 5.11 shows patterns of the 123 samples in the classification space. Boundaries
are generated as described above. Once the classification space is constructed, it can be used
for diagnosis.
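
The minimum-distance decision rule of this section can be sketched as follows. The sketch assumes the distance to a class is $d(z, S_k) = \min_m \lVert z - y_m^{(k)} \rVert$ and uses the piecewise linear form $g_k(z) = \max_m [z^T y_m^{(k)} - \frac{1}{2} y_m^{(k)T} y_m^{(k)}]$; maximizing this form is algebraically the same as minimizing the squared distance, but whether it is the exact expression used in this work is an assumption. The prototype coordinates and function names are hypothetical.

```python
import numpy as np

def distance_to_class(z, prototypes):
    """Distance from point z to the closest prototype of one class."""
    return np.min(np.linalg.norm(prototypes - z, axis=1))

def linear_discriminant(z, prototypes):
    """Piecewise linear g_k(z) = max_m (z . y_m - 0.5 * y_m . y_m)."""
    return np.max(prototypes @ z - 0.5 * np.sum(prototypes ** 2, axis=1))

def classify_by_distance(z, classes):
    return min(classes, key=lambda k: distance_to_class(z, classes[k]))

def classify_by_discriminant(z, classes):
    return max(classes, key=lambda k: linear_discriminant(z, classes[k]))

# Hypothetical prototypes in the 2-D classification space.
classes = {
    1: np.array([[1.0, 0.0], [0.9, 0.1]]),
    2: np.array([[0.0, 1.0], [0.1, 0.9]]),
    3: np.array([[0.7, 0.7]]),
}
z = np.array([0.8, 0.3])
label_distance = classify_by_distance(z, classes)
label_discriminant = classify_by_discriminant(z, classes)
```

Both rules assign the example point to class 1. The boundary between two classes is the set of points where their two largest discriminant values are equal, which in 2D is a chain of straight line segments.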
5.5 Diagnosis
After completing the pattern classifier, a total number of 31 test data (not used in
training) from bearings with defects of different types as listed in Table 5.1 were used to
test the effectiveness of the developed method. They were taken under different loads and
at different speeds.
By calculating the time domain, frequency domain and segmentation parameters, a
point can be located in the feature space for each measured signal. Table 5.6 shows the
calculated feature parameters for the 31 test data. The minimum and maximum values
used to normalize the feature parameters of training samples are also adopted to
normalize those of the test data. The normalized parameters are listed in Table 5.7 and
fed through the trained neural network. The network outputs corresponding to each
measured signal were plotted in the classification space, denoted by different symbols, as
shown in Figure 5.11. The results show that all the testing data were correctly recognized.
The developed method is very effective in bearing defect diagnosis.
Figure 5.1 Multiple cup spalls
Figure 5.2 Multiple cone spalls
Figure 5.6. Frequency spectrum versus bearing defect characteristic frequencies
Figure 5.5 Roller bearing mounted in test rig
Figure 5.7a. A time/frequency display of the signal in one revolution
(panels: time domain waveform in one revolution divided into six segments; frequency spectrum of the waveform in one revolution)
Figure 5.8 Result of nonlinear mapping using neural networks
("1" - Good Bearing, "2" - Single Cup Spall, "3" - Multiple Cup Spalls, "4" - Single Cone Spall, "5" - Multiple Cone Spalls, "6" - Broken Roller)
Figure 5.9 Cluster centers evenly spaced on a unit circle
Figure 5.10 Cluster centers arrayed on a unit square
Table 5.2 Bearing component dimensions

Item Description      Unit      Class E        Class F
Size Designation      inches    6 x 11         6 1/2 x 12
Typical Carload       tons      70             100
Number of Rollers     num       [illegible]    23

Remaining rows (values illegible in the source scan): Roller Length, Roller Diameter,
Roller Pitch Diameter, Cone Bore (Diameter), Cup OD, (D-d)/2 and Bearing Width (all in
inches); 1/2-included Cup Angle (deg); Cos(Angle).
CHAPTER SIX:
CONCLUSION AND FUTURE WORK
6.1 Summary of Results Obtained
The signal processing and pattern recognition techniques described in chapters 3, 4,
and 5 were applied to vibration signals obtained from rolling element bearings used in
railway freight cars.
Vibration signals obtained from bearings were processed and analyzed through time
domain, frequency domain and segmentation analyses. Time domain parameters include
Root Mean Square value (Rv), Peak value (Pk), Crest factor (Cf), Kurtosis value (Kv),
Clearance factor (Clf) and Impulse factor (If). They provide information such as the
spikiness and the energy level of the vibration signals. Frequency index (Fi) is the
parameter extracted from the frequency domain, proposed to highlight significant frequency
content that may be associated with the bearing defect characteristic frequencies.
Earlier work on pattern recognition for bearing defect diagnosis using these parameters
showed promising results and was simple to implement [Sun et al., 19981.
The sensitivity and reliability of the pattern recognition analysis are further improved
by including the newly developed segmentation parameters. Since the vibration signal of
a bearing with defects is generally non-stationary, segmentation analysis can be applied
to describe such a signal through segmented quasi-stationary
components. Based on the observation that impulses can only be generated from the
passage of the defect contacting the mating surfaces under load, vibration signals present
certain patterns associated with defects inside or outside of the bearing load zone. Defects
located on different bearing components will generate impacts when passing through the
load zone with different frequencies and modulation patterns. A correlation exists
between the variation pattern of signals and the location of defects on the bearing
components, and impulses modulated with the shaft or cage frequency can be detected.
For the bearings studied in this work, the radial loads cause a stress distribution
over an angle range of about 120 degrees. A fixed-length segmentation scheme is used in
order to reduce the computational expense of the process. Signals in one shaft revolution
and cage revolution are evenly divided into six segments respectively so that at least one
segment will be completely inside the load zone and at least one segment can be
completely outside of the load zone. Descriptive features of these signal segments can be
calculated through time domain parameters. Segmentation parameters are thus
determined based on the calculation of standard deviation of the time domain parameters
among six segments. They can directly reflect the variation of vibration patterns
associated with the bearing load zone. The segmentation parameters referenced in both
shaft and cage rotations are employed to participate in constructing the feature space.
Segmentation parameters, together with the existing time and frequency domain
parameters, are used to construct the feature space. Thus the final feature space is
composed of 19 dimensions. A three-layered artificial neural network is used to
accomplish the nonlinear mapping from the 19-dimensional feature space to the
2-dimensional classification space. Artificial neural networks allow us to construct
complicated non-linear relations between input and output when analytical description is
not available.
The ANN is chosen to have three layers since three-layered networks are
sufficient for representing the non-linear relations between the input and output and they
have relatively simple architecture. The error back-propagation algorithm is used to train
the neural network. The same sigmoid activation function in the form of a logistic
function is chosen for all the hidden and output neurons of the network as it allows the
network to learn non-linear relationships between input and output vectors. The nineteen
feature parameters extracted from vibration signals are fed as inputs to the network and
the output contains the corresponding coordinates in the 2D classification space. The
cluster centers are evenly spaced on a unit circle in the first quadrant of a 2D space in
order to locate the desired outputs associated with each of the classes. The neural network
is trained with known input sets each of which consists of 19 parameters and the
corresponding desired outputs that are the specified cluster centers. The optimal
architecture of the network, obtained through a trial and error approach, is 19-24-2. A
heuristic technique, variable learning rate, is adopted in order to accelerate the
convergence of back-propagation learning. A momentum term is also incorporated to help
speed up convergence and achieve a more efficient and reliable learning profile.
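
The training set-up summarized above can be illustrated with a minimal sketch of a 19-24-2 network with logistic activations, trained by batch back-propagation with a momentum term. It is not the implementation used in this work: it uses a fixed learning rate instead of the variable one, adds bias terms, trains on synthetic data, and places the target centers at equal angles from 0 to 90 degrees on the unit circle as one plausible reading of "evenly spaced". All names and numerical settings other than the layer sizes and the momentum of 0.9 are assumptions.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_mlp(x, d, n_hidden=24, eta=0.1, alpha=0.9, epochs=2000, seed=0):
    """Batch back-propagation with momentum for one hidden layer."""
    rng = np.random.default_rng(seed)
    n = len(x)
    xb = np.hstack([x, np.ones((n, 1))])                    # inputs plus bias
    w = rng.uniform(-0.5, 0.5, (xb.shape[1], n_hidden))     # input -> hidden
    v = rng.uniform(-0.5, 0.5, (n_hidden + 1, d.shape[1]))  # hidden -> output
    dw_prev, dv_prev = np.zeros_like(w), np.zeros_like(v)
    history = []
    for _ in range(epochs):
        # Forward pass.
        y = sigmoid(xb @ w)
        yb = np.hstack([y, np.ones((n, 1))])
        o = sigmoid(yb @ v)
        err = d - o
        history.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))
        # Backward pass: local gradients of output and hidden layers.
        delta_o = err * o * (1.0 - o)
        delta_h = (delta_o @ v[:-1].T) * y * (1.0 - y)
        # Weight changes: gradient step plus momentum on the previous change.
        dv = eta * (yb.T @ delta_o) / n + alpha * dv_prev
        dw = eta * (xb.T @ delta_h) / n + alpha * dw_prev
        v += dv
        w += dw
        dv_prev, dw_prev = dv, dw
    return w, v, history

# Synthetic stand-in data: 6 classes, 10 samples each, 19 features in [0, 1].
rng = np.random.default_rng(1)
n_classes, per_class = 6, 10
class_means = rng.uniform(0.0, 1.0, (n_classes, 19))
x = np.repeat(class_means, per_class, axis=0)
x = x + rng.normal(0.0, 0.02, x.shape)

# Assumed target centers on the unit circle in the first quadrant.
angles = np.linspace(0.0, np.pi / 2, n_classes)
centers = np.column_stack([np.cos(angles), np.sin(angles)])
d = np.repeat(centers, per_class, axis=0)

w, v, history = train_mlp(x, d)
```

The list `history` holds the averaged error after each pass over the data, so comparing its last entry with a chosen error criterion gives a stopping test of the kind described above. The iteration counts of this toy example are not comparable with those reported in the thesis.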
A total of 115 samples with known defect information, operated under
various conditions such as different loads and speeds, are used to train the network.
Severity of the defects is also reflected by single vs. multiple spalls. The network training
was pursued for 9,000 iterations, at which point an error criterion of 0.01 was reached. Feature
parameters without the segmentation parameters, extracted from the same bearing, were
also investigated. It took the same network more than 40,000 iterations to converge to the
same error criterion. The mapping result shows that the corresponding outputs in the
classification space are completely separated in different regions and clustered around the
prescribed centers associated with bearings in different conditions. The artificial neural
network has successfully performed the high fidelity dimension reducing non-linear
mapping. The intra-class transformation is eliminated thereby. Successful mapping
allows application of simple piecewise linear boundaries.
A total of 31 test data (not used in training) from bearings with defects of
different types as listed in Table 5.1 were used to test the effectiveness of the developed
method. They are taken from bearings under different operating conditions. By
calculating the time domain, frequency domain and segmentation parameters of the
signals, these samples can be located in the feature space. Fed through the trained neural
network, each test sample is identified on the classification space. The classification
results show that they are all correctly recognized.
In summary, the developed method based on pattern recognition analysis has
improved the sensitivity and reliability in bearing fault diagnosis by including the
segmentation parameters. The successful non-linear mapping through the neural network
eliminates the intra-class transformation process. The result shows the method is simple and
effective. It is suitable for the development of automatic monitoring and diagnostic
systems.
6.2 Limitations of the Present Method and Directions for Future Work
Although the present method has accurately recognized bearing conditions, the
results were obtained using experimental data. In actuality, we deal with bearings
mounted on moving trains. Vibration signals obtained from this environment are
expected to have different characteristics from those obtained from a test rig in the lab.
Future work will be directed towards investigating the reliability of the existing method
for diagnosis in that environment. Improvement, if any, needs to be made to further increase the sensitivity of the
method to non-defect related characteristics.
Also in this work, we focused on detecting and diagnosing bearing defects caused
by fatigue spalling as it is the predominant bearing failure mode. Further studies need to
be conducted to include other bearing failure modes.
REFERENCES
R. J. Alfredson and J. Mathew, "Frequency domain methods for monitoring the condition
of rolling element bearings". Mechanical Engineering Transactions, Vol. ME10, No. 2,
The Institution of Engineers, Australia, July 1985(a), pp 108-112.
R. J. Alfredson and J. Mathew, "Time domain methods for monitoring the condition of
rolling element bearings". Mechanical Engineering Transactions, Vol. ME10, No. 2, The
Institution of Engineers, Australia, July 1985(b), pp 102-117.
I. E. Alguindigue, Anna Loskiewicz-Buczak, and Robert E. Uhrig, "Monitoring and
Diagnosis of Rolling Element Bearings Using Artificial Neural Networks". IEEE
Transactions on Industrial Electronics, VOL. 40, NO. 2, pp. 209-217, April 1993.
I. E. Alguindigue and Robert E. Uhrig, "Automatic Fault Recognition in Mechanical
Components Using Coupled Artificial Neural Networks". IEEE International Conference
on Neural Networks, Part 5 (of 7), Jun 27-29, VOL. 5, 1994, Orlando, FL, USA, pp
3312-3317.
G. B. Anderson, J. E. Cline, D. H. Stone, and R. L. Smith, "A New Detection Technique
to Identify Defective Railroad Bearings". American Society of Mechanical Engineering,
Rail Transportation Division (Publication) RTD Rail Transportation, Proceedings of the
1996 ASME International Mechanical Engineering Congress and Exposition, Nov 17-22,
1996, Vol. 12.
Y. A. Azovtsev, A. V. Barkov, and I. A. Yudin, "Automatic diagnostics and condition
prediction of rolling element bearing using enveloping methods". Vibration Institute 18th
Annual Meeting, June 20-23, 1994.
D. C. Baillie and J. Mathew, "A Comparison of Autoregressive Modeling Techniques
for Fault Diagnosis of Rolling Element Bearings". Mechanical Systems and Signal
Processing, VOL. 10, NO. 1, Jan 1996, pp 1-17.
R. H. Bannister, "A review of rolling element bearing monitoring techniques".
I.Mech.E. Conference on Condition Monitoring of Machinery and Plant, 1985, pp 11-24.
B. G. Batchelor, "Pattern Recognition". Plenum Press, New York, 1978.
G. Bodenstein and H. M. Praetorius, "Feature extraction from the electroencephalogram
by adaptive segmentation". Proceedings of the IEEE, Vol. 65, No. 5, pp 642-652, May 1977.
S. Braun and B. Datner, "Analysis of roller/ball bearing vibrations". ASME J.
Mechanical Design, 1979, 101, pp. 118-125.
Y. Chiou, Massoud S. Tavakoli, and Steven Liang, "Bearing fault detection based on
multiple signal features using neural network analysis". Proc. 10th Int. Modal Analysis
Conf., San Diego, CA, 1992, pp. 60-64.
H. C. Choe, Yulun Wan, and Andrew K. Chan, "Neural Pattern Identification of Railroad
Wheel-bearing Faults from Audible Acoustic Signals: Comparison of FFT, CWT, and
DWT features". Proceedings of SPIE - The International Society for Optical Engineering,
Wavelet Applications IV, VOL. 3078, April 22-24, 1997, Orlando, FL, USA, pp 480-496.
J. Courrech, "New techniques for fault diagnosis in rolling element bearings".
Proceedings of the 40th Mechanical Failures Prevention Group, Maryland, April 1985, pp
83-91.
G. Cybenko, "Approximation by superpositions of a sigmoidal function". University of
Illinois, Urbana, 1988.
D. Dyer and R. M. Stewart, "Detection of rolling element bearing damage by statistical
vibration analysis". Transactions of the American Society of Mechanical Engineers,
Journal of Mechanical Design, Vol. 100, April 1978, pp 229-235.
R. L. Eshleman, "The role of sum and difference frequencies in rotating machinery fault
diagnosis". Proceedings of the 2nd International Conference on Vibrations in Rotating
Machinery, I.Mech.E. paper C272/80, 1980, pp 145-149.
R. L. Florom, "Improved wayside train inspection program". Town Hall Meeting,
Association of American Railroads, Chicago Technical Center, Chicago, Illinois, June
15, 1994.
P. K. Gupta, "Transient ball motion and skid in ball bearings". Transactions of the
American Society of Mechanical Engineers, Journal of Lubrication Technology, April
1975, pp 261-269.
P. K. Gupta, "Dynamics of rolling element bearings Part I: Cylindrical Roller Bearing
Analysis". Transactions of the American Society of Mechanical Engineers, Journal of
Lubrication Technology, Vol. 101, July 1979(a), pp 293-304.
P. K. Gupta, "Dynamics of rolling element bearings Part II: Ball Bearing Analysis".
Transactions of the American Society of Mechanical Engineers, Journal of Lubrication
Technology, Vol. 101, July 1979(b), pp 305-311.
P. K. Gupta, "Dynamics of rolling element bearings Part III: Ball Bearing Analysis".
Transactions of the American Society of Mechanical Engineers, Journal of Lubrication
Technology, Vol. 101, July 1979(c), pp 312-318.
P. K. Gupta, "Some dynamic effects in high-speed solid-lubricated ball bearings".
Proceedings of the ASLE/ASME Lubrication Conference, New Orleans, Louisiana,
October 5-7, 1981.
P. K. Gupta, J. F. Dill, and H. E. Bandow, "Dynamics of rolling element bearings:
Experimental validation of the DREB and RAPIDREB computer programs".
Transactions of the American Society of Mechanical Engineers, Journal of Tribology,
Vol. 107, Jan 1985, pp 132-137.
O. G. Gustafsson and T. Tallian, "Detection of damage in assembled rolling bearings",
Transaction of the American Society of Lubrication Engineers, Vol. 5, 1962, pp 197-209.
L. G. Hampson, "Diagnostic checks for rolling bearings". Proceedings of a seminar
organized by the Institute of Mechanical Engineers, Rolling Element Bearings, Feb. 22,
1983.
Simon Haykin, "Neural networks - a comprehensive foundation". Macmillan College
Publishing Company, NJ, USA, 1994.
M. J. Hine, "Absolute ball bearing wear measurements from SSME turbopump dynamic
signals". Journal of Sound and Vibration, Vol. 128, No. 2, 1989, pp 321-331.
P. S. Houghton, "Ball and Roller Bearings". Applied Science Publishers Ltd, Ripple
Road, Barking, Essex, England.
I. M. Howard and G. W. Stachowiak, "Detection of surface defects in rolling contact
bearings". The Institution of Engineers, Australia, Mechanical Engineering Transactions,
Vol. 13, 1989, pp 158-164.
I. Howard, "A review of rolling element bearing vibration - detection, diagnosis and
prognosis", Defense Science and Technology Organization, Australia, October, 1994.
H. Huang and H.-P. Ben Wang, "An Integrated Monitoring and Diagnostic System for
Roller Bearings". The International Journal of Advanced Manufacturing Technology,
VOL. 12, No. 1, pp. 37-46, 1996.
T. Igarashi and H. Hamada, "Studies on the vibration and sound of defective rolling
element bearings (Third Report: Vibration of ball bearing with multiple defects)".
Bulletin of the Japanese Society of Mechanical Engineers, Vol. 28, No. 237, March 1985,
pp 492-499.
V. Jammu and Th. Walter, "Standoff Bearing Fault Detection Using Directional
Microphones and Unsupervised Neural Networks", Shock and Vibration Digest, VOL.
29, 1997, pp 17-25.
M. Karkoub and Ali Elkamel, "Modelling pressure distribution in a rectangular gas
bearing using neural networks". Tribology International, Vol. 30, No. 2, 1991, pp. 139-150.
J. E. Keba, "Component test results from the bearing life improvement program for the
space shuttle main engine oxidizer turbopumps". Proceedings of the 3rd International
Symposium on Rotating Machinery, Honolulu, HI, 1990, pp 303-318.
P. E. Keller, Richard T. Kouzes, Lars J. Kangas, Sherif Hashem, "Neural network based
sensor systems for manufacturing applications". Advanced Information Systems and
Technology Conference, Williamsburg, VA, USA, 28-30 March, 1994.
R. J. Kershaw, "Machine Diagnostics with Combined Vibration Analysis Techniques".
Proceedings of the 41st Meeting of the Mechanical Failures Prevention Group, October
1986, pp 160-168.
A. F. Khan, "Condition monitoring of rolling element bearings: A comparative study of
vibration based techniques". Ph.D. Thesis, University of Nottingham, May 1991.
S. Krishnan, "Adaptive Filtering, Modeling, and Classification of Knee Joint
Vibroarthrographic Signals". Masters Thesis, University of Calgary, Canada, 1996.
C. J. Li and S.M. Wu, "On-line detection of localized defects in bearings by pattern
recognition analysis". Transactions of the ASME, Journal of Engineering for Industry,
Nov. 1989, Vol. 111, pp 331-336.
C. Q. Li and C. J. Pickering, "Robustness and sensitivity of non-dimensional amplitude
parameters for diagnosis of fatigue spalling". Condition Monitoring and Diagnostic
Technology, Vol. 2, No. 3, January 1992, pp 81-84.
T. I. Liu and J. M. Mengel, "Detection of Ball Bearing Conditions by An A. I.
Approach". American Society of Mechanical Engineers, Production Engineering
Division (Publication) PED Sensors, Controls, and Quality Issues in Manufacturing
Winter Annual Meeting of ASME, Dec 1-6, Vol. 55, 1991, pp 13-21.
T. I. Liu and N. R. Iyer, "On-line Recognition of Roller Bearing States". Proceedings of
the 1992 Japan - USA Symposium on Flexible Automation, Part 1 (of 2), Jul 13-15,
1992, San Francisco, CA, USA, pp 257-262.
J. Mathew, "Machine condition monitoring using vibration analysis". Journal of the
Australian Acoustical Society, Vol. 15, No. 1, 1987, pp 7-13.
J. Mathew, "Monitoring the Vibrations of Rotating Machine Elements - An Overview".
The Bulletin of the Center of Machine Condition Monitoring, Monash University, Vol.
1, No. 1, 1989, pp 2.1-2.13.
J. Mathew and R. J. Alfredson, "The condition monitoring of rolling element bearings
using vibration analysis", ASME Transactions, Journal of Vibration, Acoustics, Stress
and Reliability in Design, Vol. 106, July 1984, pp. 447-453.
P. D. McFadden, "Condition monitoring of rolling element bearings by vibration
analysis". Proceedings of the Institution of Mechanical Engineers, Jan. 1990, pp 49-54.
P. D. McFadden and J. D. Smith, "Model for the vibration produced by a single point in a
rolling element bearing". Journal of Sound and Vibration, Vol. 96, No. 1, 1984(a), pp 69-
82.
P. D. McFadden and J. D. Smith, "Vibration monitoring of rolling element bearings by
the high frequency resonance technique - a review", Tribology International, Vol. 17,
No. 1, Feb. 1984(b), pp 3-10.
P. D. McFadden and J. D. Smith, "Model for the vibration produced by multiple point in
a rolling element bearing". Journal of Sound and Vibration, Vol. 98, No. 2, 1985, pp 263-
273.
S.W. McMahon, "Condition monitoring of bearing using ESP". Condition Monitoring
and Diagnostic Technology, Vol. 2, No. 1, July 1991, pp 21-25.
C. K. Mechefske, "Machine Condition Monitoring: Part 2 - the effects of noise in the
vibration signal". British Journal of NDT, Vol. 35, No. 10, Oct. 1993, pp 574-579.
C. K. Mechefske and J. Mathew, "Parametric Spectral Estimation to Detect and Diagnose
Faults in Low Speed Rolling Element Bearings". The Bulletin of the Center of Machine
Condition Monitoring, Monash University, Melbourne, Australia, pp 108-114, 1991.
C. K. Mechefske and J. Mathew, "Fault detection and diagnosis in low speed rolling
element bearings Part I: The use of parametric spectra". Mechanical Systems and Signal
Processing, Vol. 6, No. 4, 1992, pp 297-307.
D. Michael and J. Houchin, "Automatic EEG analysis: A segmentation procedure based
on the autocorrelation function", Electroenceph. Clin. Neurophysiol., Vol. 46, pp 232-235,
1979.
A. J. Mundin and A. J. Penter, "An on-line vibration analysis system for a marine gas
turbine". Proceedings of I.Mech.E. Conference on Vibrations in Rotating Machinery,
University of Bath, Sep.7-10, 1992, pp 441-449.
A. Naumann, "A neural network for well completion diagnosis in the petroleum
industry". 1990.
J. P. Peck, John Burrows, "On-line condition monitoring of rotating equipment using
neural networks". ISA Transactions 33, 1994, pp. 159-164.
F. J. Pineda, "Generalization of backpropagation to recurrent and higher order neural
networks". In Neural Information Processing Systems (D. Z. Anderson, ed.), 1988, pp.
602-611. New York: American Institute of Physics.
H. Prashad, M. Ghosh and S. Biswas, "Diagnostic monitoring of rolling-element bearings
by high-frequency resonance technique". ASLE Transactions, Vol. 28, No. 4, 1985, pp
439-448.
R. B. Randall, "Computer aided vibration spectrum trend analysis for condition
monitoring". Maintenance Management International, Vol. 5, 1985, pp 161-167.
R. B. Randall, "Introduction to condition monitoring". Journal of the Australian
Acoustical Society, Vol. 18, No. 1, 1990, pp 15-18.
M. Serridge, "What makes vibration condition monitoring reliable". Noise and Vibration
Worldwide, Sep. 1991, pp 17-24.
X. Z. Shi, Z. Q. Xu, and M. Xu, "A Study of the Automatic Recognition of Vibration
Signal for Ball Bearing Faults - the FFT-AR Feature Extraction and Classification
Method". Proceedings of IEEE International Workshop on Applied Time Series
Analysis, 1988.
1993 SKF Condition Monitoring, Inc. "Acceleration envelope in paper machines".
R. L. Smith, "Railcar bearing end-life failure distances and acoustical defect censuring
methods". Proceedings of the ASME Winter Annual Meeting, Chicago, Illinois, Nov. 27-
Dec. 2, 1988.
R. L. Smith, "Rolling element bearing diagnostics with lasers, microphones and
accelerometers". Proceedings of the 46th Meeting of the Mechanical Failures Prevention
Group, San Diego, California, April 1992, pp 43-52.
R. L. Smith and J. Bambara, "Acoustic detection of defective rolling element bearings".
Proceedings of the 43rd Meeting of the Mechanical Failures Prevention Group, San
Diego, California, Oct. 1988, pp 79-88.
R. L. Smith and T. J. Walter, "Machinery Fault Identification Using Microphones". pp
93-100, 1993.
Y. T. Su and S. J. Lin, "On initial fault detection of a tapered roller bearing: Frequency
domain analysis". Journal of Sound and Vibration, Vol. 155, No. 1, 1992, pp 75-84.
Y. T. Su, Y. T. Sheen, "On the detectability of roller bearing damage by frequency
analysis". Proceedings of the Institute of Mechanical Engineers, Part C: Journal of
Mechanical Engineering Science, Vol. 207, 1993, pp. 23-32.
M. Subrahmanyam and C. Sujatha, "Using Neural Networks for the Diagnosis of
Localized Defects in Ball Bearings". Tribology International, Vol. 30, No. 10, 1997, pp
739-752.
G. P. Succi, "Prognostic methods for bearing condition monitoring". Proceedings of the
International Machinery Monitoring and Diagnostics Conference, Las Vegas, Nevada,
Dec. 1991, pp 335-342.
Q. Sun, F. Xi, and G. Krishnappa, "Signature Analysis of Rolling Element Bearing
Defects", Proceedings of CSME Forum, pp. 423-429, Toronto, 1998.
Q. Sun, F. Xi, P. Chen, G. Krishnappa, "Bearing Condition Monitoring Through Pattern
Recognition Analysis", the 6th International Conference on Sound and Vibration,
Denmark, 1999.
N. Tandon, "A comparison of some vibration parameters for the condition monitoring of
rolling element bearings". Journal of The International Measurement Conference,
Dec. 1994, pp 285-289.
S. Tavathia, R. M. Rangayyan, G. D. Bell, K. O. Ladly, and Y. Zhang, "Analysis of knee
vibration signals using linear prediction". IEEE Transactions on Biomedical Engineering,
Vol. 39, No. 9, September 1992, pp. 959-970.
J. I. Taylor, "Identification of Bearing Defects by Spectral Analysis". Transactions of the
ASME, Journal of Mechanical Design, Vol. 102, April 1980, pp 199-204.
A. Unal, "Feature Article, Intelligent Diagnostics of Ball Bearings". The Shock and
Vibration Digest, Nov., pp. 9-12, 1994.
X. F. Wang, X. Z. Shi, and M. Xu, "The fault diagnosis and quality evaluation of ball
bearing by vibration signal processing". Proceedings of the International Machinery
Monitoring and Diagnostics Conference, Nov 1998, pp 318-321.
G. White, "Amplitude demodulation - a new tool for predictive maintenance". Sound and
Vibration, Sep. 1991, pp 14-19.
D. F. Wilcock and E. R. Booser, "Bearing design and application". McGraw-Hill Book
Company, Inc. Printed in the United States of America, 1957.
P. J. Werbos, "Beyond regression: New tools for prediction and analysis in the behavioral
sciences". Ph.D. Thesis, Harvard University, Cambridge, MA, USA, 1974.
C. Zhu, and F. W. Paul, "A Fourier Series neural network and its application to system
identification". Journal of Dynamic Systems, Measurement, and Control, Transactions of
the ASME, September 1995, Vol. 117, pp. 253-261.