MACHINE PROGNOSTICS BASED ON HEALTH STATE PROBABILITY ESTIMATION

Hack-Eun Kim

Master of Engineering (Mechanical)

Bachelor of Engineering (Material)

Thesis submitted in total fulfilment of the requirements of the degree of

Doctor of Philosophy

SCHOOL OF ENGINEERING SYSTEMS

FACULTY OF BUILT ENVIRONMENTAL ENGINEERING

QUEENSLAND UNIVERSITY OF TECHNOLOGY

2010

ABSTRACT

The ability to accurately predict the remaining useful life of machine components is critical for continuous machine operation, and can also improve productivity and enhance system safety. In condition-based maintenance (CBM), maintenance is performed based on information collected through condition monitoring and assessment of machine health. Effective diagnostics and prognostics are important aspects of CBM, allowing maintenance engineers to schedule a repair and acquire replacement components before the components actually fail. Although a variety of prognostic methodologies have been reported recently, their application in industry is still relatively new and mostly focused on predicting the degradation of specific components. Furthermore, they require a significant and sufficient number of fault indicators to prognose component faults accurately. Hence, prognostic methods that make sufficient use of health indicators to interpret the machine degradation process effectively are still required. Major challenges for accurate long-term prediction of remaining useful life (RUL) remain to be addressed. Therefore, continuous development and improvement of machine health management systems and accurate long-term prediction of machine remnant life are required for real industrial applications.

This thesis presents an integrated diagnostics and prognostics framework based on health state probability estimation for accurate, long-term prediction of machine remnant life. In the proposed model, prior empirical (historical) knowledge is embedded in the integrated diagnostics and prognostics system to classify impending faults in the machine system and to estimate accurately the probability of discrete degradation stages (health states). The methodology assumes that machine degradation consists of a series of degraded states (health states) which effectively represent the dynamic and stochastic process of machine failure. The discrete health state probabilities used to predict machine remnant life are estimated using classification algorithms.
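As a rough illustration of the idea described above, and not the implementation used in this thesis, the sketch below assumes that a classifier has already produced a probability for each discrete health state and that each state has an average remaining life taken from historical run-to-failure data. The names `expected_rul`, `state_probs` and `mean_remaining_life`, the probability-weighted sum, and all numerical values are assumptions made purely for illustration.

```python
from typing import Sequence


def expected_rul(state_probs: Sequence[float],
                 mean_remaining_life: Sequence[float]) -> float:
    """Probability-weighted remaining life over discrete health states.

    Illustrative assumption: the remnant life is estimated as
    sum_i P(state_i) * tau_i, where tau_i is the average remaining
    life observed for health state i in the training data.
    """
    if len(state_probs) != len(mean_remaining_life):
        raise ValueError("one remaining-life value is needed per health state")
    total = sum(state_probs)
    if total <= 0:
        raise ValueError("state probabilities must sum to a positive value")
    # Normalise so the weights sum to one even if the classifier output does not.
    return sum(p / total * tau
               for p, tau in zip(state_probs, mean_remaining_life))


# Hypothetical example: three health states (healthy, degraded, near failure).
probs = [0.1, 0.7, 0.2]         # classifier output at the current time
tau = [100.0, 40.0, 5.0]        # average remaining life per state, in hours
rul = expected_rul(probs, tau)  # roughly 0.1*100 + 0.7*40 + 0.2*5
```

Under these assumptions, the estimate moves smoothly from the remaining life of one state towards that of the next as the probability mass shifts between neighbouring states, which is one way a set of discrete states could still yield a continuous remnant life trend.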

To select an appropriate classifier for health state probability estimation in the proposed model, comparative intelligent diagnostic tests were conducted using five different classifiers applied to progressive fault data for three different faults in a high pressure liquefied natural gas (HP-LNG) pump. As a result of this comparison study, support vector machines (SVMs) were employed for health state probability estimation to predict machine failure in this research.

The proposed prognostic methodology has been successfully tested and validated using a number of case studies, from simulation tests to real industry applications. The results from two actual failure case studies using simulations and experiments indicate that accurate estimation of health states is achievable and that the proposed method provides accurate long-term prediction of machine remnant life. In addition, the results of experimental tests show that the proposed model is capable of providing early warning of abnormal machine operating conditions by identifying the transitional states of machine fault conditions. Finally, the proposed prognostic model is validated through two industrial case studies. The optimal number of health states, which minimises the model training error without a significant decrease in prediction accuracy, was also examined by testing several numbers of health states for bearing failure. The results were very encouraging and show that the proposed prognostic model based on health state probability estimation has the potential to be used as a generic and scalable asset health estimation tool for industrial machinery.

KEYWORDS

Diagnostics, Prognostics, Condition-Based Maintenance (CBM), Support Vector Machines (SVMs), Health State Probability Estimation

TABLE OF CONTENTS

ABSTRACT
KEYWORDS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
NOMENCLATURE
STATEMENT OF ORIGINALITY
ACKNOWLEDGEMENTS

CHAPTER 1 INTRODUCTION
  1.1 Problem Statement
  1.2 Objective of Research
  1.3 Scope of Research
  1.4 Originality and Contribution
  1.5 Organization of Thesis

CHAPTER 2 RESEARCH BACKGROUND AND LITERATURE REVIEW
  2.1 Historical Maintenance Strategies and Philosophies
  2.2 Key Aspects for Effective Implementing of CBM
  2.3 Existing Data Processing Techniques
    2.3.1 Time-domain techniques
    2.3.2 Frequency-domain techniques
    2.3.3 Time-frequency techniques
  2.4 Existing Methods for Fault Diagnostics
    2.4.1 Data-driven Approaches
      2.4.1.1 Statistical Approaches
      2.4.1.2 Artificial Intelligence (AI) Approaches
    2.4.2 Model-Based Approaches
    2.4.3 Comparison of data-driven and model-based approaches
  2.5 Current Prognostics Approaches
    2.5.1 Data-driven Approaches for Prognostics
      2.5.1.1 Time Series Analysis Approaches
      2.5.1.2 Artificial Intelligence (AI) Approaches
    2.5.2 Model-Based Approaches for Prognostics
    2.5.3 Reliability-Based Approaches for Prognostics
  2.6 Remaining Challenges of Prognostics for Real Industry Application

CHAPTER 3 MACHINE PROGNOSTICS BASED ON HEALTH STATE PROBABILITY ESTIMATION
  3.1 Closed Loop Architecture for Integrating Diagnostics and Prognostics System with Embedded Historical Knowledge
  3.2 Historical Knowledge
  3.3 Diagnostics
  3.4 Health State Estimation and RUL Prediction
    3.4.1 Health State Classification Using SVM Classifiers
      3.4.1.1 One-Against-All (OAA) Strategy for health state estimation
      3.4.1.2 One-Against-One (OAO) Strategy for health state estimation
      3.4.1.3 Direct Acyclic Graph (DAG) Strategy for health state estimation
    3.4.2 Health State Probability Estimation
    3.4.3 Prediction of Machine Remnant Life
  3.5 Summary

CHAPTER 4 COMPARATIVE STUDY ON FAULT DIAGNOSTICS USING MULTI-CLASSIFIERS
  4.1 HP-LNG Pumps
  4.2 Historical Failure Event and Data Analysis
    4.2.1 Bearing Fault
    4.2.2 Rotor Bar Fault
    4.2.3 Excessive rubbing of impeller wear-ring
  4.3 Feature Calculation and Selection
  4.4 Brief Description of Employed Multi-Classifiers
    4.4.1 Random Forests
    4.4.2 Radial Basis Function Neural Networks (RBF-NNs)
    4.4.3 Linear Regression
  4.5 Result of Fault Classification Performance
  4.6 Summary

CHAPTER 5 MODEL VALIDATION USING SIMULATED AND EXPERIMENTAL BEARING FAILURE DATA
  5.1 Model Validation Using Simulated Bearing Fault Data
    5.1.1 Simulation of Progressive Bearing Fault Data
    5.1.2 Feature Calculation and Selection
    5.1.3 Health State Estimation and Prediction of RUL
  5.2 Model Validation Using Experimental Bearing Failure Data
    5.2.1 Design and Setup of Experimental Test Rig for Accelerated Bearing Failure Test
    5.2.2 Accelerated Bearing Run to Failure Test
    5.2.3 Feature Calculation and Selection
    5.2.4 Health State Estimation and Prediction of RUL
  5.3 Model Comparison Using PHM
    5.3.1 Proportional Hazard Model (PHM)
    5.3.2 Prediction of Remnant Life Using PHM
  5.4 Summary

CHAPTER 6 MODEL VALIDATION THROUGH INDUSTRY CASE STUDY
  6.1 Prognostics of Impeller Rubbing Failure in HP-LNG Pump
    6.1.1 Data Acquisition of Excessive Impeller Rub in HP-LNG Pump
    6.1.2 Feature Calculation and Selection
    6.1.3 Health State Estimation
    6.1.4 RUL Prediction
  6.2 Prognostics of Bearing Failure in HP-LNG Pump
    6.2.1 HP-LNG Pump
    6.2.2 Data Acquisition of Bearing Failure
    6.2.3 Feature Calculation and Selection
    6.2.4 Selection of Number of Health States for Training
    6.2.5 RUL Prediction of Bearing Failure
    6.2.6 Verification of Optimum Number of Health States
  6.3 Summary

CHAPTER 7 CONCLUSION AND FUTURE WORK
  7.1 Conclusion
  7.2 Future Work

APPENDIX

REFERENCES

LIST OF TABLES

Table 1.1 The economic consequences of one-day stoppage in industry
Table 4.1 Pump and Vibration Measurement Specifications
Table 4.2 Bearing defect frequencies of HP-LNG pump
Table 4.3 Acquired vibration data and features for diagnostic test
Table 4.4 Statistical feature parameters and attributed label for diagnostics
Table 5.1 Simulated progressive bearing degradation data set
Table 5.2 Statistical feature parameters and attributed label from simulated data
Table 5.3 Test bearing specifications for experiment
Table 5.4 Experimental bearing failure data set
Table 5.5 Training data sets for health state probability estimation of experimental test
Table 5.6 Estimated parameters of PHM using experimental data 1
Table 5.7 Comparison of RUL prediction between PHM and proposed model (Closed Test using experimental data 1)
Table 5.8 Comparison of RUL prediction between PHM and proposed model (Open Test using experimental data 2)
Table 6.1 Acquired impeller rubbing data from the HP-LNG pump
Table 6.2 Training data sets for the health state probability estimation (P701D)
Table 6.3 Pump Specifications of different type of HP-LNG pump
Table 6.4 Acquired vibration data of bearing failure
Table 6.5 Statistical feature parameters and attributed label from bearing failure data
Table 6.6 Training data sets for the health state probability estimation (P301D)

LIST OF FIGURES

Figure 2.1 Taxonomy of maintenance philosophies
Figure 2.2 Condition-based maintenance process
Figure 2.3 Comparison of the data-driven approach and the model-based approach
Figure 2.4 Illustration of artificial neural networks architecture
Figure 3.1 Closed loop prognostic system
Figure 3.2 Flowchart of the integration of historical knowledge, diagnostic system and prognostics system based on health state probability estimation
Figure 3.3 Conventional feature-based diagnostics framework
Figure 3.4 Two health states in traditional similarity-based diagnostics and prognostics
Figure 3.5 Illustration of discrete health states in machine degradation
Figure 3.6 Illustration of health state probability distributions of simple linear degradation process
Figure 4.1 Re-gasification process in LNG receiving terminal
Figure 4.2 Pump schematic and vibration measuring points
Figure 4.3 Result of historical failure event and data analysis
Figure 4.4 Vibration spectrum plots of five different severities of bearing fault
Figure 4.5 Time wave form of beat vibration generated by two closely spaced frequencies between 1X and pole passing frequency
Figure 4.6 True zooming spectrum plot of broken rotor bar
Figure 4.7 Frequency spectrum of motor current signal with broken rotor bars
Figure 4.8 Vibration spectrum plots of five different severities of rotor bar fault
Figure 4.9 Excessive wear of impeller wear-ring and housing
Figure 4.10 Vibration spectrum plots of five different severities of impeller rubbing
Figure 4.11 Feature selection using distance evaluation criterion for diagnostics
Figure 4.12 RBF neural networks architecture
Figure 4.13 Comparison test results of five classifiers’ performance
Figure 5.1 Load distribution of a rolling element bearing
Figure 5.2 Simulated time domain signal with increasing defect impulse
Figure 5.3 Feature selection using distance evaluation criterion (Simulation test)
Figure 5.4 Trends of selected features for simulation test
Figure 5.5 Probability distribution of each health state (Closed Test Using Simulation Data 1)
Figure 5.6 Comparison of actual RUL and estimated RUL (Closed Test Using Simulation Data 1)
Figure 5.7 Probability distribution of each health state (Open Test Using Simulation Data 2)
Figure 5.8 Comparison of actual RUL and estimated RUL (Open Test Using Simulation Data 2)
Figure 5.9 Schematic of the bearing test rig
Figure 5.10 The test rig after assembly of all components
Figure 5.11 Close view of the middle bearing assembly
Figure 5.12 The picture of failed bearing after run-to-failure test
Figure 5.13 Feature selection using distance evaluation criterion (Experimental Test)
Figure 5.14 Trends of selected features for experimental test
Figure 5.15 Probability distribution of each health state (Closed Test Using Experimental Data 1)
Figure 5.16 Comparison of actual RUL and estimated RUL (Closed Test Using
Figure 5.17 Close view of the period of bearing fault condition (Closed Test Using
Figure 5.18 Probability distribution of each health state (Open Test Using
Figure 5.19 Comparison of actual RUL and estimated RUL (Open Test Using
Figure 5.20 Close view of the period of bearing fault condition (Open Test Using Experimental Data 2)
Figure 6.1 Feature selection using distance evaluation criterion for prognostics
Figure 6.2 Probability distribution of each health state (Closed Test, P701 D)
Figure 6.3 Probability distribution of each health state (Open Test, P701 B)
Figure 6.4 Comparison of actual RUL and estimated RUL (Closed Test, P701 D)
Figure 6.5 Comparison of actual RUL and estimated RUL (Open Test, P701 B)
Figure 6.6 Pump schematic and vibration measurement points of different type of HP-LNG pump
Figure 6.7 Spectrum plots of P301D pump bearing failure
Figure 6.8 Outer and inner race bearing failures
Figure 6.9 Distance evaluation criterion of features
Figure 6.10 Feature trends of selected features
Figure 6.11 Result of investigation to determine optimal number of health states
Figure 6.12 Probability distribution of each health state (Closed Test, P301 D)
Figure 6.13 Probability distribution of each health state (Open Test, P301 C)
Figure 6.14 Comparison of actual RUL and estimated RUL (Closed Test, P301 D)
Figure 6.15 Comparison of actual RUL and estimated RUL (Open Test, P301 C)
Figure 6.16 Training and prediction values of several health states (P301 C)
Figure A.1. Binary classification using SVMs

NOMENCLATURE

Abbreviations

AE Acoustic Emission

AI Artificial Intelligence

ANNs Artificial Neural Networks

ARIMA Autoregressive Integrated Moving Average

ARMA Autoregressive Moving Average

BPFI Ball Pass Frequency of Inner race

BPFO Ball Pass Frequency of Outer race

BPNN Back Propagation Neural Network

BSF Ball Spin Frequency

CART Classification and Regression Trees

CCNN Cascade Correlation Neural Network

CM Condition Monitoring

CWT Continuous Wavelet Transform

DAG Directed Acyclic Graph

DWNN Dynamic Wavelet Neural Network

DWT Discrete Wavelet Transform

EAs Evolutionary Algorithms

ESs Expert Systems

FC Frequency Centre

FFNN Feed-Forward Neural Network

FFT Fast Fourier Transform

FTF Fundamental Train Frequency

GA Genetic Algorithm

HFRT High Frequency Resonance Technique

HMM Hidden Markov Model

IMS Inductive Monitoring System

ISO International Organization for Standardization

KF Kalman Filter

LNG Liquefied Natural Gas

LOO Leave-One-Out

MCSA Motor Current Signature Analysis

MSF Mean Square Frequency

NF Neuro-Fuzzy

OAA One-Against-All

OAO One-Against-One

OOB Out of Bag

PDF Probability Density Function

PHM Proportional Hazards Model

QP Quadratic Programming

RBF Radial Basis Function

RLE Residual Life Estimate

RMS Root Mean Square

RMSF Root Mean Square Frequency

RUL Remaining Useful Life

RVF Root Variance Frequency

SMO Sequential Minimal Optimization

SOM Self-Organizing Map

SPC Statistical Process Control

STFT Short-Time Fourier Transform

SVD Singular Value Decomposition

SVMs Support Vector Machines

TSM Tensor Space Model

VF Variance Frequency

VSM Vector Space Model

WNN Wavelet Neural Network

WPT Wavelet Packet Transform

Greek Letters

α Distance evaluation criteria

α Mean value of α

Regression coefficient vector

Linear regression coefficients

Shape parameter

Scale parameter

Hazard rate

Expected life

′ Real remaining life

Slack variable

Average remaining life of training state

Kernel function

Set of surviving times

Set of failure times

Roman Abbreviations

Ball diameter

Weight factor

Penalty parameter

, Average distance of all the features in state ′

, Average distance of all the features in different states

Entropy estimation

Entropy estimation standard error

Line frequency

Pole passing frequency

Slip frequency

Histogram upper bound

Histogram lower bound

Sum of amplitude of sideband

Amplitude of the fundamental component of stator current

Number of classes (states)

Number of observations

Number of feature and sample

Pitch diameter and number of pole

, Eigen value

Shaft rotating speed

Each health state

Severity rotor fault

Smoothed health state

Time

Width of smooth window

Weighting factor

Weight associated with neuron

Predictor variables in linear regression

Absolute value

Observations at time

Peak value

, Root mean square value

Target variable in linear regression

A series of impulses at the bearing fault frequency

Exponential decay

Noise added to corrupt the signal

Health states of number

Bearing radial load distribution

Bearing-induced resonant frequency

Health state at time

Covariate

Subscripts

: data index

: class (state) index

STATEMENT OF ORIGINALITY

The work contained in this thesis has not been previously submitted to meet requirements for an award at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

ACKNOWLEDGEMENTS

I would like to express my gratitude to Prof. Andy Tan for his supervision, advice and guidance from the very early stages of this research, as well as for giving me extraordinary experiences throughout the work. I am also heartily thankful to Prof. Joseph Mathew for his encouragement, guidance and support, which enabled me to complete this research work.

I also gratefully acknowledge Prof. Bo-Suk Yang and Prof. Byeong-Keun Choi for their constructive feedback and advice throughout my study. I would also like to thank Dr Eric Kim for providing me with valuable advice and support in my work. Collective and individual acknowledgments are also owed to my colleagues at KOGAS-Tech in Korea for their generous support and for providing valuable data to validate my research.

I am particularly grateful to my parents, mother-in-law, sister, brother, brothers-in-law, my wife and daughter for their unconditional support and sacrifice. This important milestone in my life would not have been achieved without their unwavering love and support. I would like to express very special appreciation to my beloved wife for her support, love and confidence in me.

Finally, I offer my regards and blessings to my fellows and friends who supported me in various ways during the compilation of the thesis.

CHAPTER 1 INTRODUCTION

Productivity is the prime objective for manufacturing companies seeking to stay competitive in a continuously growing global market. Increased productivity can be achieved through increased availability of production capability. Technological development has resulted in increased complexity in both industrial machinery and production systems. There is an increasing demand in the community for improved economy and reliability, reduced environmental risk and improved human safety [1]. Therefore, the importance of the maintenance function has increased because of its role in maintaining and improving system availability and safety, as well as product quality. The economic consequences of an unexpected stoppage in industry may range from US$70,000 to US$420,000 per day (see Table 1.1).

Table 1.1 The economic consequences of one-day stoppage in industry [1]

Industry                           Economic consequence of one-day stoppage
Nuclear Power Station              US$ 420,000
Pulp and Paper Plant               US$ 280,000
Steel Works, Continuous casting    US$ 210,000
Chemical Factory                   US$ 140,000
Coal Power Station                 US$ 140,000
Mine                               US$ 140,000
Oil Refinery                       US$ 70,000

The costs listed above emphasize the increasing importance of condition monitoring, diagnostics and prognostics of machinery in industry. Therefore, there is a pressing need to continuously develop and improve intelligent maintenance systems in order to identify service needs, optimize maintenance actions and avoid unexpected production stoppages [2].

An important objective of condition-based maintenance (CBM) is to determine the optimal time for replacement or overhaul of a machine. The ability to accurately predict the remaining useful life (RUL) of a machine is critical for its operation and can also be used to extend production capability and enhance system reliability. In CBM, maintenance is usually performed based on an assessment or prediction of machine health instead of service time, which leads to increased machine usage, reduced downtime and enhanced operational safety. An effective prognostics program will provide sufficient lead time for maintenance engineers to schedule a repair and acquire replacement components before catastrophic failures occur. Recent advances in computing and information technology have accelerated the production capability of modern machines, and reasonable progress has been achieved in machine fault diagnostics, but not in prognostics.

Prognostics can be defined as the ability to predict accurately and precisely the remaining lifetime of a failing machine component or subsystem. A reliable predictor is useful to industry for forecasting the upcoming states of a dynamic system or predicting the damage propagation trend in machines. The forecast information can therefore be used to raise an accurate alarm before a fault reaches critical levels, so as to prevent machinery performance degradation, malfunction or catastrophic failure. It can also be used for scheduling repairs and predictive/preventive maintenance, and for predictive fault-tolerant control of engineering assets.

Although a large variety of prognostic models have been proposed and reported in the technical literature, an effective prognostic methodology for industrial application has yet to be developed. Prognostics is considerably more difficult to formulate than diagnostics, since its accuracy is subject to stochastic processes that are yet to occur. In general, many diagnostic engineers have advanced knowledge and experience of machine failure events and health states, gained by continuously monitoring and analysing machine condition in industry, but there is still no clear systematic methodology for predicting machine remnant life accurately to support asset management decision making. The task still relies on human expert knowledge and experience. Therefore, there is an urgent need to continuously develop and improve effective asset health management systems which can be implemented in maintenance systems for real industrial applications.

A variety of prognostic methodologies have been reported in the technical literature. However, their application in industry is still relatively new and mainly focused on predicting the degradation of a specific component. The current models do not use sufficient features to interpret the machine degradation process. Consequently, major challenges for accurate long-term prediction of machine remnant life remain to be addressed.

This research is aimed at establishing a new practical machine health estimation

method to address the above-mentioned research challenges. The following sections will

define the research problem, boundaries limiting the scope of the investigation, and

the contribution of this research. Finally, a brief overview of this thesis is presented.



1.1 Problem Statement

Currently, a number of valuable prognostic models and methods have been

proposed in machine prognostics. However, an efficient prognostic methodology

with accurate long-term prediction for application in industry has yet to be developed.

Current literature shows that none of the existing prognostic models considers

the different health states of a machine, which can effectively represent its

failure degradation process.

Although condition monitoring and diagnosis technologies have advanced

recently, prognostics still do not provide systematic methodology for application in

industry because the existing models consider only specific equipment or component

degradation and not the whole machine system. Hence, research for accurate long-

term prediction methodologies needs to be explored to overcome the current

limitations of existing prognostic models.

To represent the complex nature of machine degradation effectively, an accurate

prognostics model requires a number of damage sensitive features. Existing time

series and regression model approaches are still poorly suited to using enough

features to represent the complex nature of the degradation process in a

real environment. These models can use only one or a limited number of features to

represent the failure process for the prediction of machine remnant life.

To establish an effective health management system, performance assessment,

degradation model, failure analysis, health prediction, feature extraction and

knowledge base of faults are required. For accurate prognostics, it is essential to

conduct prior analysis of the system’s degradation process, failure patterns and event

history of the machine, as well as machine condition data.

The problem statement above confirms that there is a need for more research in

developing accurate prognostic technologies which can capture the nature of

machine degradation effectively and which include an accurate long-term prediction

capability.



1.2 Objective of Research

The objective of this research is to develop a robust prognostic model aimed at

determining the remaining useful life of failing components based on discrete

machine health state probability estimation. The methodology of machine

prognostics assumes that machine degradation consists of a series of degraded states

(health states) which is necessary when machine failure is nonlinear or in the

presence of dynamic and stochastic processes. The present work achieves this goal

by simultaneously accomplishing three specific objectives.

The first specific objective is to establish an integrated prognostics system which

includes effective multi-feature extraction and fault diagnostics, aligned with

historical (empirical) knowledge for accurate long-term prediction of the machine

remnant life. This architecture includes condition monitoring, feature extraction,

classification of impending faults, health state probability estimation and prognostics,

and is performed by linking them to case-based historical knowledge. Furthermore,

this scheme provides an accumulated historical knowledge for system updating and

further prognostic applications by providing reliable posterior degradation features.

The second specific objective of this research is the development of new

estimation methods for modelling discrete machine degradation stages using

classification algorithms for better understanding and interpretation of dynamic and

stochastic failure process. This implementation provides the severities of impending

faults and estimates the probability of current machine health state for RUL

prediction. By employing existing classification algorithms, a number of damage

sensitive features can be used to estimate current machine health state in the feature

space. The outcome of health state estimation provides an accurate real time failure

index for the prediction of machine remnant life.
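
As a rough illustration of how discrete health state probabilities could feed a remnant life estimate, the sketch below forms a probability-weighted average of a representative remaining life for each health state. The weighting rule, the function name `expected_rul` and all numbers are assumptions made for illustration only; the prediction method used in this research is the one developed in Chapter 3.

```python
def expected_rul(state_probs, state_rul):
    """Probability-weighted remaining useful life (illustrative assumption).

    state_probs: estimated probability of each discrete health state
    state_rul:   representative remaining life for each state (e.g. hours)
    """
    if len(state_probs) != len(state_rul):
        raise ValueError("one remaining-life value is needed per health state")
    total = sum(state_probs)
    if total <= 0:
        raise ValueError("state probabilities must sum to a positive value")
    # Normalise so the weights sum to one, then take the weighted average.
    return sum(p / total * r for p, r in zip(state_probs, state_rul))


# Hypothetical example: three health states (early, middle, late degradation).
probs = [0.1, 0.7, 0.2]        # classifier output for the current sample
remaining = [100.0, 60.0, 20.0]  # assumed remaining life per state, in hours
print(expected_rul(probs, remaining))
```

In this hypothetical case the estimate is dominated by the middle state, because that state carries most of the probability mass.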

The third, and last, objective for this thesis is to establish a practical diagnostic

and prognostic model whereby the information acquired through on-line condition

monitoring is transformed into a set of features that characterize the machine health

condition for fault diagnosis and prognosis. To demonstrate the scalability of the proposed model,



diverse machine fault characteristics from different failure data will be used for

model validation in a real environment.

1.3 Scope of Research

The proposed prognostics model in this research is mainly applied to rotating

machine components and rolling element bearings in particular. This is because

rotating machinery is common and critical equipment in many industries, and

bearing failure is often the cause of machine breakdowns. This work also looks at

other component failures in rotating machinery, such as rotor bar failure and impeller

rubbing failure for model scalability.

For effective implementation of CBM, techniques such as signal processing,

feature extraction, fault diagnostics and prognostics have been extensively studied in

this research for the development of integrated diagnostic and prognostic models. In

the real environment, machine failures do not follow a monotonic process; they are

normally associated with multiple phenomena generated from other component or

system failures, depending on machine systems. Consequently, accurate RUL

prediction capability requires advanced sensors, damage sensitive features, incipient

fault detection and isolation techniques for adequate prognostic state awareness.

Therefore, an integrated prognostics system should include effective feature

extraction and fault diagnostics, including historical (empirical) knowledge for

accurate long-term prediction of the machine remnant life.

In the proposed model, fault diagnostics (isolation) and health state probability

estimation are performed based on the abilities of classification algorithms. A

number of classifiers and pattern recognition techniques are explored to determine

appropriate classifiers, such as Neural Networks (NNs), Support Vector Machines

(SVMs), Classification and Regression Trees (CART), and others. To deal with high-

dimensional data, effective feature selection techniques are employed in this research

for the best possible prediction of RUL. Historical (empirical) knowledge will also

be used to provide qualitative understanding of the discrete machine degradation

stages and training data sets for the estimation of discrete health state probability.



To validate the proposed prognostic model in a timely manner, bearing failure

data will be simulated and experimental tests will be conducted using the bearing-

run-to-failure test rig to facilitate accelerated bearing life. From the bearing failure

data, a number of features will be calculated, trained and tested for the validation of

the proposed model.

Since the primary research goal is the development of a practical prognostic

model, real life condition monitoring data and maintenance events of actual pumps in

industry will be analysed extensively and then employed for the model validation in

a real environment.

1.4 Originality and Contribution

This thesis presents a novel approach that can be used in asset health management

system for fault diagnostics and prognostics of machine failure. The principal

significance and contribution of the work include:

Integration of fault diagnostics and prognostics for accurate prediction of

machine remnant life

For accurate prediction of RUL, the proposed prognostic model has a closed loop

architecture consisting of an integrated diagnostics and prognostics system based on

health state probability estimation, with embedded historical knowledge for accurate

long-term prediction of the machine remnant life. Through the integrated system

with fault diagnostics, a more precise failure pattern from a number of empirical

degradation data stored as historical knowledge can be employed in the prognostics

model. The accumulated historical knowledge can then be used for system updating

and for improving the prognostics model by providing reliable posterior degradation

characteristics for diverse failure modes and fault types. Furthermore, this scheme

provides the guideline for the integration of the machine diagnosis and prognosis

architectures which is aimed at determining the remaining useful life of failing

components.



Methodology to estimate the probability of machine health state in real time

A novel methodology for machine health state estimation by applying discrete

degradation process of machine failure is presented in this research. None of the

current prognostic models have considered using discrete health state probability,

which can effectively represent the dynamic and stochastic degradation of machine

failure. Compared with other existing prognostic approaches, the proposed model

not only provides accurate long-range prediction of machine remnant life, but also

enables a sufficient usage of a range of condition indicators to effectively represent

the complex nature of machine degradation by using the ability of classification

algorithms in health state probability estimation. Furthermore, this full utilization of

a range of features can lead to a generic and scalable prognostic model for practical

application in industry.

Comparative study of machine fault diagnostics using progressive fault data

A comparative study of five different classifiers was performed using progressive

fault data from three machine fault cases. Although many intelligent fault diagnostic

models have been validated using a number of fault data, none of them consider

different severity levels in fault propagation to estimate the fault diagnostic

performance. The result of a comparison test shows that the fault classification

accuracy is variable and depends on the severity of machine fault and on the type of

classifier. Through this comparative study, an appropriate classification algorithm is

employed in health state probability estimation in this research.

Model validation through four case studies using simulated, experimental and

real industry data

A number of case studies, from simulation tests through to industry applications,

were conducted to validate the feasibility of the proposed model. The scalability of

the proposed model was validated by using different types of fault in real case

studies. The optimum number of health states for a machine failure is also

investigated to minimise the training error of health state estimation without

significant decrease in the prediction accuracy.



Model comparison using Proportional Hazards Model

Through the model comparison study using PHM, it is verified that the proposed

prognostic model based on health state probability estimation can provide a more

accurate prediction capability than the commonly used PHM in the case of dynamic

and stochastic process of machine degradation.

Publications of research outcome

Several publications have been generated as part of the research work hereby

discussed. The results of the simulation, experimental and

industry case studies have been disclosed to the public in the following publications:

1. Hack-Eun Kim, Andy C. C. Tan, Joseph Mathew, Eric Y. H. Kim and Byeong-Keun Choi, 2010 “Machine Prognostics based on health state estimation using SVM”, Journal of Engineering Asset Management (Accepted 15 June, 2010).

2. Hack-Eun Kim, Andy C. C. Tan and Joseph Mathew, 2010 “New machine prognostics approach based on health state probability estimation” in Proceedings of 6th Australasian Congress on Applied Mechanics, ACAM 6, Perth, Australia.

3. Hack-Eun Kim, Andy C. C. Tan and Joseph Mathew, 2010 “Integrated approach for HP-LNG pump diagnostics and prognostics based on health state probability estimation” in Proceedings of the 5th World Congress on Engineering Asset Management (WCEAM-ICF/IQ-AGIC), Brisbane, Australia.

4. Hack-Eun Kim, Andy C. C. Tan, Joseph Mathew, Eric Y. H. Kim and Byeong-Keun Choi, 2009 “Prognosis of bearing failure based on health state estimation” in Proceedings of the 4th World Congress on Engineering Asset Management, Athens, Greece.

5. Hack-Eun Kim, Andy C. C. Tan, Joseph Mathew, Eric Y. H. Kim and Byeong-Keun Choi, 2009 “Integrated Diagnosis and Prognosis Model for High Pressure LNG Pump” in Proceedings of 13th Asia-Pacific Vibration Conference, Christchurch, New Zealand.



6. Yifan Zhou, Lin Ma, Rodney C. Wolff and Hack-Eun Kim, 2009 “Asset life prediction using multiple degradation indicators and lifetime data: a Gamma-based state space model approach” in Proceedings of the 8th International Conference on Reliability, Maintainability and Safety, Chengdu, China.

7. H. E. Kim, A. C. C. Tan, J. Mathew, E. Y. H. Kim and B. K. Choi, 2008 “Machine Prognostics Based on Health State Estimation Using SVM”, in Proceedings of the World Congress on Engineering Asset Management, Beijing, China.

8. D. S. Gu, S. W. Cho, J. H. Lee, H. E. Kim and B. K. Choi, “Redesign of Cryogenic Pump in Liquefied Natural Gas Storage Tank Considering Thermal Effect” Journal of Computational and Theoretical Nanoscience, vol. 5, pp. 1534-1538, 2008.

9. H. E. Kim, B. G. Choi, H. J. Kim, H.E Jeong, D. S. Gu, 2007 “Vibration diagnosis case of primary LNG pumps”, in Proceedings of the World Congress on Engineering Asset Management, Harrogate, UK.

10. D. S. Gu, J. H. Lee, H. E. Kim and B. K. Choi, 2007 “Abnormal Vibration Diagnosis caused by Design Failure of Cryogenic Low-Pressure LNG Pump” in Proceedings of Korean Society for Noise and Vibration Engineering Autumn Annual Meeting.

11. Hack-Eun Kim, Andy C.C. Tan, Joseph Mathew and Byeong-Keun Choi, 2010 “Bearing fault prognosis based on health state probability estimation”, Journal of Expert Systems with Applications (Under review).

12. Hack-Eun Kim, Andy C.C. Tan, Joseph Mathew and Bo-Suk Yang, 2010, “Integrated approach for diagnosis and prognosis of HP-LNG pump based on health state probability estimation”, Journal of Sound and Vibration (In preparation).

1.5 Organisation of the Thesis

This thesis is composed of seven chapters. The subtopics contained in each chapter

are described as follows:



Chapter 1 introduces a brief overview and the scope of the research area. This

chapter also presents the objective, significance and innovation of this research. It

shows how the research objective has grown out of the unresolved problem identified

in current research. The originality and principal contribution of this work are also

presented.

Chapter 2 presents a comprehensive literature review on current condition

monitoring techniques, diagnostics and prognostics approaches. First, the

background information of machine maintenance strategy is reviewed to show how it

has evolved to the present state. Then, an overview of key techniques for the

effective implementation of a CBM strategy is given in Section 2.2. Section 2.3

describes the existing signal processing techniques as a fundamental step prior to

fault diagnostics and prognostics. Current research on machine fault diagnostics and

prognostics is reviewed in Sections 2.4 and 2.5, respectively. Finally, unresolved

current research issues and remaining challenges for machine diagnostics and

prognostics in real industrial applications are summarised in Section 2.6. The

following four chapters present the research contribution to fulfil the remaining

challenges derived from the research review.

Chapter 3 describes the development of the prognostic model proposed by the

candidate to address the unresolved issues identified in chapter 2. Section 3.1

introduces the proposed prognostic system which is integrated with diagnostics and

based on health state probability estimation. Three key elements in the proposed

system, historical knowledge, diagnostics, health state estimation and prognostics are

detailed in Sections 3.2, 3.3, 3.4 and 3.5 respectively. The methodology of the health

state probability estimation and remnant life prediction using SVM classifiers is

presented in this chapter.

Chapter 4 presents a comparative study on intelligent fault diagnostics using five

different classifiers to investigate appropriate classifiers to be employed in the

proposed prognostic model. Section 4.1 describes the High-Pressure Liquefied

Natural Gas (HP-LNG) Pump as an object of this diagnostics test. The historical

maintenance event and failure data analysis are presented in Section 4.2. The feature

selection method and comparison test results are presented in the remaining sections

which include a brief description of the five classifiers employed.



In Chapter 5, the proposed model is validated using simulation data of progressive

bearing failure and experimental bearing run-to-failure data. Section 5.1 describes

the bearing fault simulation methodology and the model validation using simulated

bearing failure data. Section 5.2 describes the designed experimental test rig for

accelerated bearing failure test and how these experimental data are used for

validating the prognostic model including prediction results. The model comparison

with the Proportional Hazards Model (PHM) using identical experimental data is

presented in Section 5.3.

Chapter 6 presents the validation of the proposed model through two industry case

studies. To verify the applicability of the proposed model in a real environment,

these model validations are conducted using two different failure data from HP-LNG

pumps. Section 6.1 presents the prognostics of impeller rubbing failure. In this case

study, two sets of impeller-rub data are analysed and employed to predict the

remnant life of the pump based on estimation of health state probability using the

SVM classifier. In Section 6.2, the second case study is conducted using two data

sets of bearing failure. The optimal number of health states of bearing failure is also

investigated through comparison tests of a range of health states.

The last part of the thesis, in Chapter 7, presents conclusions and future work to

improve the proposed model for real application in industry.


CHAPTER 2 RESEARCH BACKGROUND AND LITERATURE REVIEW

This Chapter presents the research background and current technologies in

machine diagnostics and prognostics used in condition-based maintenance (CBM)

and is divided into six sections. Section 2.1 covers the historical aspects and

evolution of maintenance strategies. An overview of key techniques for the

effective implementation of a CBM strategy is given in Section 2.2. Section 2.3

describes existing signal processing techniques as a fundamental step prior to fault

diagnostics and prognostics. Sections 2.4 and 2.5 review current research on machine

fault diagnostics and prognostics, respectively, which are the focus of this thesis.

Section 2.6 summarises current challenges in machine prognostics for

real industrial application.

2.1 Historical Maintenance Strategies and Philosophies

Machinery is a critical asset for business success in the fiercely competitive global

economy. Recent advancement in technology has resulted in improvements to

machinery so that output, productivity and efficiency have increased rapidly.

Maintenance is a combination of all technical, administrative and managerial actions

during the life cycle of an item intended to retain it in, or restore it to, a state in

which it can perform the required function [3]. Previously, maintenance has been

considered as an expense account with performance measures developed to track

direct costs or surrogates such as the headcount of tradesmen and the total duration

of forced outages during a specified period. However, this perception has

changed. Nowadays, maintenance is acknowledged as a major contributor to the

performance and profitability of business organizations [4]. Today, therefore,

maintenance is confronted with a wide range of challenges that include quality

improvement, reduced lead times, set up time and cost reductions, capacity

expansion, managing complex technology and innovation, improving the reliability

of systems, and related environmental issues [5]. A good maintenance policy not

only prevents system failures, but leads to maximum capacity utilization, improved

product quality, customer satisfaction and adequate equipment life span, among other

benefits.

Maintenance philosophies can be broadly classified as reactive and proactive.

Figure 2.1 shows the taxonomy of maintenance philosophies. The earliest and

conventional maintenance strategies consist of break-down (or corrective) and

preventive maintenance. In break-down maintenance, a machine is fixed when it fails

[6]. The advantage of this strategy is that no analysis or planning is required.

However, one problem with this strategy is that

unexpected downtime can occur at inconvenient times and prevent

committed production schedules from being met.

Figure 2.1 Taxonomy of maintenance philosophies [6]



Proactive or planned maintenance can be further classified as preventive and

predictive maintenance. As the name suggests it does not wait for the equipment to

fail before commencing the maintenance operations. In preventive maintenance,

components are replaced based on a conservative schedule to “prevent” commonly

occurring failures. Although preventive maintenance programs increase system

availability, they can be expensive because of frequent replacement of costly parts

before the end of their life. Another disadvantage of preventive maintenance is that it

is time-based and is not related to the age of the machine. Moreover, this strategy is

neither incorporated into the design of the system, nor is the impact of maintenance

on system and business performance duly recognised.

Since the 1970s, a more integrated approach to maintenance evolved in both the

government and private sectors. Maintenance cost was considered a significant

component through the life cycle costing approach in new costly defence acquisitions.

The close connection between “reliability” and “maintainability” was recognised in

so-called reliability centred maintenance (RCM). RCM was developed for the

aircraft industry sector. For aircraft and other safety-related applications, cost-

effectiveness is balanced with safety and availability, with the goal of minimizing

cost and downtime by eliminating the chance of a failure [6]. In RCM strategy,

maintenance is carried out at the component level and the maintenance effort for a

component is a function of the reliability of the component and the consequence of

its failure under normal operation. This approach uses failure mode effects analysis

(FMEA) and utilizes reliability estimates of the system to formulate a cost-effective

schedule for maintenance [7]. RCM views maintenance in the broader business

context and takes into account the link between component failures and their impact

on the business performance. However, this approach only assumes a normal

operating condition and the optimal maintenance strategies do not consider the load

on the equipment and its effect on the degradation process in real life.

To minimize both maintenance and repair costs, and to base maintenance on the

probability of failure, requires ongoing assessment of machine health and prediction of

failures based on current health, operation and maintenance history. This is known as

predictive maintenance. Predictive maintenance directly monitors the

operating condition, efficiency and other indicators of critical components in the

machine to determine the mean-time-to-failure or loss of efficiency.

Condition-based maintenance (CBM) is a method used to reduce the uncertainty

of maintenance activities and is carried out according to the need indicated by the

equipment condition [8]. CBM assumes that existing indicative prognostic

parameters can be detected and used to quantify possible failure of equipment before

it actually occurs. Prognostic parameters provide the indication of potential problems

and incipient faults which would cause the equipment or component to deviate from

the acceptable performance level. The conditions of a system are quantified by

parameters that are continuously monitored. Some of the advantages of CBM include

prior warning of impending failure and increased precision in failure prediction. It

also aids in diagnostic procedures as it is relatively easy to associate the failure to

specific components through the monitored parameters. To develop solutions for

CBM effectively and efficiently will require a wide-ranging effort to coordinate all

levels of management, from engineers and project officers to program managers and top

corporate management.

2.2 Key Aspects for Effective Implementation of CBM

A complete CBM system is composed of a number of functional capabilities:

sensing and data acquisition, data manipulation, condition monitoring, health

assessment/diagnostics, prognostics and decision reasoning. In addition, some form

of human system interface is required to provide user access to the system and

provide a means of displaying vital information. Currently, in order to develop and

encourage the adoption of open information standards for operations and

maintenance in industry, the Machinery Information Management Open Standards

Alliance (MIMOSA) provides the standardized architecture for a CBM system called

Open Systems Architecture for Condition-Based Maintenance (OSA-CBM) [9]. The

OSA-CBM system must be broken down into generalized components or functions.

This architecture has been described in terms of functional layers: from sensing and

data acquisition to decision support. The general functions of the layers are specified

below:



Layer 1 – Data Acquisition: The data acquisition module has been generalized to

represent the software module that provides system access to digitized sensors or

transducer data. The data acquisition module is basically a server of calibrated

digitized sensor data records.

Layer 2 – Data Manipulation: The data manipulation module may perform single

and/or multi-channel signal transformations along with specialized CBM feature

extraction algorithms.

Layer 3 – Condition Monitor: The primary function of the condition monitor is to

compare features against expected values or operational limits and output

enumerated condition indicators (e.g. level low, level normal, level high, etc.).

Layer 4 – Health Assessment: The primary function of the health assessment

layer is to determine if the health of a monitored system, subsystem or piece of

equipment is degraded. The health assessment module should take into account

trends in the health history, operational status and loading, and the maintenance

history.

Layer 5 – Prognostics: The primary function of the prognostics layer is to project

the current health state of equipment into the future or estimate the remaining useful

life (RUL) taking into account estimates of future usage profiles.

Layer 6 – Decision Support: The primary function of the decision support layer is

to provide recommendations related to maintenance action schedules and

modification of the equipment configuration or mission profiles in order to

accomplish mission objectives. The decision support module needs to take into

account operational history (including usage and maintenance), current and future

mission profiles, high-level unit objectives and resource constraints.

Layer 7 – Human Interface (Presentation Layer): Typically, high-level status

(health assessments, prognostic assessments or decision support recommendations)

and alerts would be displayed at this layer, with the ability to drill down to multiple

layers of access depending on the information needs of the user.
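
The Layer 3 behaviour described above, comparing features against operational limits and emitting enumerated condition indicators, can be sketched as follows. This is a minimal illustration only: the names `Condition`, `condition_indicator` and `monitor`, and the limit values, are hypothetical and are not taken from the OSA-CBM specification.

```python
from enum import Enum


class Condition(Enum):
    """Enumerated condition indicators, following the examples in Layer 3."""
    LOW = "level low"
    NORMAL = "level normal"
    HIGH = "level high"


def condition_indicator(value, low_limit, high_limit):
    """Compare one feature value against its operational limits."""
    if value < low_limit:
        return Condition.LOW
    if value > high_limit:
        return Condition.HIGH
    return Condition.NORMAL


def monitor(features, limits):
    """Map each monitored feature to an enumerated condition indicator.

    features: dict of feature name -> current value (output of Layer 2)
    limits:   dict of feature name -> (low_limit, high_limit), assumed known
    """
    return {name: condition_indicator(features[name], *limits[name])
            for name in limits}


# Hypothetical feature values and limits, for illustration only.
features = {"rms": 4.2, "kurtosis": 2.9}
limits = {"rms": (0.5, 3.0), "kurtosis": (2.0, 4.0)}
for name, state in monitor(features, limits).items():
    print(name, state.value)
```

Keeping the indicators enumerated, rather than passing raw values upward, lets the health assessment layer reason about a small, fixed set of conditions per feature.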



The above seven layers can also be simplified into three key steps in a CBM

program as depicted in Figure 2.2.

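Figure 2.2 Condition Based Maintenance Process [10]. The figure shows three sequential steps: Data Acquisition (machine health information is collected and stored), Data Processing (information obtained is handled and analyzed) and Decision-Making (appropriate maintenance actions are recommended).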

Data acquisition is a fundamental step for machinery condition monitoring,

diagnostics and prognostics. In this step, useful condition indicators (data) are

collected and stored from targeted physical assets in a CBM program. In the second

step, the obtained information is handled and analyzed for better understanding and

for interpretation of the data, including the validation of sensor signals and feature

extraction. Finally, this program recommends maintenance actions based on outputs

of fault diagnostics and prognostics. The following definitions are used in this thesis:

“diagnostics” are the processes of detection and isolation of faults or failures, and

“prognostics” are the processes of predicting a future state based on current and

historical conditions, or estimating the remaining useful life (RUL) of components or

systems. In the following sections, existing data processing, fault diagnostics and

prognostics techniques are described and reviewed as they are key elements for an

effective CBM program.

2.3 Existing Data Processing Techniques

In order to collect useful data from targeted physical assets, diverse condition

monitoring techniques are used in real environments. Condition monitoring data can

be vibration, acoustic, oil, temperature, pressure, moisture or environment data.

Many different types of sensors, combined with signal processing technologies, have

been invented and presented in research papers, but only a few have found their way

to industrial application [11]. Maintenance data systems, such as computerised

maintenance management systems (CMMS), have been designed for data storage and

analysis. O'Donoghue and Prendergast [12] concluded that CMMS have benefited a

textile manufacturing company by reducing the cost of spares, improving uptime,

increasing equipment availability, reducing lead times, increasing morale, reducing

unscheduled maintenance and streamlining work order schedules. Godot et al. [13]

also reported that the use of CMMS leads to an improved system of maintenance.

Raw data acquired from sensors are pre-processed before being used for further

analysis. Special attention is given to waveform type data as they require more

processing strategies and a variety of techniques have been developed for their

analysis and interpretation. Errors caused by background noise, human factors and

sensor faults need to be eliminated and appropriate features need to be calculated,

selected and/or extracted for further diagnosis and prognosis. Tan and Mathew [14]

described the application of adaptive noise cancellation (ANC) and blind

deconvolution (BD) techniques to detect a bearing fault when the signals are

contaminated by noise. Xu and Kwan [15] demonstrated that sensor fault isolation is

the solution for data errors caused by sensor faults. After data “cleaning”, signal

processing techniques are applied to analyse and interpret the waveform data and to

extract useful information for further diagnostic and prognostic purposes.

Generally, waveform data analysis techniques fall into three categories: time-domain

techniques, frequency-domain techniques, and time-frequency techniques.

2.3.1 Time-domain techniques

Time-domain techniques are based on statistically distinctive behaviours of

the time waveform signals. The simplest time-domain analysis calculates the

signals’ overall root-mean-square (RMS) level and crest factor. Other commonly

used characteristic features are peak, peak-to-peak amplitude, standard deviation,

skewness, kurtosis and time synchronous average.

The features described here are called statistical features because they are

based only on the distribution of signal samples with the time series treated as a

random variable. These features are also known as moments or cumulants. In

most cases, the probability density function (pdf) can be decomposed into its

constituent moments. A change in condition causes a change in the probability

density function of the signal. Hence, the moments may also change. Therefore,

monitoring this phenomenon can provide useful diagnostic information.

The moment coefficients of time waveform data can be calculated using the

following equations,

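$$ m_n = E\{x^n\} = \frac{1}{N}\sum_{i=1}^{N} x_i^n \tag{2.1} $$

where $E\{\cdot\}$ represents the expected value of the function, $x_i$ is the $i$th time

historical data and $N$ is the number of data points.

The first four cumulants: mean, standard deviation, skewness and kurtosis, can

be calculated from the first four moments using the following relationships

$$ \text{Mean} = c_1 = m_1 \tag{2.2} $$

$$ \text{Standard Deviation} = c_2 = m_2 - m_1^2 \tag{2.3} $$

$$ \text{Skewness} = c_3 = m_3 - 3 m_2 m_1 + 2 m_1^3 \tag{2.4} $$

$$ \text{Kurtosis} = c_4 = m_4 - 3 m_2^2 - 4 m_3 m_1 + 12 m_2 m_1^2 - 6 m_1^4 \tag{2.5} $$

Strictly, the second cumulant $c_2$ is the variance; the standard deviation is its

square root.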

In addition, non-dimensional feature parameters in the time domain, such as

shape factor and crest factor are popularly used.

$$ \text{Shape Factor} = x_{rms} / x_{abs} \tag{2.6} $$

$$ \text{Crest Factor} = x_{p} / x_{rms} \tag{2.7} $$

where $x_{rms}$, $x_{abs}$ and $x_{p}$ are the root mean square value, absolute value and

peak value, respectively.
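The statistical features of Eqs. (2.1) to (2.7) can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in this thesis: the function name `time_domain_features` is hypothetical, the cumulants are computed with the standard moment-to-cumulant relations, and the "absolute value" is assumed to mean the average of $|x_i|$.

```python
import math

def time_domain_features(x):
    """Statistical features of a sampled waveform x (a sequence of floats)."""
    n = len(x)
    # Raw moments m_k = (1/N) * sum(x_i ** k), following Eq. (2.1).
    m1, m2, m3, m4 = (sum(v ** k for v in x) / n for k in (1, 2, 3, 4))
    # First four cumulants from the raw moments (standard relations, assumed here).
    c1 = m1
    c2 = m2 - m1 ** 2
    c3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    c4 = m4 - 3 * m2 ** 2 - 4 * m3 * m1 + 12 * m2 * m1 ** 2 - 6 * m1 ** 4
    rms = math.sqrt(m2)
    mean_abs = sum(abs(v) for v in x) / n  # assumption: "absolute value" = mean of |x_i|
    peak = max(abs(v) for v in x)
    return {
        "c1": c1, "c2": c2, "c3": c3, "c4": c4,
        "rms": rms,
        "shape_factor": rms / mean_abs,  # Eq. (2.6)
        "crest_factor": peak / rms,      # Eq. (2.7)
    }

# A square wave of unit amplitude has RMS, mean absolute value and peak all equal to 1.
features = time_domain_features([1.0, -1.0, 1.0, -1.0])
```

For a healthy machine these values are expected to stay roughly constant; a drift in the higher-order cumulants or in the crest factor is the kind of change the text describes as diagnostic information.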

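Histograms, which can be thought of as a discrete probability density function

(pdf), are calculated in the following way. Let $d$ be the number of divisions into

which the range is to be divided, and let $h_i$ with $0 \le i < d$ be the columns of the

histogram, then

$$ h_i = \sum_{j=1}^{N} \frac{1}{N}\, r_i(x_j), \quad \forall i,\ 0 \le i < d \tag{2.8} $$

$$ r_i(x) = \begin{cases} 1, & \text{if } x \text{ falls in column } i \\ 0, & \text{otherwise} \end{cases} \tag{2.9} $$

The histogram upper bound ($h_U$) and lower bound ($h_L$) are defined

as,

$$ h_U = \max(x_i) + \Delta/2 \tag{2.10} $$

$$ h_L = \min(x_i) - \Delta/2 \tag{2.11} $$

where $\Delta = (\max(x_i) - \min(x_i)) / (N - 1)$.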

Effectively, the histogram is normalized by two parameters: the length of the

sequence and the column divisions. Since the sum term above includes a $1/N$ term,

and every $x_j$ must fall into exactly one column, the net effect is that

$\sum_{i=0}^{d-1} h_i = 1$. The column divisions are relative to the bounding box, and thus

most of the $h_i$ above will not be zero. This is desirable, since it essentially removes

the issue of the size of a signal, and of low resolution on small signals with lots of

empty columns. The alternative would be to have absolute locations, which are nowhere

nearly as closely correlated with the information in the signal itself.
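The normalization argument above can be checked with a short sketch. The function name `normalized_histogram` is hypothetical, and for simplicity the columns here span the plain minimum-to-maximum range of the data rather than the padded bounds of Eqs. (2.10) and (2.11).

```python
def normalized_histogram(x, d):
    """Split the range of x into d equal columns; return column fractions that sum to 1."""
    n = len(x)
    lo, hi = min(x), max(x)
    width = (hi - lo) / d or 1.0  # fall back to 1.0 for a constant signal
    h = [0.0] * d
    for v in x:
        # Column index relative to the bounding box; the maximum goes in the last column.
        i = min(int((v - lo) / width), d - 1)
        h[i] += 1.0 / n  # the 1/N term of Eq. (2.8)
    return h

columns = normalized_histogram([0.0, 1.0, 2.0, 3.0], 2)
```

Because every sample contributes exactly 1/N to exactly one column, the column values always add up to one regardless of the signal amplitude.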

In information theory, uncertainty can be measured by entropy. The entropy of a

distribution is a measure of the randomness of the distribution. Entropy estimation

is a two-stage process: first, a histogram is estimated, and then the entropy is

calculated. The entropy estimate $E_s(x_i)$ and its standard error are

defined as

$$ E_s(x_i) = -\sum_i P(x_i)\,\ln P(x_i) \tag{2.12} $$

∑ P ln (2.13)

where $x_i$ is the discrete time signal and $P(x_i)$ is the distribution of the whole

signal. Here, the entropy of the vibration and current signals is estimated using an

unbiased estimation approach.
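As a minimal sketch of the second stage, the entropy of an already estimated histogram can be computed as below. This is the plain plug-in form of Eq. (2.12); it does not include the bias correction or the standard error mentioned in the text, and the function name `histogram_entropy` is hypothetical.

```python
import math

def histogram_entropy(p):
    """Shannon entropy (in nats) of a discrete distribution p; empty columns contribute 0."""
    return -sum(q * math.log(q) for q in p if q > 0)

uniform_entropy = histogram_entropy([0.25, 0.25, 0.25, 0.25])
```

A uniform histogram gives the largest entropy for a given number of columns, while a histogram concentrated in one column gives zero.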

More sophisticated time-domain approaches apply time series models to

signals. The main idea of time series modeling is to fit the waveform data to a

parametric time series model and extract features based on this parametric model

[16]. The autoregressive (AR) model and the autoregressive moving average

(ARMA) model are among the most favoured time series modeling techniques.

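An ARMA model of order p, q can be expressed by

$$ x_t = a_1 x_{t-1} + \cdots + a_p x_{t-p} + \varepsilon_t - b_1 \varepsilon_{t-1} - \cdots - b_q \varepsilon_{t-q} \tag{2.14} $$

where $x$ is the waveform signal, the $\varepsilon$’s are independent variables, normally

distributed with mean 0 and constant variance $\sigma^2$, and $a_i$ and $b_i$ are model

coefficients.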
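To illustrate how model coefficients become features, the sketch below fits the purely autoregressive part of the model (an AR(p) model, i.e. q = 0) using the Yule-Walker equations solved by the Levinson-Durbin recursion. This is one common estimation route, chosen here as an assumption; the function names are hypothetical and the moving-average terms are not estimated.

```python
def autocorr(x, lag):
    """Biased sample autocovariance of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n

def levinson_durbin(r):
    """Solve the Yule-Walker equations for autocovariances r[0..p].

    Returns (a, err): AR coefficients a_1..a_p and the residual variance.
    """
    a, err = [], r[0]
    for k in range(1, len(r)):
        acc = r[k] - sum(a[j] * r[k - 1 - j] for j in range(k - 1))
        reflection = acc / err
        # Update the lower-order coefficients, then append the new one.
        a = [a[j] - reflection * a[k - 2 - j] for j in range(k - 1)] + [reflection]
        err *= 1.0 - reflection ** 2
    return a, err

def ar_coefficients(x, p):
    """AR(p) coefficients of waveform x, usable as a feature vector."""
    return levinson_durbin([autocorr(x, k) for k in range(p + 1)])

coeffs, residual_variance = ar_coefficients([1.0, -1.0, 1.0, -1.0], 1)
```

The returned coefficient list can be used directly as a feature vector, in the spirit of the studies cited in the following paragraphs.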

Poyhonen et al. [17] applied the AR model to vibration signals collected from

an induction motor and used the AR model coefficients as extracted features.

Zhan and Jardine [18] used adaptive AR models to process non-stationary

vibration signals and found that they are able to provide reliable time-frequency

domain information for condition monitoring.

Baillie and Mathew [19] compared three different autoregressive modelling

techniques and reported that back-propagation neural networks (BNNs) outperformed

radial basis function networks and conventional linear autoregressive models. The

performance and reliability of the models were compared for various signal

lengths from a rolling element bearing. The BNN technique also required a much

shorter data length. Salami and Sidek [20] examined the effect of sampling

conditions, noise level, number of components and relative sizes of the signal

parameters on the performance of an ARMA model. Simulation results show that

high-resolution estimates of decay constants can be obtained when the signal

processing technique is used to analyse signals with varied signal-to-noise ratios

(SNRs). Unfortunately, the time-domain approach alone is often incapable of

identifying the faulty component and is therefore insufficient to diagnose the bulk

of machine problems.



2.3.2 Frequency-domain techniques

Frequency-domain techniques are based on the fact that a localized defect

generates a periodic signal with a unique characteristic frequency [21].

Frequency-domain analysis methods are able to overcome the shortcomings of

time-domain analysis mentioned in the previous section because they can easily

identify and isolate individual frequency components. Frequency-domain analysis is

probably the most widely used approach for bearing fault detection.

When frequency-domain parameters are used as fault indicators, a primary

diagnosis is available by rapidly decomposing a complex signal into simpler

parts. Changes in the frequency-domain parameters indicate the occurrence of faults

because different faults have different spectra in the frequency domain.

Frequency-domain parameters can also be used for early detection of machine faults

and failures. Therefore, these indices can be used to perform condition monitoring,

fault diagnostics and prognostics.

A conventional frequency-domain technique is spectrum analysis using Fast

Fourier Transform (FFT). To enhance the results of spectrum analysis, frequency

filters, demodulation, side band structure analysis and graphical presentation are

often used. Different types of frequency spectra such as power spectrum,

cepstrum and high-order spectrum have been developed.

The power spectrum shows power distribution with frequency. For a given

signal, the power spectrum gives a plot of the portion of a signal's power (energy

per unit time) falling within given frequency bins. The most common way of

generating a power spectrum is by using a discrete Fourier transform, but other

techniques such as the maximum entropy method can also be used. The following

parameters in frequency domain are commonly used as fault indicators for

diagnostics and prognostics.

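Frequency Center: $$ FC = \frac{\int_0^{\infty} f\, s(f)\, df}{\int_0^{\infty} s(f)\, df} \tag{2.15} $$

Mean Square Frequency: $$ MSF = \frac{\int_0^{\infty} f^2\, s(f)\, df}{\int_0^{\infty} s(f)\, df} \tag{2.16} $$

Root Mean Square Frequency: $$ RMSF = \sqrt{MSF} \tag{2.17} $$

Variance Frequency: $$ VF = \frac{\int_0^{\infty} (f - FC)^2\, s(f)\, df}{\int_0^{\infty} s(f)\, df} \tag{2.18} $$

Root Variance Frequency: $$ RVF = \sqrt{VF} \tag{2.19} $$

where $s(f)$ is the signal power spectrum. The FC, MSF and RMSF show

the change of position of main frequencies. The VF and RVF describe the

convergence of the spectrum power.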
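A discrete version of these indicators can be sketched as follows, with the integrals replaced by sums over the frequency bins of a one-sided power spectrum. The function names are hypothetical, and the direct DFT is used only to keep the example self-contained; in practice an FFT routine would be used.

```python
import cmath
import math

def power_spectrum(x, fs):
    """One-sided power spectrum of x via a direct DFT (O(N^2), fine for short records)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

def spectral_indicators(freqs, power):
    """Discrete counterparts of FC, MSF, RMSF, VF and RVF."""
    total = sum(power)
    fc = sum(f * s for f, s in zip(freqs, power)) / total
    msf = sum(f * f * s for f, s in zip(freqs, power)) / total
    vf = sum((f - fc) ** 2 * s for f, s in zip(freqs, power)) / total
    return {"FC": fc, "MSF": msf, "RMSF": math.sqrt(msf), "VF": vf, "RVF": math.sqrt(vf)}

# A pure 5 Hz sine sampled at 32 Hz: all spectral power sits in the 5 Hz bin.
signal = [math.sin(2 * math.pi * 5 * t / 32) for t in range(32)]
indicators = spectral_indicators(*power_spectrum(signal, 32.0))
```

For the single-tone example the frequency centre coincides with the tone frequency and the variance frequency is close to zero; a fault that adds sidebands or harmonics would shift FC and increase VF.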

High-order spectrum (bispectrum or trispectrum) is able to extract more

diagnostic information than power spectrum for non-Gaussian signals [16].

Kocur and Stanko [22] proposed the order bispectrum and claimed that it enables

the elimination of smearing and modulation which often arises in the

conventional power spectrum and bispectrum. Order bispectrum techniques are

based on the signal processing of the order domain signal, where the signal

sampling is in accordance with the roll angle of a reciprocating machine shaft.

The enveloping technique is used for the purpose of enhancing small signals.

This method first separates higher frequency signals from low frequency machine

vibrations by band-pass filtering. One of the measurement problems in detecting a

fault signal is the ability to detect small-amplitude signals. A defect signal in the

time domain is very narrow, resulting in an energy component spread over a wide

frequency range; consequently the harmonic amplitudes of the defect frequency

are buried in noise.

Averaging techniques in frequency-domain analysis can be divided into two

types: synchronous averaging and spectrum averaging. Synchronous averaging is

very useful in reducing the random noise component in the measurement, or in

reducing the effect of other interfering signals such as noise components from

nearby machines. A tachometer is required to synchronize each snapshot of the

signal to the running speed of the machine. Unlike synchronous averaging, spectrum

averaging does not reduce the noise. Instead, it finds the average magnitude at

each frequency where a series of individual spectra are added together and the

sum is divided by the number of spectra.
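The two averaging schemes can be contrasted with a short sketch. It assumes the tachometer information has already been used to resample the signal so that every shaft revolution contains the same number of samples; the function names and the parameter `samples_per_rev` are hypothetical.

```python
def synchronous_average(x, samples_per_rev):
    """Average consecutive shaft revolutions of x, sample by sample."""
    revs = len(x) // samples_per_rev
    if revs == 0:
        raise ValueError("signal is shorter than one revolution")
    return [
        sum(x[r * samples_per_rev + i] for r in range(revs)) / revs
        for i in range(samples_per_rev)
    ]

def average_spectra(spectra):
    """Spectrum averaging: mean magnitude at each frequency line over several spectra."""
    count = len(spectra)
    return [sum(line) / count for line in zip(*spectra)]

averaged_revolution = synchronous_average([1.0, 2.0, 3.0, 3.0, 2.0, 1.0], 3)
mean_spectrum = average_spectra([[1.0, 2.0], [3.0, 4.0]])
```

Components that are not locked to the shaft rotation tend to cancel in the synchronous average, whereas spectrum averaging only smooths the estimate of the magnitude at each frequency.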

The cepstrum has the capability to detect harmonics and sideband patterns in the

power spectrum. The application of the power cepstrum to machine fault

detection is based on its ability to detect periodicity in the spectrum, such as

families of uniformly spaced harmonics and sidebands, while being insensitive

to the transmission path of the signal from an internal source to an external

measurement point. The value of the main cepstrum peak is shown to be an

excellent trend parameter. It represents the average over a large number of

individual harmonics, so fluctuations in those harmonics (for example, as a result

of load variations) are largely averaged out and the cepstrum value gives a smooth

trend curve with time. Kim

and Lyon [23] presented examples of detection of excitation pulses using the

cepstrum.

The high frequency resonance technique (HFRT) takes advantage of the fact

that most of the signal’s energy generated by a defect is concentrated in the high

frequency resonance range, and it can provide envelope signals with high signal-

to-noise ratio (SNR) which are associated with the periodicity of a defective

bearing signal. An adaptive noise-cancellation method has also been developed to

enhance the envelope spectrum obtained by HFRT [24].

However, the frequency-domain approach, like all other techniques, is not

without its shortcomings. It does not perform well when it comes to non-stationary

waveform signals, which are very common with defective machines.

2.3.3 Time-frequency techniques

Time-frequency techniques investigate waveform signals in both the time and

frequency domains. They therefore address the problem encountered in

frequency-domain analysis when the signals are non-stationary.

The conventional time-frequency technique uses time-frequency

distributions, which represent the energy of signals as two-dimensional functions

of time and frequency to better reveal fault patterns [16]. The most widely

used time-frequency distributions are the Short-Time Fourier Transform (STFT) and

the Wigner-Ville distribution.

Another time-frequency technique is the wavelet transform. It was developed

to overcome the shortcomings of the STFT and can likewise be used to analyze

non-stationary signals. While the STFT gives a constant resolution at all frequencies,

the wavelet transform uses a multi-resolution technique by which different

frequencies are analyzed with different resolutions. The wavelet transform

decomposes the signal of interest into a linear combination of time scale units. It

analyzes original signals and organizes them into several signal components

according to the translation of the mother wavelet or wavelet basis function

which changes the scale and shows the transition of each frequency component.

Recently, wavelet transform techniques have been successfully employed in

machine fault diagnostics for gears [25], bearings [26] and internal combustion

engines [27]. The wavelet transform can produce high frequency resolution at low

frequencies and high time resolution at high frequencies, and can also reduce noise

in raw signals.

The continuous wavelet transform (CWT) is an integration with respect to the

total time of the product of the target signal $f(t)$ and the mother wavelet $\psi_{a,b}$.

Using mathematical expression, the continuous wavelet transform of the time

function $f(t)$ can be written as

$$ CWT(a,b) = \int_{-\infty}^{\infty} f(t)\,\psi_{a,b}(t)\,dt \tag{2.20} $$

$$ \psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right) \tag{2.21} $$

where $a$, $b$ and $\psi_{a,b}$ are the scale parameter, translation parameter and mother

wavelet, respectively.
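A single CWT coefficient can be approximated numerically by replacing the integral with a sum over the samples. The sketch below is only illustrative: it assumes a real-valued Mexican hat mother wavelet and the common $1/\sqrt{a}$ normalization, neither of which is prescribed by the text, and the function names are hypothetical.

```python
import math

def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet, unnormalized."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)

def cwt_coefficient(f, dt, a, b):
    """Riemann-sum approximation of CWT(a, b) for samples f taken every dt seconds."""
    total = sum(f[i] * mexican_hat((i * dt - b) / a) for i in range(len(f)))
    return total * dt / math.sqrt(a)

# A narrow pulse at t = 5 s, approximated by one sample of height 1/dt.
pulse = [0.0] * 100
pulse[50] = 10.0
coefficient = cwt_coefficient(pulse, 0.1, 4.0, 5.0)
```

Evaluating `cwt_coefficient` over a grid of scales `a` and translations `b` yields the time-scale map used to locate transient features such as the impacts produced by a localized defect.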

CWT provides powerful multi-resolution in time–frequency analysis for

characterizing the transitory features of non-stationary signals. CWTs can



decompose an inspected signal into a family of elementary functions. This ability

renders the analysis of the inspected signal easier for machine operators. A

comparison of the effectiveness of the popular envelope detection method and

CWTs for the fault diagnosis of roller bearings can be found in [28]. That study

shows that CWTs outperform envelope detection in identifying the causes of

faults at an early stage.

Discrete wavelet transform (DWT), which is based on sub-band coding is

found to yield a fast computation of wavelet transform. It is easy to implement

and reduces the computation time and resources required. The orthogonal basis

functions used in wavelet analysis are families of scaling function, φ(t) and

associated wavelet ψ(t). The scaling function can be represented by following

mathematical expression

$$ \phi_{j,k}(t) = \sum_{k} H_k\,\phi(2^{j} t - k) \tag{2.22} $$

where $H_k$ represents the coefficient of the scaling function, and $k$, $j$ represent

translation and scale, respectively. Similarly, the associated wavelet can be generated

using the same coefficients as the scaling function

$$ \psi_{j,k}(t) = \sum_{k} (-1)^{k} \sqrt{2}\, h_{1-k}\,\phi(2^{j} t - k) \tag{2.23} $$

The scaling functions are orthogonal to each other as well as to the wavelet

function, as shown in Eqs. (2.24) and (2.25). This fact is crucial and forms part of

the framework for multi-resolution analysis.

$$ \int_{-\infty}^{\infty} \phi(2t - k)\,\phi(2t - k - 1)\,dt = 0 \tag{2.24} $$

$$ \int_{-\infty}^{\infty} \psi(t)\,\phi(t)\,dt = 0 \tag{2.25} $$

Using an iterative method, the scaling function and associated wavelet can be

computed if the coefficients are known.



A signal can be decomposed into approximate coefficients aj,k through the

inner product of the original signal at scale j and the scaling function.

$$ a_{j,k} = \int_{-\infty}^{\infty} f_j(t)\,\phi_{j,k}(t)\,dt \tag{2.26} $$

$$ \phi_{j,k}(t) = 2^{-j/2}\,\phi(2^{-j} t - k) \tag{2.27} $$

Similarly detail coefficients dj,k can be obtained through the inner product of

the signal and the complex conjugate of the wavelet function.

$$ d_{j,k} = \int_{-\infty}^{\infty} f_j(t)\,\psi_{j,k}(t)\,dt \tag{2.28} $$

$$ \psi_{j,k}(t) = 2^{-j/2}\,\psi(2^{-j} t - k) \tag{2.29} $$

The original signal can therefore be decomposed at different scales as follows

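$$ f(t) = \sum_{k=-\infty}^{\infty} a_{j_0,k}\,\phi_{j_0,k}(t) + \sum_{j=-\infty}^{j_0} \sum_{k=-\infty}^{\infty} d_{j,k}\,\psi_{j,k}(t) \tag{2.30} $$

$$ f[n] = \sum_{k=0}^{N-1} a_{j,k}\,\phi_{j,k}(t) = \sum_{k=0}^{N-1} a_{(j+1),k}\,\phi_{(j+1),k}(t) + \sum_{k=0}^{N-1} d_{(j+1),k}\,\psi_{(j+1),k}(t) \tag{2.31} $$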

The coefficient of the next decomposition level (j+1) can be expressed as:

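$$ a_{(j+1),k} = \sum_{k=0}^{N} a_{j,k} \int \phi_{j,k}(t)\,\phi_{(j+1),k}(t)\,dt \tag{2.32} $$

$$ d_{(j+1),k} = \sum_{k=0}^{N} a_{j,k} \int \phi_{j,k}(t)\,\psi_{(j+1),k}(t)\,dt \tag{2.33} $$

$$ a_{(j+1),k} = \sum_{k} a_{j,k}\,g[k] \quad \text{and} \quad d_{(j+1),k} = \sum_{k} a_{j,k}\,h[k] \tag{2.34} $$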

The decomposition coefficients can be determined through convolution and

implemented by using a filter. The filter $g$ is a low-pass filter and $h$ is a

high-pass filter.

$$ y[n] = \sum_{k=1}^{N} h[k]\,x[n-k] \tag{2.35} $$
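One decomposition level of this filter-bank scheme can be sketched as follows. The example assumes the Haar filter pair, the simplest orthogonal choice, and indexes the filter taps from zero; the thesis does not prescribe a particular wavelet at this point, and the function names are hypothetical.

```python
import math

def convolve(h, x):
    """Full linear convolution y[n] = sum_k h[k] * x[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                y[n] += h[k] * x[n - k]
    return y

def haar_dwt_level(a):
    """Filter a with low-pass g and high-pass h, then keep every second output sample."""
    s = 1.0 / math.sqrt(2.0)
    g, h = [s, s], [s, -s]  # Haar low-pass and high-pass filters
    approx = convolve(g, a)[1::2]
    detail = convolve(h, a)[1::2]
    return approx, detail

approx, detail = haar_dwt_level([4.0, 6.0, 10.0, 12.0])
```

Applying `haar_dwt_level` again to `approx` gives the next coarser level, which is how the multi-level decomposition is built up. For an orthogonal filter pair such as Haar, the energy of the input equals the combined energy of the approximation and detail coefficients.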

A comparison of the effectiveness and reliability of the wavelet transform

with other vibration signal analysis techniques can be found in [29].

2.4 Existing Methods for Fault Diagnostics

In all maintenance programs, condition monitoring and fault diagnostics play an

important role. With the advancement of signal processing techniques, condition

monitoring is becoming popular in industry because of its efficient role in detecting

potential failures. A principal objective of fault diagnostics is to detect whether a

specific fault is present or not based on the available condition monitoring data

without intrusive inspection of the machine.

There are several ways of classifying approaches to the problem of diagnosing an

engineering system. Diagnostic techniques can be classified into two approaches,

depending on whether the diagnosis assessment is based on deterministic information

or on stochastic information (e.g., historical, statistical parameters). The first of these

two has been termed a “white box” approach, while the second is known as “black

box” approach. Park et al. [30] also suggest a combination of these two

techniques, known as a “gray box” approach. In this thesis, the existing solutions to

the problems of performing diagnostics and prognostics are classified into two

approaches: data-driven approaches and model-based approaches, although other

classifications exist.

2.4.1 Data-driven Approaches

Data-driven approaches include signal processing algorithms and knowledge-

based methodologies. Data-driven techniques rely on comparative assessments of

the status of a system under testing with other known occurrences. For as long as

the behavior of the system under testing remains similar to that of a previously



known, healthy configuration, the former is deemed to be healthy. When the

measured behavior deviates from this reference, a fault is detected, and a

comparison with the conditions previously observed in analogous faulted systems

can take place. Under the appropriate conditions, this new comparison has the

potential to isolate and identify the fault efficiently. Thus, the ability of data-

driven approaches to perform the task of diagnostics is given by the training of a

classification algorithm.

The training algorithms used by data-driven decision processes are highly

automated tasks for which extensive literature exists. Intelligent algorithms in

support of this duty are generally straightforward to implement. A more

appealing characteristic is the fact that data-driven effort typically avoids the

need to understand the underlying physical mechanisms that describe the

behavior of a system; and diagnostics are performed regardless of the causes of a

fault. Furthermore, data-driven algorithms have the ability to “learn” as they

operate, ideally making their assessments more reliable with each fault detection

attempt.

Jardine et al. [10] proposed that the existing methods of data-driven

diagnostics can be further grouped into two basic approaches, namely statistical

approaches and artificial intelligence (AI) approaches, depending on the employed

algorithms. In the following subsections, these two data-driven approaches are

reviewed.

2.4.1.1 Statistical Approaches

In early fault diagnostic methods, statistical tests were constructed to

summarize the condition monitoring information so as to be able to decide

whether to accept or reject some hypothesis of machine condition [31].

Recently, a framework for fault diagnostics known as the structured

hypothesis test was proposed for efficient handling of complicated

multiple faults of different types [32].

As one of the multivariate statistical analysis methods, cluster analysis

is a statistical classification approach that groups signals into different

fault categories on the basis of similarity of certain characteristics or



features they possess. It seeks to minimize within-group variance and

maximize between-group variance. The result of cluster analysis is a

number of heterogeneous groups with homogeneous contents. Even

though there are substantial differences between the groups, the signals

within a single group are similar.

Iverson’s Inductive Monitoring System (IMS) [33] uses a clustering

algorithm to cluster the nominal training data into clusters representing

different modes of the system for fault detection. When the new data fails

to fit into any of the clusters, it signals an anomaly, using the distance from

the nearest cluster as a measure of the strength of the anomaly. It was

trained using data from five previous Space Shuttle flights, and then tested

using STS-107 space shuttle data. It detected an anomaly in data from

temperature sensors on the shuttle’s left wing shortly after the foam impact,

suggesting in retrospect that with the aid of IMS, flight controllers might

have been able to detect the damage to the wing much sooner than they did.

Some other applications using cluster analysis in machinery fault

diagnostics can be found in [34, 35].
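The idea of scoring an anomaly by its distance from the nearest cluster of nominal data can be sketched as below. This is a deliberate simplification for illustration: each cluster is reduced to a single centre point and a fixed radius, which is an assumption and not a description of how IMS represents its clusters. The function names are hypothetical.

```python
import math

def anomaly_score(sample, cluster_centres):
    """Distance from a feature vector to the nearest cluster centre of nominal data."""
    return min(math.dist(sample, centre) for centre in cluster_centres)

def is_anomalous(sample, cluster_centres, radius):
    """Flag the sample when it lies farther than radius from every nominal cluster."""
    return anomaly_score(sample, cluster_centres) > radius

centres = [(0.0, 0.0), (10.0, 0.0)]
score = anomaly_score((3.0, 4.0), centres)
```

The score grows with the distance from all known operating modes, so it can serve both as a detection flag and as a rough measure of anomaly strength.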

A common method of signal grouping is based on distance measures or

similarity measures between two signals (features). These measures are

usually derived from certain discriminant functions in statistical pattern

recognition [36]. There are several distance measures such as Euclidean

distance, Mahalanobis distance, Kullback–Leibler distance and Bayesian

distance. Some examples of using these distance metrics for fault

diagnostics are presented in [37-40].

As a similarity measure, feature vector correlation coefficients are also

commonly used for machinery fault diagnosis [38]. A commonly used

algorithm in machine fault diagnostics is the nearest neighbour algorithm

that fuses two closest groups into a new group and calculates distance

between two groups as the distance of the nearest neighbour in the two

separate groups [41]. The boundary of two adjacent groups is determined

by the discriminant function used. A piecewise linear discriminant function

was used and thus piecewise linear boundaries were obtained for bearing



condition classification [42].

Other classification techniques used in machine diagnostics include Support

Vector Machines (SVMs), which optimize a boundary curve in the sense

that the distance of the closest point to the boundary curve is maximized.

Recently, SVMs have emerged as a popular machine learning method due to

their excellent generalization ability compared with traditional methods

such as neural networks. SVMs have been successfully applied in a number

of diagnostic applications, ranging from bearing faults [43], induction

motor [44], machine tools [45, 46] to rotating machines [47]. In feature

based fault diagnostics, SVMs are commonly combined with other feature

selection techniques and kernel functions such as linear, polynomial and

radial basis function (RBF) kernel. He and Shi [48] found that SVMs

produced better accuracy than artificial neural networks when applied to

the diagnostics of faults in valves of reciprocating pumps using vibration

data. They used a wavelet packet transform (WPT) to preprocess the

vibration data, extracting the time and frequency information and then

used the SVMs to classify the faults. Nambura et al. [49] presented fault

severity estimation using SVMs for mode-invariant fault diagnostics of

automotive engines.

Statistical process control (SPC) which was originally developed based

on quality control theory is also employed in machine fault detection and

diagnostics. The principle of SPC is to measure the deviation of the current

signal from a reference signal which represents the normal condition, to

see whether the current signal is within the control limit or not. In their

application of SPC to damage detection, Fugate et al. [50] showed that a

statistically significant number of error terms outside the control limits

indicates a transition of the system from a healthy state to a damaged

state.
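The control-limit check can be sketched as below. The limits are set at the reference mean plus or minus three standard deviations, a conventional choice assumed here for illustration; the function names and the parameter `k` are hypothetical.

```python
import statistics

def control_limits(reference, k=3.0):
    """Lower and upper control limits from data recorded in the normal condition."""
    mu = statistics.mean(reference)
    sigma = statistics.pstdev(reference)
    return mu - k * sigma, mu + k * sigma

def out_of_control(samples, limits):
    """Indices of the samples that fall outside the control limits."""
    lower, upper = limits
    return [i for i, v in enumerate(samples) if v < lower or v > upper]

limits = control_limits([9, 10, 11, 10, 9, 11])
violations = out_of_control([10, 13, 9.5, 7], limits)
```

Counting the entries in `violations` over a monitoring window gives the number of out-of-limit points that would be tested for statistical significance.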

Much work has been done on diagnosing problems in helicopter

gearboxes based on vibration data [51, 52]. This work focused on the

preprocessing algorithms that extract statistical features from the data that

can be used for diagnosis. The feature extraction algorithms are used to

extract features from new data, which can then be compared with features

extracted from known nominal data and features extracted from data with

various known failures in order to perform fault diagnostics.

2.4.1.2 Artificial Intelligence (AI) Approaches

Artificial Intelligence (AI) approaches based on pattern recognition

techniques have been successfully used in machine diagnostics. However, it

is not easy to apply appropriate AI techniques due to the lack of efficient

procedures to obtain training data and specific knowledge of the faults,

which are required in training of the models [10]. Popular AI techniques

for machine diagnostics are artificial neural networks (ANNs), expert

systems (ESs), fuzzy logic systems (FLSs), fuzzy–neural networks (FNNs)

and evolutionary algorithms (EAs).

The most popular AI approaches to diagnostics are the ANNs which are

used to model engineering systems. An ANN is a computational model that

mimics the human brain structure. It consists of simple processing

elements connected in a complex layer structure, which enables the model

to approximate a complex non-linear function using multi-input and multi-

output features. A processing element comprises a node and a weight.

The ANN learns the unknown function by adjusting its weights with

observations of input and output. This process is usually called training of

an ANN. There are various neural network models. Feed-forward neural

network (FFNN) structure is the most widely used neural network

structure in machine fault diagnostics [53-56]. The multilayer perceptron, an

FFNN trained with the back-propagation (BP) algorithm, is the most

commonly used neural network model for pattern recognition,

classification and machine fault diagnostics [57, 58].

Spoerre [59] applied cascade correlation neural network (CCNN) to

bearing fault classification and showed that CCNN can result in utilizing

the minimum network structure for fault recognition with satisfactory

accuracy. CCNN does not require initial determination of the network

structure and the number of nodes. CCNN can be used in cases where on-

line training is preferable.



Other neural network models applied in machine diagnostics are radial

basis function neural networks (RBF-NNs), recurrent neural networks

(RNNs) [60, 61] and counter propagation neural networks (CPNNs) [62].

The above ANN models usually use supervised learning algorithms,

which require external input such as a priori knowledge of the target

or desired output. For example, a common practice in training of a neural

network model is to use a set of experimental data with known (seeded)

faults. This training process is called supervised learning.

Compared to supervised learning, unsupervised learning does not

require an external input. An unsupervised neural network learns by itself

using the new information available. Wang and Too [63] applied

unsupervised neural networks, namely the self-organizing map (SOM) and learning

vector quantization, to rotating machine fault detection. Tallam et al. [64]

proposed a self-commissioning and on-line training algorithm for FFNN

with particular application to electric machine fault diagnostics. Sohn et al.

[65] used an auto associative neural network on the extracted features to

separate the effect of damage features from those caused by the

environmental and vibration variations of the system. A sequential

probability ratio test was then performed on the normalized features for

damage classification. Schwabacher [66] used two unsupervised anomaly

detection algorithms for rocket engines. These algorithms support both

discrete and continuous variables and were used to detect anomalies in the

relationships among the variables in addition to anomalies in the individual

variables. The algorithms detected some anomalies that were already

known to the experts and some others that were not known to the experts.

Oza et al. used neural nets and ensembles of neural nets for fault

detection [67]. Their method of detecting a fault is to assume that a fault

has occurred when an actual maneuver fails to match a predicted maneuver.

The data they used include vibration data from gearbox, angular velocity

and torque of planetary gear, altitude, velocity and orientation of the

helicopter from a set of experimental flights in which the pilot always

performed a predetermined maneuver. They obtained very high accuracy at



predicting the maneuver, especially when using ensemble methods.

In contrast to neural networks, which learn knowledge by training on

observed set of data with known input and output values, expert systems

(ESs) utilize domain expert knowledge with an automated inference engine

to perform reasoning for problem solving. Three main reasoning methods

in the area of machinery diagnostics are rule-based reasoning, case-based

reasoning and model-based reasoning. An alternative reasoning method,

known as negative reasoning, was introduced to machine diagnostics by

Hall et al. [68]. Stanek et al. [69] compared case-based and model-based

reasoning methods and proposed the application of their lower-cost

solution to machine condition assessment and diagnosis. Unlike other

reasoning methods, negative reasoning deals with negative information,

which by its absence or lack of symptoms is indicative of meaningful

inferences.

ESs and NNs have their own limitations. One main limitation of rule-

based ESs is the combinatorial explosion, which refers to the computation

problem when the number of rules increases exponentially as the number of

variables increases. The other limitation is consistency maintenance, which

refers to the process by which the system decides what variables need to be

recomputed in response to changes.

BEAM (Beacon-based Exception Analysis for Multi-Missions) system

was applied to anomaly detection in space shuttle engine by Park et al.

[70]. BEAM has nine components and nine different approaches to

anomaly detection. In their work, Dynamical Invariant Anomaly Detector

(DIAD) was used as an unsupervised anomaly detection algorithm, which

looked for anomalies in one variable at a time. They trained the DIAD

using data from 16 nominal tests and tested it using data from seven tests

that contained known failures. It detected all of the major failures in these

seven tests; however it missed some minor failures and had some false

alarms due to the high anomaly threshold from large variability in the

sensor data during training.


Some machine degradation processes can be ideally described by a

mathematical model known as the Hidden Markov Model (HMM).

Srivastava [71] presents algorithms based on envelope detection and

dynamic Hidden Markov Models for detecting anomalies in time series

data with large numbers of discrete and continuous variables. He tested the

algorithms by using synthetic data from a fleet of aircraft. Bakhtazad et al.

[72] used the HMM combined with wavelets to detect abnormal behavior

in plant operation. In their work, wavelets were used to generate features,

and HMM were used for classification. Smyth [73] also modeled the

normal and failure modes as states in the HMM. He defined the transition

probabilities between these modes for fault diagnostics.
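How an HMM with normal and failure states assigns a likelihood to a sequence of observations can be sketched with the forward algorithm. The two-state model and all probabilities below are made-up illustrative values, not taken from the cited studies, and the function name is hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """Likelihood of an observation sequence under a discrete HMM (forward algorithm).

    start[s]    : probability of beginning in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][o]  : probability of observing symbol o while in state s
    """
    states = range(len(start))
    alpha = [start[s] * emit[s][obs[0]] for s in states]
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in states) * emit[s][o]
            for s in states
        ]
    return sum(alpha)

# State 0 = normal, state 1 = failed; symbol 0 = low vibration, symbol 1 = high vibration.
start = [1.0, 0.0]
trans = [[0.9, 0.1], [0.0, 1.0]]
emit = [[0.8, 0.2], [0.3, 0.7]]
likelihood = forward_likelihood([0, 1], start, trans, emit)
```

Comparing such likelihoods across competing models, or inspecting the per-state forward values, is one way to decide which mode best explains the monitored data.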

2.4.2 Model-Based Approaches

Model-based approaches commonly involve the description of a system

through mathematical models of the physical laws governing its behavior.

Compared to the data-driven approach, the model-based approach is generally

more robust in the sense that it can handle new or unforeseen situations more

easily, since the technique can incorporate and replicate them through its

mathematical models. If the state of a system deviates from expected operational

ranges, model-based techniques can continue to work by updating the physical

parameters that describe the new situation. Because of this adaptive ability, the

model-based approach can omit the use of the extensive training and historical

information required by the data-driven approach. The approach is also less

prone to the difficulties introduced by under- or over-training.

Much work has been done in the field of aircraft systems based on model-

based diagnostics. Williams et al. [74] addressed model-based diagnostics for

spacecraft engines. They used a hierarchical model of components and modules.

In their model, each component is modeled using a finite state machine. Aaseng

et al. [75] used TEAMS, which is a commercial product from Qualtech Systems

Inc., to build a prototype ground-based diagnostic system for a portion of the

power distribution system on the International Space Station (ISS). The prototype

they built was model-based and included fault detection and diagnostics.



A number of model-based approaches have also been applied to fault

diagnosis of a range of rotating machine components such as gears [76, 77],

bearings [78, 79] and rotor shafts [80, 81]. However, most of the applications in

the literature used experimental data for model training and validation.

Howard et al. [76] used the effect of variations in gear tooth torsional mesh

stiffness and finite element analysis for modeling of gear faults. Baillie and

Mathew [78] employed multi-layer artificial neural networks to construct the

nonlinear autoregressive models for each class of the time domain vibration

signals in bearing fault diagnostics. Their work shows that a model-based system

can provide an alternative machine fault diagnostic technique where real-time

processing of limited amounts of data is required. Sekhar [81] modeled rotor

cracks using the finite element method, with the cracks represented through local

flexibility changes. The cracks were identified in terms of their depths and locations

on the shaft.

2.4.3 Comparison of data-driven and model-based approaches

Data-driven techniques can be ineffective when dealing with measurements

that deviate from the references available in the training “library,” whether there

is damage or not. If the behavior of a system is dissimilar to all the past

observations from healthy and faulty cases that were available at the time of

training, there may be no indication of what the data-driven algorithm is going to decide.

If this is a recurring situation, or if some change has made the deviations

permanent, the algorithm can continue to misinterpret the system status until re-

training is performed. Changes like this might be due to “under-training”, since

the new situations must be added to the training library. On the other hand, there

also exists a danger of “over-training”, which occurs when all of the training data

is similar and the algorithms adjust too finely to specific details of the data that

are more of a coincidence than have a causal relationship with the fault. Thus,

over-training, also termed over-fitting, has the opposite effect of what training

intends: instead of making data-driven classification algorithms more effective, it

makes them less “prepared” to deal with changes in the data sets. Designers of

data-driven algorithms must always take care to balance the algorithm

implementations so that neither under- nor over-training takes place.
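The effect of over-training can be illustrated with a small, self-contained Python sketch. It compares a simple straight-line fit with a polynomial that passes exactly through every training point, and then evaluates both at an unseen operating point. The data values, the underlying trend y = 2x and the function names are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares straight line; returns a prediction function."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Lagrange polynomial passing exactly through every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


# Hypothetical training data: true trend y = 2x plus small fixed "noise".
train_x = [0, 1, 2, 3, 4, 5]
train_y = [0.3, 1.6, 4.5, 5.7, 8.4, 9.8]
new_x, true_y = 6, 12.0   # an unseen operating point

line_error = abs(fit_line(train_x, train_y)(new_x) - true_y)
interp_error = abs(fit_interpolant(train_x, train_y)(new_x) - true_y)
print(f"simple model error: {line_error:.2f}")
print(f"over-fitted model error: {interp_error:.2f}")
```

The interpolating polynomial has zero error on the training set, yet its error at the new point is far larger than that of the straight line, because it has fitted the noise rather than the trend.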



Compared to data-driven approaches, model-based approaches require much

effort and expertise to increase the model’s reliability and applicability to the real

situation by making simplifying assumptions about the machine fault or failure mechanism. This

effort is typically beyond that required by data-driven techniques. All the

observed occurrences of a fault in past instances become useless to the modeling

effort if the physics behind such behavior is not well understood.

A comparison graph of the applicability of the data-driven and the model-

based approaches is presented by Inman et al. [82] as shown in Figure 2.3.

Figure 2.3 Comparison of the data-driven approach and the model-based

approach [82].

Although data-based techniques may be able to indicate a change in the presence of

new loading conditions or system configuration, they will perform poorly when

trying to classify the nature of the change. Thus, it is common to use the results from

a physics-based model to ‘train’ a data-based technique to recognize fault cases for

which no experimental data exist. Typically the balance between physics-based

models and data-based techniques will depend on the amount of relevant data

available and the level of confidence in the physics-based models, as illustrated in

Figure 2.3.



2.5 Current Prognostics Approaches

Prognostics can be defined as the ability to predict accurately and precisely the

remaining useful lifetime (RUL) of a failing machine component or subsystem, and

is also a branch of maintenance decision-making. The International Organization

for Standardization (ISO) standard 13381-1 [83] defines prognostics as “an estimation of time to

failure and risk for one or more existing and future failure modes”.

Prognosis determines whether a fault is impending and estimates how soon and

how likely a fault will occur. Today’s advances in condition-based maintenance have

contributed to some progress in machine prognostics. This progress not only

reduces maintenance costs but also increases operational efficiency and reduces human

casualties. Current prognostic methods aim to predict the RUL of a faulting machine

and to predict the probability of a failure at some future time. The prognostic

methods can also be classified as being associated with one or more of the following

three approaches: data-driven approaches, model-based approaches and reliability-

based approaches. Each of these approaches has its own advantages and

disadvantages, and consequently they are often used in combination in many

applications to overcome the individual limitations.

2.5.1 Data-driven Approaches for Prognostics

The data-driven approaches are derived directly from routinely monitored

system operating data (e.g., calibration, calorimetric data, spectrometric data,

power, vibration and acoustic signal, temperature, pressure, oil debris, currents

and voltages). In many applications, measured input/output data are major

sources for gaining a deeper understanding of the system degradation behavior.

The data-driven approaches rely on the assumption that the statistical

characteristics of data are relatively consistent unless a malfunctioning event

occurs in the system. They are built based on historical records and produce

prediction output based on condition monitoring (CM) data. The data-driven

approaches are based on time series analysis techniques and machine-learning

techniques for prognostics. In the following subsections, the data-driven



approaches are classified into two approaches: Time Series Analysis approaches

and Artificial Intelligence (AI) approaches.

2.5.1.1 Time Series Analysis Approaches

If sufficient amounts of time-dependent data are available, time series

analysis techniques are often used to determine the state or functional

value of the systems, at a future point in time. These techniques rely

heavily on past data to predict future performance. In some cases where

multiple data sets exist, it is feasible to use statistical techniques or

simulation which can provide an estimate of the failure time distribution.

The following subsection describes some common time series analysis

techniques and reviews their applications in the literature.

Regression Techniques

This sub-section provides an overview of the concepts and techniques

associated with regression analysis. Regression analysis uses the existing

data and determines the relationships, if any, between the measurable

outcome and the variables contributing to that outcome (e.g. life

expectancy is the outcome and exercise and diet are the variables

contributing to that outcome). Neter et al. [84] presented the framework

on the statistical relation for the prediction of machine remnant life. A

general linear regression model is given by,

$$Y_i = \beta_0 + \beta_1 X_{i,1} + \beta_2 X_{i,2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i, \qquad i = 1, \ldots, n \qquad (2.39)$$

where $Y_i$ is a random variable denoting the value of the ith trial's

response, $\beta_0$, $\beta_1$, …, $\beta_{p-1}$ are estimated parameters, $X_{i,1}$, $X_{i,2}$, …,

$X_{i,p-1}$ are the values of the predictor, or contributing variables and $\varepsilon_i$ is

the random error with mean = 0, variance = $\sigma^2$, and covariance = 0.

Regression analysis seeks to estimate the parameters of the regression

function, $\beta_0$, $\beta_1$, …, $\beta_{p-1}$, in order to find a representative model by

using the method of least squares. The method of least squares defines a

variable $Q$, where



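$$Q = \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 X_{i,1} - \beta_2 X_{i,2} - \cdots - \beta_{p-1} X_{i,p-1} \right)^2 \qquad (2.40)$$

and attempts to find estimates for $\beta_0$, $\beta_1$, …, $\beta_{p-1}$, denoted by $b_0$,

$b_1$, …, $b_{p-1}$, which minimize Q for the observations $(X_1, Y_1)$, $(X_2, Y_2)$,

…, $(X_n, Y_n)$. The simultaneous solution to the equations formed by

taking the derivative of $Q$ with respect to $\beta_0$, $\beta_1$, …, $\beta_{p-1}$, provides

the least squares estimates, $b_0$, $b_1$, …, $b_{p-1}$. Least squares estimates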

are desired because they are unbiased and have minimum variance

resulting in

$$\hat{Y}_i = b_0 + b_1 X_{i,1} + b_2 X_{i,2} + \cdots + b_{p-1} X_{i,p-1} \qquad (2.41)$$

The method of maximum likelihood can also be used to estimate $\beta_0$, $\beta_1$,

…, $\beta_{p-1}$ if the probability distribution of the error terms is known.
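As a worked step, the least squares conditions can be written compactly in matrix form. The matrix notation below is introduced here for illustration and is an assumption about notation rather than part of the source: $\mathbf{Y}$ is the vector of the $n$ responses, $\mathbf{X}$ is the $n \times p$ matrix whose first column is all ones and whose remaining columns hold the predictor values, $\boldsymbol\beta$ is the parameter vector and $\mathbf{b}$ its least squares estimate.

```latex
Q(\boldsymbol\beta) = (\mathbf{Y} - \mathbf{X}\boldsymbol\beta)^{\top}(\mathbf{Y} - \mathbf{X}\boldsymbol\beta),
\qquad
\frac{\partial Q}{\partial \boldsymbol\beta}
  = -2\,\mathbf{X}^{\top}(\mathbf{Y} - \mathbf{X}\boldsymbol\beta) = \mathbf{0}
\;\Longrightarrow\;
\mathbf{X}^{\top}\mathbf{X}\,\mathbf{b} = \mathbf{X}^{\top}\mathbf{Y},
\qquad
\mathbf{b} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}
```

The last expression requires $\mathbf{X}^{\top}\mathbf{X}$ to be invertible, which holds when the predictor columns are linearly independent.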

Li et al. [85] examined an adaptive prognostics approach where a future

bearing defect size was calculated at time $t + \Delta$ ($\Delta > 0$) given the bearing

running condition and defect size at time t. This adaptive algorithm, based

on a recursive least squares algorithm applied to a defect power law-based

propagation model, was then employed to account for the time-varying

behavior and used to predict future impending failures.
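A recursive least squares update of this kind can be sketched in a few lines of Python. The sketch assumes a power law of the form rate = C0 * size**n, which becomes linear in the unknowns after taking logarithms; this specific form, the synthetic data and all function and variable names are assumptions made for illustration and are not taken from [85].

```python
import math


def rls_update(theta, P, x, y, forgetting=1.0):
    """One recursive least squares step for y ~ x[0]*theta[0] + x[1]*theta[1]."""
    # P x (P is a symmetric 2x2 matrix stored as nested lists).
    Px = [P[0][0] * x[0] + P[0][1] * x[1],
          P[1][0] * x[0] + P[1][1] * x[1]]
    denom = forgetting + x[0] * Px[0] + x[1] * Px[1]
    gain = [Px[0] / denom, Px[1] / denom]
    error = y - (x[0] * theta[0] + x[1] * theta[1])
    theta = [theta[0] + gain[0] * error, theta[1] + gain[1] * error]
    # Covariance update: P <- (P - gain * (P x)^T) / forgetting.
    P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(2)]
         for i in range(2)]
    return theta, P


# Synthetic, noise-free data from the assumed power law rate = C0 * size**n.
TRUE_LOG_C0, TRUE_N = -2.0, 1.5
theta = [0.0, 0.0]                      # running estimates of [ln C0, n]
P = [[1e6, 0.0], [0.0, 1e6]]            # large initial uncertainty
for size in [1.0, 1.5, 2.0, 3.0, 4.0, 6.0, 8.0]:
    log_rate = TRUE_LOG_C0 + TRUE_N * math.log(size)
    theta, P = rls_update(theta, P, [1.0, math.log(size)], log_rate)

print(f"ln C0 ~ {theta[0]:.3f}, n ~ {theta[1]:.3f}")
```

Setting `forgetting` below 1 discounts older measurements, which is one common way to let the estimates follow time-varying behavior.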

Yan et al. [86] addressed a logistic regression model to calculate the

probability of a failure for given condition variables, and an autoregressive

moving average (ARMA) time series model to trend the condition

variables for failure prediction.

Neter et al. [84] provide much more detail on linear and nonlinear

regression models, as well as the dangers of extrapolating beyond the

observed data in their text. However, regression is not the only state

estimation method used for prognosis. Some of the methods are described

below.



Autoregressive Integrated Moving Average (ARIMA)

The autoregressive integrated moving average (ARIMA) time series model

is a common state estimation technique used in prognostics. It is also

known as trend analysis. The ARIMA model is a generic construct which

incorporates autoregressive processes, moving average processes, and a

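capability to account for non-stationary data. Given $\tilde{z}_t = z_t - \mu$, an AR process

of order $p$ is mathematically defined as

$$\tilde{z}_t = \phi_1 \tilde{z}_{t-1} + \phi_2 \tilde{z}_{t-2} + \cdots + \phi_p \tilde{z}_{t-p} + a_t \qquad (2.42)$$

which is similar to Equation (2.39) and can be rewritten as

$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p)\,\tilde{z}_t = a_t \qquad (2.43)$$

where $\mu$ is the mean of the time series data; $z_t$, $z_{t-1}$, …, $z_{t-p}$ are time

ordered observations, $\phi_1$, $\phi_2$, …, $\phi_p$ are the unknown parameters of an

autoregressive process; $a_t$ is the white noise and $B$ is the backshift

operator ($B z_t = z_{t-1}$) respectively. Observed data provides estimates

for $\mu$, $\phi_1$, $\phi_2$, …, $\phi_p$. A moving average process of order $q$ is defined as

$$\tilde{z}_t = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q} \qquad (2.44)$$

and can be rewritten as

$$\tilde{z}_t = (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q)\,a_t \qquad (2.45)$$

where the observed data provides estimates for $\mu$, $\theta_1$, $\theta_2$, …, $\theta_q$.

A non-stationary model must be transformed into a stationary model

before the autoregressive and/or moving average techniques are applied.

This transformation normally occurs through differencing, $\nabla z_t = z_t - z_{t-1}$, but

can also be accomplished by taking the logarithm of the time series data.

Therefore, a complete ARIMA model of order ($p$; $d$; $q$) is mathematically

defined as

$$(1 - \phi_1 B - \cdots - \phi_p B^p)(1 - B)^d z_t = (1 - \theta_1 B - \cdots - \theta_q B^q)\,a_t \qquad (2.46)$$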



This model can describe both stationary and non-stationary time series

but requires a significant amount of data to estimate $\mu$, $\phi_1$, $\phi_2$, …, $\phi_p$ and

$\theta_1$, $\theta_2$, …, $\theta_q$.
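To make the differencing and autoregressive steps concrete, the following Python sketch fits the simplest such model: a first difference followed by an AR(1) term, with no moving average part and a zero mean assumed for the differenced series. The data series and function names are hypothetical, and the estimator is plain least squares rather than the estimation procedure of any particular library.

```python
def difference(series):
    """First difference: w_t = z_t - z_(t-1)."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(w):
    """Least-squares estimate of phi in w_t = phi * w_(t-1) + a_t (zero mean assumed)."""
    num = sum(cur * prev for prev, cur in zip(w, w[1:]))
    den = sum(prev * prev for prev in w[:-1])
    return num / den


def forecast_next(series):
    """One-step forecast: predict the next difference, then integrate it back."""
    w = difference(series)
    phi = fit_ar1(w)
    return series[-1] + phi * w[-1], phi


# Hypothetical non-stationary indicator whose increments halve at each step.
z = [10.0, 12.0, 13.0, 13.5, 13.75, 13.875]
next_z, phi = forecast_next(z)
print(f"phi = {phi:.3f}, forecast = {next_z:.4f}")
```

Because the increments of this toy series shrink by exactly one half each step, the fitted coefficient is 0.5 and the forecast continues the same pattern; real condition monitoring data would of course include noise.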

Jardim-Goncalves et al. [87] used ARIMA models to predict when

computerized numeric control (CNC) lathe and mill machines would fail.

These machines were monitored with sound, vibration, and power

consumption sensors in real time and the authors were able to forecast

whether the machines required maintenance in future time periods given

acceptable ranges on the monitored parameters.

Patankar and Ray [88] examined the fatigue crack growth prediction

problem with a forecasting model in ductile alloys under variable-

amplitude loading. The developed forecasting model was shown to be

adequate for real time applications such as health monitoring and life

extending control.

Wang [89] used an autoregressive (AR) process to model vibration

signals for prognostics. The health condition of the gear is diagnosed by

characterizing the error signal between the filtered and unfiltered signals

using both numerical simulation and experimental data. However, the AR

parameters (polynomial coefficients) have no physical meaning related to

the monitored system. Zhang [90] proposed a parameter estimation

approach for a nonlinear model using temperature measurements of gas

turbines. The on-line detection procedure presented in his work can track

small variations in the parameters to provide early warning.

Lu and Meeker [91] reviewed nonlinear regression models and formed a

two-stage method to estimate the model parameters. Stage 1 parameter

estimates were obtained from each degradation path. These estimates were

then transformed, if necessary, to ensure the parameter estimates came

from a multivariate normal distribution. All Stage 1 estimates were then

combined to determine estimates of the mean, variance, and covariance

which were then utilized to find the lifetime distribution.



Chan and Meeker [92] incorporated time series modeling to estimate the

degradation probability distribution for solar reflector material at a given

point in time and the lifetime probability distribution. The degradation was

modeled with an autoregressive (AR) process using predicted daily

degradation based on data recorded from previous years. In their work, the

Monte Carlo simulation provided numerous sample paths which were used

to form empirical distribution functions for the degradation and lifetime

distributions.
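The idea of turning simulated sample paths into an empirical lifetime distribution can be sketched as follows. The degradation model here (a per-step increment that fluctuates as an AR(1) process around a constant drift), the failure threshold and all parameter values are assumptions for illustration and are not those used in [92].

```python
import random


def simulate_lifetimes(n_paths, threshold, drift, phi, sigma, seed=0, max_steps=10_000):
    """First-passage times of simulated degradation paths to a failure threshold.

    Paths that do not reach the threshold within max_steps are left out.
    """
    rng = random.Random(seed)
    lifetimes = []
    for _ in range(n_paths):
        level, increment = 0.0, drift
        for step in range(1, max_steps + 1):
            # AR(1) fluctuation of the per-step increment around its mean `drift`.
            increment = drift + phi * (increment - drift) + rng.gauss(0.0, sigma)
            level += increment
            if level >= threshold:
                lifetimes.append(step)
                break
    return lifetimes


def empirical_cdf(samples, t):
    """Fraction of simulated lifetimes that are <= t."""
    return sum(1 for s in samples if s <= t) / len(samples)


lifetimes = simulate_lifetimes(n_paths=500, threshold=10.0, drift=0.1, phi=0.5, sigma=0.05)
for t in (90, 100, 110):
    print(f"P(lifetime <= {t}) ~ {empirical_cdf(lifetimes, t):.2f}")
```

With a drift of 0.1 per step and a threshold of 10, the simulated lifetimes cluster around 100 steps, and the spread of the empirical distribution reflects the assumed noise level.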

2.5.1.2 Artificial Intelligence (AI) Approaches

Artificial neural networks (ANNs), genetic algorithms, fuzzy logic and

other learning techniques constitute a class of approaches known as artificial

intelligence (AI) approaches. These techniques have the ability to learn

using past history and subsequently attempt to predict the state or outcome

given a new set of input data. Hence, these techniques are the most

frequently used in current prognostic procedures. Some AI techniques

and their applications for prognostics are reviewed in this section.

One of the most popular machine-learning approaches for prognostics is

to use ANNs to model the system. ANNs are a type of (typically non-

linear) model that establishes a set of interconnected functional

relationships between input stimuli and desired output where the

parameters of the functional relationship need to be adjusted for optimal

performance.

ANNs work in a manner similar to actual neurons found in the human brain. Each

neuron has dendrites which are the input paths, a soma which processes

the inputs and an axon which is the output path. An ANN is formed by a

collection of these artificial neurons. An ANN architecture of three layers is

illustrated in Figure 2.4. This network normally has an input layer, one or

more hidden layers and an output layer. Each layer, with the exception of

the input layer, consists of a number of neurons. Weights are designated

for each input from the input layer to neurons in the hidden layer. Given

the transfer function results from the hidden layer, another set of weights is

applied as inputs to the output layer. The final transfer function results

from the output layer form the overall outputs of the ANN.

Figure 2.4 Illustration of artificial neural networks architecture

ANN techniques in remaining useful life estimation have been used by

many researchers. Li and Ray [93] examined the utility of using back-propagation

neural networks to predict fatigue damage. Their methodology

was used to decrease the computational time required for the conventional

method of numerically solving nonlinear differential equations. In addition,

the authors stated that this neural network could possibly be used for real

time analysis of fatigue damage models and other types of failure models

in addition to predicting remaining service life.

Shao and Nezu [94] proposed a progression based prediction of

remaining life (PPRL) which used a compound model of neural networks

by combining linear and non-linear techniques. It was shown that the

model simplifies the prediction process and has better accuracy than that of

ARMA models. It can be used to determine the upper and lower bounds of

remaining bearing life.

A dynamic wavelet neural network (DWNN) was implemented to

transform sensor data to the time evolution of a fault pattern for predicting

the remaining useful life time of a bearing [95]. The DWNN model was

first trained using vibration data of defective bearings with varying depth



and width of cracks, and was then used to predict the crack evolution until

final failure.
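The layered computation described earlier for a three-layer ANN (weights applied to the inputs, a transfer function in the hidden layer, and a second set of weights feeding the output layer) can be sketched in plain Python. The network size, the sigmoid transfer function, and the weight and bias values are arbitrary assumptions for illustration; training (for example by back-propagation) is not shown.

```python
import math


def sigmoid(x):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights, biases):
    """Weighted sum of the inputs per neuron, passed through the transfer function."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    hidden = layer_output(inputs, hidden_w, hidden_b)   # input -> hidden
    return layer_output(hidden, output_w, output_b)     # hidden -> output


# Hypothetical 3-input, 2-hidden-neuron, 1-output network with arbitrary weights.
HIDDEN_W = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
HIDDEN_B = [0.0, 0.1]
OUTPUT_W = [[1.2, -0.7]]
OUTPUT_B = [0.05]

features = [0.6, 0.1, 0.9]   # e.g. normalised condition indicators
output = forward(features, HIDDEN_W, HIDDEN_B, OUTPUT_W, OUTPUT_B)
print(output)
```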

Gebraeel et al. [96] developed two classes of neural network models for

estimating bearing failure time during its service life. They used both

single-bearing and clustered-bearing models. Each class was modeled

using three different weight calculation techniques: weight application to

failure times (WAFT), weight application to exponential parameters

(WAEP), and weight application to exponential parameters—parameter

updating (WAEP-UP). The results showed that 92% of the failure time

predictions computed using validation bearings were within 20% of the

actual bearing life.

Huang et al. [97] believe that for highly-accelerated single-row bearings,

it is neither pragmatic nor useful to model the prediction process on the

whole life of a bearing due to the high dispersion of bearing life. They

integrated the extraction of degradation indicator-based self-organizing

map (SOM) with back propagation neural network (BPNN) based residual

life prediction. The degradation indicator is produced by using three time

and three frequency features and the NN undergo unsupervised learning.

Wang and Vachtsevanos [98] proposed a multi-input multi-output

dynamic wavelet neural network (DWNN) which incorporates temporal

information and storage capacity into its functionality so that it can predict

into the future, carrying out fault classification and prognostic tasks. A

wavelet neural network (WNN) works as the virtual sensor and DWNN

functions as a predictor. A dynamic or recurrent neural network allows

signals to flow backwards in a feedback sense. Combining reinforcement

learning with genetic algorithm (GA) allows the algorithm to interact with

the environment to improve its performance and to update only when

necessary.

Recent work on forecasting of nonlinear, non-stationary and non-

Gaussian-type time series also indicates that recurrent neural networks

(RNNs) have a better forecasting performance than other well-known



algorithms such as the feed-forward neural networks (FFNNs) [99, 100].

Dong et al. [101] stated that the frequently used fuzzy-based methods in

equipment criticality analysis are very fussy and not accurate, and require

much subjective data. In their paper, a grey model and a back propagation

neural network (BPNN) were applied to the feed water pump subsystem

and the results show that the method is feasible and effective for

application in power plants.

Ramesh et al. [102] presented a hybrid SVM-Bayesian Network (BN)

for predicting thermal error in machine tools. In their research, SVM-BN

was first developed to classify all the errors into groups depending on the

operating conditions and then performed a mapping of the temperature

profile with the measured error. This concept leads to a more generalized

prediction model than the conventional method of directly mapping error

and temperature irrespective of condition. Such a model is especially

useful in production environments where the machine tools are subjected to a

variety of operating conditions.

Another popular AI technique that is used for prognostics is the fuzzy

logic technique. Fuzzy logic provides a language (with syntax and local

semantics) into which one can translate qualitative knowledge about the

problem to be solved. In particular, fuzzy logic allows the use of linguistic

variables to model dynamic systems. These variables take fuzzy values

that are characterized by a sentence and a membership function. The

meaning of a linguistic variable may be interpreted as an elastic constraint

on its value. These constraints are propagated by fuzzy inference

operations. The resulting reasoning mechanism has powerful interpolation

properties that in turn give fuzzy logic a remarkable robustness with

respect to variations in the system's parameters and disturbances.
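A minimal Python sketch of these ideas is given below: a linguistic variable takes fuzzy values defined by membership functions, and a simple weighted-average inference interpolates between rule outputs. The variable ("vibration level"), the triangular membership shapes, their breakpoints and the severity values attached to each term are all hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with value 1 at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy values of the linguistic variable "vibration level" (mm/s).
TERMS = {
    "low": (-1.0, 0.0, 5.0),
    "medium": (2.0, 5.0, 8.0),
    "high": (5.0, 10.0, 11.0),
}
# Hypothetical rule consequents: degradation severity on a 0..1 scale.
SEVERITY = {"low": 0.0, "medium": 0.5, "high": 1.0}


def fuzzify(x):
    """Degree of membership of x in each fuzzy value."""
    return {name: triangular(x, *abc) for name, abc in TERMS.items()}


def severity(x):
    """Weighted average of rule consequents (a simple interpolating inference)."""
    degrees = fuzzify(x)
    total = sum(degrees.values())
    if total == 0.0:
        raise ValueError("x lies outside the support of every fuzzy value")
    return sum(degrees[name] * SEVERITY[name] for name in degrees) / total


print(fuzzify(6.5))
print(severity(6.5))
```

A reading of 6.5 belongs partly to "medium" and partly to "high", so the inferred severity falls between the two rule outputs; this smooth blending is the interpolation property referred to above.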

When applied to prognostics, fuzzy logic is typically applied in

conjunction with a machine learning method and is used to deal with some

of the uncertainties that all prognostics estimates face. Byington et al. [103,

104] employed a fuzzy logic technique in order to produce an accurate,



reliable assessment of system health. For the development of automatic

health state estimation, they used fuzzy logic to represent the degrees of

severity or degradation.

The most prominent paper reviewed is a study on a neuro-fuzzy (NF) prognostic system

presented by Wang et al. [105]. Both RNNs and NF

techniques were evaluated using sunspot benchmark and on-line gear test

data. In the sunspot testing, NF without interpolation is less accurate than

RNNs. But NF with interpolation produces more accurate results than

RNNs and provides similar results with about ten percent (10%) of the

training epochs. As far as the on-line gear test is concerned, NF captures

the system dynamic behaviour faster and is shown to be superior to

RNNs.

Another machine-learning approach uses anomaly detection algorithms

(also known as novelty detection or outlier detection algorithms). These

algorithms learn a model of the nominal behavior of the system and then

notice when new sensor data fail to match the model, indicating an

anomaly that could be a failure precursor [106, 107].
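In its simplest form, such a detector can be sketched as a model of nominal behaviour (here just a mean and a standard deviation) plus a deviation threshold. The sensor readings, the three-sigma rule and the function names below are assumptions for illustration; the algorithms in [106, 107] are more elaborate.

```python
import statistics


def fit_nominal(samples):
    """Learn a simple model of nominal behaviour: mean and standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)


def is_anomaly(value, model, n_sigma=3.0):
    """Flag a reading that lies more than n_sigma standard deviations from the mean."""
    mean, std = model
    return abs(value - mean) > n_sigma * std


# Hypothetical readings from a healthy machine (arbitrary units).
nominal = [1.00, 1.10, 0.90, 1.05, 0.95, 1.00, 1.02, 0.98]
model = fit_nominal(nominal)

for reading in (1.03, 2.00):
    print(reading, "anomaly" if is_anomaly(reading, model) else "nominal")
```

Note that highly variable training data inflates the standard deviation and therefore the threshold, which makes small deviations harder to detect.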

Dechamp et al. [108] presented an overview of the integration of

artificial intelligence tools in the PROTEUS European project in e-

maintenance. A generic AI template for specifying how AI tools can be

integrated into the platform was proposed. The paper presented several

examples of AI tools in diagnosis and prognosis and concluded that there is a

need for a “meta-tool” that can fit itself into the generic template.

The strength of data-driven techniques is their ability to transform high-

dimensional noisy data into lower dimensional information for

diagnostic/prognostic decisions. The main drawback of data-driven

approaches is that their efficacy is highly dependent on the quantity and

quality of system operational data.



2.5.2 Model-based Approaches for Prognostics

The model-based methods require an accurate mathematical model to be

developed and use residuals as features, where residuals are the outcomes of

consistency checks between the sensed measurements of a real system and the

outputs of a mathematical model. Statistical techniques are normally used to

define thresholds to detect the presence of faults. The model-based approach is

applicable in situations where accurate mathematical models can be constructed

from first principles.

Cempel et al. [109] showed that symptom models used in vibration condition

monitoring for condition recognition and prediction can in most cases be limited

to Weibull and Fréchet models.

A discrete-time, finite-state shock model can be employed for the purpose of

modeling cumulative damage to an individual component. In this basic form,

such models provide a means to compute the cumulative distribution function of

the random time required to reach a failure state. The failure state in the shock

model corresponds to a pre-specified level of cumulative damage which is

assumed to be a monotonically increasing function of time. Gottlieb [110]

provided conditions on the damage process that proves the device's lifetime

distribution. Shanthikumar and Sumita [111] analyzed a system whose failure

was caused by the occurrence of a shock greater than some pre-specified level.

Associated with their shock model was a correlated pair $(X_n; Y_n)$ of renewal

sequences with joint distribution function:

$$F(x, y) = P[X_n \le x,\ Y_n \le y], \qquad n = 0, 1, 2, \ldots \qquad (2.47)$$
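For the basic discrete-time, finite-state cumulative damage model described above, the cumulative distribution function of the time to reach the failure state can be computed by repeatedly propagating the state distribution through a transition matrix. The sketch below assumes a three-state chain (two damage levels and an absorbing failure state) with made-up transition probabilities; it does not implement the correlated renewal model of [111].

```python
def failure_time_cdf(transition, n_steps):
    """P(failure state reached by step k) for k = 1..n_steps.

    The chain starts in state 0 and the last state is the absorbing failure state.
    """
    n = len(transition)
    dist = [1.0] + [0.0] * (n - 1)
    cdf = []
    for _ in range(n_steps):
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
        cdf.append(dist[-1])
    return cdf


# Hypothetical damage chain: damage never decreases (upper-triangular matrix).
DAMAGE_TRANSITIONS = [
    [0.9, 0.1, 0.0],   # undamaged -> undamaged / damaged
    [0.0, 0.8, 0.2],   # damaged   -> damaged / failed
    [0.0, 0.0, 1.0],   # failed is absorbing
]

cdf = failure_time_cdf(DAMAGE_TRANSITIONS, 50)
for k in (2, 10, 20, 50):
    print(f"P(T <= {k}) = {cdf[k - 1]:.3f}")
```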

Li et al. [21] presented an adaptive prognostics system to estimate bearing

defect size growth using an adaptive algorithm based on recursive least square

(RLS). In their study, it was shown that due to the lack of parameter fine tuning,

a small parameter difference can result in a large prediction error as the bearing

cycles increase. Qiu et al. [112] commented that bearing lifetime can be

evaluated and predicted effectively by monitoring the changes in the system



dynamic stiffness based on real-time vibration measurements.

Adams [113] modeled damage accumulation in a structural dynamic system as

first/second order nonlinear differential equations. Chelidze [114] modeled

degradation as a "slow-time" process, which is coupled with a "fast-time"

observable subsystem. The model was used to track battery degradation (voltage)

of a vibrating beam system.

Banjevic and Jardine [115] used a discrete Markov process to represent the

failure process, along with the covariate process for computing the remaining

useful life as a function of the current conditions. Chinnam and Baruah [116]

demonstrated the ability of hidden Markov model (HMM) based clustering

methods in autonomous diagnostics and prognostics. The prognostic model is

driven by a multivariate distribution of the state transition points generated by

HMMs.

Kalman filtering (KF) can also be used as a prognosis technique by estimating

some state value at a future point in time. KF incorporates the signal embedded

with noise and forms what can be considered a sequential minimum mean square

error estimator (MMSE) of the signal. Swanson [117] proposed the use of KF to

track the dynamics of the mode frequency of vibration signals in a tensioned steel

band with a seeded crack growth.
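The predict and update cycle of a Kalman filter can be shown with a scalar example that tracks a slowly varying quantity, such as a mode frequency, from noisy measurements. The random-walk state model, the noise variances and the synthetic measurements are assumptions chosen for illustration and are not those of [117].

```python
def kalman_track(measurements, process_var, measurement_var, x0, p0):
    """Scalar Kalman filter with a random-walk state model x_k = x_(k-1) + w_k."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state estimate is unchanged, its uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical measurements: true value 5.0 with alternating +/-0.2 disturbance.
measurements = [5.0 + (0.2 if k % 2 else -0.2) for k in range(50)]
estimates = kalman_track(measurements, process_var=0.01, measurement_var=1.0,
                         x0=0.0, p0=1.0)
print(f"first estimate {estimates[0]:.2f}, last estimate {estimates[-1]:.2f}")
```

The ratio of `process_var` to `measurement_var` controls how quickly the filter follows changes: a larger ratio tracks faster but passes more measurement noise into the estimate.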

A nonlinear stochastic model of fatigue crack dynamics for real-time

computation of time-dependent damage rate in mechanical structures has been

proposed by Ray [118]. This model configuration allows the construction of a

filter for damage state estimation and remaining service life prediction based on

an extended KF principle instead of solving the Kolmogorov forward equation.

In a later paper, these authors [119] also examined fatigue crack growth

prediction using Gauss-Markov processes which did not require solution of the

extended KF equation. However, validation of the model was limited to

experimentally generated statistical data.

The main advantage of the model-based approach is the ability to incorporate

physical understanding of the system into the model. Another advantage is that, in



many situations, the feature vectors are closely related to model parameters [114].

Furthermore, it can also establish a functional mapping between the drifting

parameters and the selected prognostic features. If knowledge of the system

degradation is available, the model can be adapted to increase its accuracy and to

address subtle performance problems. Consequently, it can significantly

outperform data-driven approaches. However, the model-based approach may not be the most

practical one, since the fault type in question is often unique, varies from

component to component and is hard to identify without interrupting the

operation.

For the most part, these analytical models provide little advantage in the area

of numerical implementation. If examples are provided, they normally assume

specific parameter values. Thus, there is no specified manner to incorporate

degradation data into these analytical models. Machine prognostics therefore needs to

focus on methods to obtain the remaining lifetime distribution from

reliability theory.

2.5.3 Reliability-Based Approaches for Prognostics

Reliability engineers rely heavily on statistics, probability theory and

reliability theory. Many engineering techniques are used in reliability engineering,

such as reliability prediction, Weibull analysis, thermal management, reliability

testing and accelerated life testing.

The conventional reliability-based approaches for prognostics can be divided

into two categories: failure-based and degradation-based. Failure-based reliability

is used to estimate the lifetime distribution and its parameters when sufficient,

complete and/or censored failure time data exists. If prior knowledge of the

lifetime distribution exists for similar components, then often the lifetime

distribution is assumed to follow the same distribution of a similar component.

Compared to failure-based reliability, degradation-based reliability focuses on

using measures of component degradation, not failure data, to assess the

remaining lifetime of a component. Degradation is also known as cumulative



damage. Chao [120] provided an excellent review of degradation topics which

included four sets of degradation data, the methodology used to determine shelf

lives, study of growth curve, sigmoid, degradation data collection and methods to

model the degradation process.

Proportional hazards models (PHMs) are commonly used in failure prediction

and reliability analysis. PHMs assume that hazard changes proportionately with

covariates and that the proportionality constant is the same at all times. Kumar

and Westberg [121] proposed a reliability-based approach for estimating the

optimal maintenance policy to minimise the total maintenance cost per unit time.

They used a PHM to identify the importance of monitored variables and a

total time on test plot to find the optimal policy.
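The proportionality assumption of a PHM can be illustrated with a short Python sketch in which the hazard is a baseline hazard multiplied by an exponential function of the covariates. The Weibull baseline, its shape and scale, the covariates and their coefficients are all hypothetical values chosen for illustration.

```python
import math


def weibull_baseline_hazard(t, shape, scale):
    """Baseline hazard h0(t) of a Weibull distribution (an assumed baseline)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def phm_hazard(t, covariates, coefficients, shape=2.0, scale=1000.0):
    """Hazard h(t | z) = h0(t) * exp(sum of coefficient_k * z_k)."""
    risk_score = sum(g * z for g, z in zip(coefficients, covariates))
    return weibull_baseline_hazard(t, shape, scale) * math.exp(risk_score)


# Hypothetical covariates: [vibration level, oil debris level], made-up coefficients.
GAMMA = [0.8, 0.3]
healthy, degraded = [0.2, 0.1], [1.0, 0.5]

for t in (100.0, 500.0):
    ratio = phm_hazard(t, degraded, GAMMA) / phm_hazard(t, healthy, GAMMA)
    print(f"t = {t:.0f}: hazard ratio = {ratio:.3f}")
```

The hazard ratio between the two covariate settings is the same at both times, because the baseline hazard cancels; this is the sense in which the proportionality constant does not change with time.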

Gasmi et al. [122] used a proportional hazards framework to model complex

repairable systems. Tallian [123] presented a rolling bearing life prediction model

using statistical lifetime determination. Statistical models such as proportional

intensity models (PIMs) and PHM are useful tools for remaining useful life

estimation and for trending of the fault propagation process [124]. Vlok and

Claasen [125] utilised statistical residual life estimate (RLE) on roller bearings to

study changes in diagnostic measurements of vibration and lubrication levels

which can influence bearing life. RLEs are based on PIMs and mainly used for

non-repairable systems utilising historic failure data and the corresponding

diagnostic measurements. Banjevic and Jardine [126] discussed RUL estimation

for a Markov failure time process which includes a joint model of PHM and

Markov property for the covariate evolution as a special case.



2.6 Remaining Challenges of Prognostics for Real Industry Application

This Chapter has reviewed current technologies in fault diagnostics and

prognostics of engineering systems. Although diagnostics of machine faults has been well

developed over the past decades, prognostics still faces many challenges. Diagnostics

involves the investigation and analysis of the cause or nature of a condition, whereas

prognostics calculates or predicts the future as a result of rational study and analysis

of available pertinent data.

Effective machine fault prognostic technologies can lead to elimination of

unscheduled downtime and increase machine useful life and consequently lead to

reduction of maintenance costs, as well as prevention of human casualties. To

establish a practical prognostic model which can effectively estimate failure times,

the literature review on machine fault prognostics indicates the following research

challenges in real industry application.

Accurate long-term prediction of machine remnant life

Long-term prediction of a fault evolution that may result in a failure requires a

tool to manage the inherent uncertainty. Depending on the criticality of the system or

subsystem being monitored, various levels of data, models and historical information

are required to develop and implement the desired prognostic model. Many

accomplishments have been reported but major challenges for long-term prediction

of RUL still remain to be addressed. In order to provide long-term and accurate

forecasting, an integrated prognostics system which includes full utilization of

system degradation data, a well established failure model and event history has the

potential for practical application in industry.

Data-driven approaches rely on the availability of run-to-failure data and require

a suitable extrapolation of the damage progression to estimate RUL.

This approach is more closely aligned with engineering reasoning, but it requires the

definition of both damage and a failure criterion, which is often very difficult to

establish [127]. Indeed, uncertainty representation and management are at the core of



performing successful prognostics. Long-term prediction of the time to failure entails

large amounts of uncertainty that must be represented effectively and managed

efficiently. For example, as more information about past damage propagation and

about future use become available, means must be devised to narrow the uncertainty

bounds. Therefore, the development of degradation models, prior failure pattern

analysis and historical knowledge of faults are essential for accurate long-term

prediction of machine remnant life.

Most of the time series analysis approaches only provide short-term predictive

capabilities by training only recent historical data and do not consider different

health states that can effectively represent the entire degradation process. Kwan et al.

[128] stated that failure processes of mechanical systems usually consist of a series

of degraded states. This physical degradation process is a common phenomenon in

practice and can be used to estimate machine health states in a real environment.

Accurate and precise prognosis demands good probabilistic models of failure

degradation and requires statistically sufficient samples of failure data to assist in

training, validating and fine tuning the prognostic algorithms.

Sufficient usage of effective features to represent machine degradation

The machine degradation process is dynamic and stochastic, and usually consists

of a series of degraded states related to the physical condition change of a machine.

To represent this complex nature of machine degradation effectively, an accurate

prognostics model requires a number of damage sensitive features.

Existing time series and regression model approaches are still unable to use

sufficient features to represent the complex nature of the degradation

process in a real environment. Instead, these models can use only one or a

limited number of features to represent the failure process for the prediction of machine

remnant life.

Compared to the progress of signal processing technology and feature extraction

techniques in intelligent machine fault diagnostics, most fault prognostics models are

still not able to use fault sensitive features for the interpretation of machine

degradation process. In order to narrow the uncertainty bounds in prognostics, it is a



significant challenge to design prognostics models so that various effective features

from measured data can be verified and used in conjunction with physical condition

change to estimate the current and future machine health states.

Generic and scalable prognostic model for practical application

Currently many prognostic techniques have been reported but they are strictly

limited to individual applications. In other words, current prognostic methods only

consider the degradation of specific components such as bearings, motors or gears.

In addition, most of the developed prognostics models are only applicable in the

laboratory environment and have yet to be validated in industrial applications

because the inherent complexity of real-life machines hinders the practical

application of many prognostics models. Furthermore, insufficient historical failure

data can be an obstacle to the industrial validation of many

prognostics models.

Systematic incorporation of diagnostic information and historical knowledge

for accurate prediction

The prognostics process which combines effective feature extraction and fault

diagnostics to obtain the best possible prediction on the RUL still has many

remaining challenges. In a real environment, machine failures are not monotonous

processes; they are normally associated with multiple phenomena from other

component or system failures, depending on designed systems. Therefore, accurate

RUL prediction capability requires advanced sensors, damage sensitive features,

incipient fault detection and isolation techniques for adequate prognostic state

awareness.

Vachtsevanos et al. [129] suggest that the desire for accurate prognostics has

evolved from an increase in diagnostic capability. They strongly emphasize that

diagnostics is a prerequisite for accurate prognostics, declaring that “the task of a

prognostic module is to monitor and track the time evolution (growth) of the fault”.

This implies that the fault characteristics must have been identified prior to

attempting prognostics.



In particular, prior diagnostic information about an imminent fault before the

prognostic process can be used to minimise the uncertainty in interpretation of

machine degradation.

In order to assess current machine performance, a significant amount of past

knowledge of the assessed machine is required because the corresponding failure

modes must be known in advance and well-described [10]. The historical CM data

and event data include significant diagnostic information and experience about

machine failure and health states by continuously monitoring and analysing machine

condition in industry. However, well-understood systematic methodologies and

supporting systems for managing this historical data and knowledge in

conjunction with machine fault diagnostics and prognostics remain an

outstanding challenge.

Several integrated frameworks for diagnostics and prognostics have been

addressed in recent literature [130-133]. However, none of this literature

considers management of historical knowledge in an integrated diagnostic and

prognostics system, although they use historical data and empirical knowledge in

model training for assessment of fault conditions and degradation states. Therefore,

development of methods or tools for processing and interpretation of knowledge

based information which can be fused and used in conjunction with integrated

diagnostics and prognostics system is a significant challenge in machine remnant life

prediction.

Chapter 3 describes a novel integration of historical knowledge with diagnostics

and prognostics model which overcomes the remaining challenges in machine fault

prognostics identified above.



CHAPTER 3 MACHINE PROGNOSTICS BASED ON HEALTH STATE PROBABILITY ESTIMATION

This chapter presents a novel approach to address the identified current

prognostics challenges derived from the literature review on machine diagnostics and

prognostics in the previous chapter. Section 3.1 introduces the proposed generic

health management platform, called the integrated diagnostics and prognostics model

based on health state probability estimation. The elements of the proposed system,

historical knowledge, diagnostics and prognostics are described in the remaining

sections respectively. In particular, the methodology of the health state probability

estimation and remnant life prediction is introduced at the end of this chapter.

3.1 Closed Loop Architecture for Integrating Diagnostics and Prognostics System with Embedded Historical Knowledge

The proposed model of the closed loop architecture consists of an integrated

diagnostics and prognostics system based on health state probability estimation with

embedded historical knowledge.

Figure 3.1 Closed loop prognostic system.

Figure 3.1 illustrates the conceptual integration of diagnostics and prognostics

with embedded historical knowledge. To obtain the best possible prediction on the

machine remnant life, the proposed prognostics model is integrated with fault

diagnostics and empirical historical knowledge. Li et al. [21] suggested that a reliable

diagnostic model is essential for the overall performance of a prognostics system. To

provide long range prediction, this model allows for integration with diagnostics as

remnant life prediction requires good diagnostic information before progressing to

prognostics. The outcome of the diagnostics module provides reliable information for

the estimation of machine health state and system update by employing precise

information of failure pattern of the impending fault. Therefore, by using an

integrated system of diagnostics and prognostics, knowledge of pre-determined

dominant fault obtained in the diagnostic module can be used to improve the

accuracy of prognostics in predicting the remnant life.

In the proposed model, prior historical knowledge is embedded in the integrated

diagnostics and prognostics system for the classification of impending faults in the

machine system and for precise estimation of health state probability. The historical

knowledge includes prior information of faults, failure patterns and machine

degradation process. An effective prognostics program requires performance

assessment, development of degradation models, failure analysis, health management

and prediction, feature extraction and historical knowledge of faults [132]. In general,

each machine system has its inherent characteristics that could be used to identify the

source of failure. Therefore, prior analysis of machine and knowledge of failure

pattern could lead to more accurate prediction of remnant life.



For accurate assessment of machine health state, a significant amount of past

knowledge of the assessed machine is also required because the corresponding

failure modes must be known in advance and well-described in order to assess

current machine performance [10]. In this model, through prior analysis of the

historical data and events, major failure patterns that affect the entire life of the

machine are identified for diagnostics and prognostics. The historical knowledge

provides the key information on diagnostics and prognostics of this system such as

empirical training data for the classification of impending faults and historical failure

patterns for the estimation of current health state. Furthermore, it could also be used

to determine appropriate signal processing techniques and feature extraction

techniques for effective diagnostics and prognostics.

Figure 3.2 Flowchart of the integration of historical knowledge, diagnostic system

and prognostics system based on health state probability estimation

Figure 3.2 presents the flowchart of the integration of historical knowledge,

diagnostic system and prognostics system based on health state probability

estimation. The proposed system consists of three sub-systems, namely, historical

knowledge, diagnostics and prognostics. The entire sequence includes condition



monitoring, classification of impending faults, health state estimation and

prognostics, and is performed by linking them to case-based historical knowledge.

Through prior analysis of historical data, the historical knowledge provides useful

information for the selection of suitable condition monitoring techniques, such as

sensor (data) type and signal processing techniques, which are dependent on machine

fault type. In the proposed model, the feature extraction and selection techniques in

the diagnostics module are linked with the historical knowledge.

The pre-determined discrete failure degradation of the machine located in the

historical knowledge module can be used to estimate the health state of the machine

which is located in the prognostics module. The final output of the prognostics

module of certain impending faults can also be accumulated to update the historical

knowledge. This accumulated historical knowledge can then be used for system

updating and improving of the prognostics model by providing reliable posterior

degradation features for a range of failure modes and fault types.

The details of these three modules, historical knowledge, diagnostics and

prognostics in this integrated system, are described in the following sections.

3.2 Historical Knowledge

In this model, historical knowledge is closely related to machine fault diagnostics

and prognostics as depicted in Figure 3.2. More specifically, prior analysis of

historical data and failure pattern in terms of historical knowledge provides key

references for fault isolation of a particular fault and health/degradation state

estimation. The historical knowledge provides useful information for effective

impending fault detection and isolation. For example, past fault historical data can be

used for intelligent fault classification performance by providing the training set of

historical faults in machine. This module provides the following three types of

diagnostic/prognostic information as shown by the three branches in Figure 3.2.



Analysis of Historical Data and Event: Provides past failure pattern information

for the selection of appropriate signal processing and feature extraction

techniques depending on fault types and degradations.

Main Faults: Given typical main fault data of the machine, it is possible to

determine the impending fault type that has occurred by providing the training

set for multi-classification of faults (i.e., fault detection and isolation).

Degradation Stages of Each Failure Pattern: Analysis of past condition

monitoring data provides qualitative understanding about the sequence of

discrete failure degradation stages of each failure pattern for the estimation of

the current health state of the machine.

3.3 Diagnostics

The diagnostics module follows the typical procedure of intelligent fault diagnosis

consisting of condition monitoring, signal processing, feature extraction and fault

classification. The conventional feature-based diagnostics framework is illustrated in

Fig. 3.3.

Figure 3.3 Conventional feature-based diagnostics framework

The data acquired from machines as raw data need pre-processing to condition the

data as well as possible for determining the emergent salient condition of the

machine. These data can be vibration signals, current and voltage signals, sound signals,

flux signals, etc. Depending on the designed machine, different pre-processing

procedures need to be used, such as filtering (high-pass, low-pass and band-pass), wavelet

transforms, averaging and enveloping.



In general, raw data acquired from sensors require signal processing to obtain

appropriate features. A range of features need to be calculated to cover the

preliminary impending faults of the machine. The features are calculated from

various domains, namely time, frequency, or time-frequency. In this way, the

information in the raw data is preserved as far as possible for the subsequent analysis.

Furthermore, the problem of data transfer and storage can be alleviated.
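As an illustration of feature calculation in the time domain, the following minimal sketch computes a few commonly used statistical features from a raw signal. The particular feature set (mean, RMS, peak, crest factor, kurtosis) and the function name are assumptions chosen for illustration; they are not the feature list used in this thesis.

```python
import math

def time_domain_features(signal):
    """Return a few illustrative time-domain features of a sampled signal."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    peak = max(abs(v) for v in signal)
    variance = sum((v - mean) ** 2 for v in signal) / n
    # Kurtosis as the normalised fourth central moment (not excess kurtosis).
    kurtosis = sum((v - mean) ** 4 for v in signal) / n / variance ** 2
    return {
        "mean": mean,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "kurtosis": kurtosis,
    }

# Synthetic test signal: four full periods of a unit sine wave.
wave = [math.sin(2 * math.pi * k / 64) for k in range(256)]
features = time_domain_features(wave)
```

For the pure sine wave the RMS is close to 1/√2 and the kurtosis close to 1.5; impulsive fault signatures typically raise both the crest factor and the kurtosis, which is why such features are often tracked.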

Extensive calculations of feature parameters in those domains may result in high

dimensionality of the data features. Not all of them will provide useful information

for condition analysis; sometimes some of them can even increase the difficulty of

analysis and degrade the accuracy. Therefore, reducing the dimension of data

features is necessary to remove the irrelevant and erroneous features.

Depending on the monitoring object, effective features which can significantly

represent a machine’s condition should be selected. Effective selection of features

can avoid the problem of dimensionality and high training error value which may

cause computer overload and over-fit of data training, known as the Feature

Selection Problem [134]. The goal of dimensionality reduction is to reduce high-

dimensional data samples into a low-dimensional space while preserving most of the

intrinsic information contained in the data set.

Once dimensionality reduction is carried out appropriately, compact

representation of the data for various succeeding tasks such as visualization and

classification can be utilised. An effective feature selection can lead to better

performance of the predictors, cost-effective predictors, and a better understanding of

the underlying process that generated the data [135]. Several feature selection issues

such as feature construction, feature ranking, multivariate feature selection and

feature validity assessment methods have been reviewed in recent literature [134,

135].
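Feature ranking, one of the selection issues mentioned above, can be sketched as follows. The example ranks features by a simple two-class separability score (squared difference of class means divided by the sum of class variances). This criterion, the function names and the sample values are assumptions for illustration only; the thesis does not state that this score is the one it uses.

```python
from statistics import mean, pvariance

def separability_score(values_a, values_b):
    """Squared mean difference over summed variances for one feature."""
    numerator = (mean(values_a) - mean(values_b)) ** 2
    denominator = pvariance(values_a) + pvariance(values_b)
    return numerator / denominator if denominator > 0 else float("inf")

def rank_features(samples_a, samples_b):
    """Rank feature indices from most to least separating.

    samples_a, samples_b: lists of feature vectors for two machine conditions.
    """
    n_features = len(samples_a[0])
    scored = []
    for k in range(n_features):
        column_a = [row[k] for row in samples_a]
        column_b = [row[k] for row in samples_b]
        scored.append((separability_score(column_a, column_b), k))
    return [k for _, k in sorted(scored, reverse=True)]

# Hypothetical data: feature 0 separates the conditions, feature 1 does not.
normal = [[0.10, 5.0], [0.12, 4.0], [0.11, 6.0]]
faulty = [[0.50, 5.5], [0.55, 4.5], [0.52, 5.0]]
ranking = rank_features(normal, faulty)
```

Keeping only the top-ranked indices reduces the dimensionality passed to the classifier, at the cost of ignoring interactions between features that a multivariate selection method would capture.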

The selected features are then forwarded to the fault classification system to

define the machine’s current condition. The fault classifiers are trained on predetermined

major fault data using multi-classification algorithms. Through this training on



major faults of the machine system, current impending faults can be isolated and

identified in the diagnostics module.

In the diagnostics module, intelligent fault diagnostics can be performed using a

range of classification algorithms, from pattern recognition techniques to AI

techniques. Five conventional classification algorithms, including ANNs, Linear

Regression (LR), Random Forest (RF) and SVMs, are reviewed and compared in

Chapter 4.

The output of the intelligent fault diagnostic module does not

provide any information on the severity of faults; it only identifies the

impending faults in the machine system. However, through this verification

(isolation) of impending faults in the diagnostics module, a more precise failure

pattern from a number of historical degradation data in historical knowledge module

can be employed in the prognostics module.

3.4 Health State Estimation and RUL Prediction

After identifying the impending faults in the diagnostic module, the discrete

failure degradation stages determined in the prior historical knowledge module are

employed in the health state estimation as depicted in Figure 3.2.

Traditional condition-based diagnostics and prognostics are based on

recognizing indications of failure in the behavior of the machine. If signatures

describing system behavior in the presence of a given fault are available from the

historical condition data, it is possible to evaluate current machine condition by

quantitative assessment between the newly arrived signatures and historical failure

behaviors [132, 136].

Figure 3.4 illustrates the traditional similarity-based technique for fault

diagnostics and prognostics. The figure shows two health states in machine

degradation. The most recent behavior covers the transients of normal and faulty

behaviors. This methodology can provide the level of degradation and forecast

specific faulty behavior for machine diagnostics and prognostics. This method only



considers two health states, namely Normal and Faulty conditions. However, in real-

life situations, machine faults normally go through various health states until final

failure.

Figure 3.4 Two health states in traditional similarity-based diagnostics and

prognostics [132]

The proposed prognostic model in this research assumes that machine degradation

consists of a series of degraded states (health states). This assumption is essential because machine

failure is a nonlinear, dynamic and stochastic process. Figure 3.5

illustrates the discrete health states of machine degradation. The discrete health states

can effectively represent the dynamic of the failure process according to the changes

of physical condition in machine degradation.

Figure 3.5 Illustration of discrete health states in machine degradation

For better understanding of the underlying fault evolution process, an effective

feature selection procedure needs to be conducted in the prognostics module. For the

estimation of a discrete machine degradation state to represent the complex nature of

machine degradation, the proposed prognostic model employed a classification



algorithm which uses a number of damage sensitive features for accurate long-term

prediction.

In the proposed model, the historical failure patterns are used to determine the

required number of health states for estimation of the machine remnant life. In

estimating the health states from new to final failure stages, the classifier is first

trained on the past predetermined discrete degradation stages and then used to test the

current health state. Through prior training on failure degradation stages, the current

health state can be obtained in terms of probabilities of each health state from the

classification results of each degradation stage.

The process of health state estimation consists of two steps, namely health state

classification and health state probability estimation. These two steps are presented

in the following subsections.

3.4.1 Health State Classification Using SVM Classifiers

In this proposed model, the health states classification of discrete failure

degradation stages can be performed using a range of classification algorithms

such as Neural Networks (NNs), Support Vector Machines (SVMs),

Classification and Regression Trees (CART) and others. Among the available

classifiers, SVMs show outstanding performance in the classification process as

compared with the other classifiers in recent literature [137-140]. The

outstanding performance of SVMs is verified by a comparative study shown in

Chapter 4 using five different classification algorithms with four fault conditions

data and five different fault severity levels. SVM classifiers are therefore

employed in this work for health state classification to

predict the remnant life of machines. An overview of SVM classification theory

and the detailed methodology of health state classification using SVM classifiers

are presented in the following subsections.

Support vector machine (SVM) is based on statistical learning theory

introduced by Vapnik and his co-workers [141]. It is popular within the machine

learning community due to its excellent generalization ability as compared with



the traditional neural network. SVM has been successfully applied in a number of

applications, such as human face detection, verification and recognition of

handwritten characters, digit recognition, verification and recognition of speech

and speaker, prediction and image retrieval.

SVM is also known as a maximum margin classifier, with the ability to

simultaneously minimize the empirical classification error and maximize the

geometric margin. Owing to this generalization ability, SVM has been applied to a number of

machine learning problems in the past

few years. SVM also has the potential to handle very large feature spaces,

because the dimension of the classified vectors has far less influence on

its classification performance than it has for conventional classifiers. The technique is therefore suitable and

reliable for handling a large number of features. In this research, a range of features extracted

from condition monitoring data are used for fault classification and estimation of

health state probability.

SVM was originally designed for binary classification, but can be effectively

applied for multiclass classification. In this research, more than two health states

(from healthy state through to failure state) are required to estimate the discrete

failure process effectively. Therefore, the health state probability estimation

using the multi-class classification strategy of SVM will be discussed in this

section. Basic theory of binary classification of SVM is introduced in Appendix 1.

Currently, SVM multi-classification can be obtained by combining a

number of binary classifications. Several methods have been proposed, such as

“One-Against-One”, “One-Against-All”, and Directed Acyclic Graph SVM

(DAGSVM).

3.4.1.1 One-Against-All (OAA) Strategy for health state estimation

The OAA method is the earliest strategy in SVM multiclass classification. Consider

a set of given observations $x_1, x_2, \dots, x_l$, where $l$ is the number

of observations and $t = 1, 2, \dots, l$ is the time index. Let $y_t$ be the health state (class) at

time $t$, $y_t \in \{1, 2, \dots, n\}$, where $n$ is the number of health states (classes).

From the information above, OAA constructs $n$ SVM models. The $i$th SVM

is trained with all of the examples in the $i$th class with positive labels and all

other examples with negative labels. Thus, given training data

$(x_1, y_1), \dots, (x_l, y_l)$, the $i$th SVM solves the following problem:

$$
\begin{aligned}
\text{minimize:}\quad & \frac{1}{2}(w^i)^T w^i + C\sum_{t=1}^{l}\xi_t^i \qquad (3.1)\\
\text{subject to:}\quad & (w^i)^T\phi(x_t) + b^i \ge 1 - \xi_t^i, \quad \text{if } y_t = i,\\
& (w^i)^T\phi(x_t) + b^i \le -1 + \xi_t^i, \quad \text{if } y_t \ne i,\\
& \xi_t^i \ge 0, \quad t = 1, 2, \dots, l
\end{aligned}
$$

where the training data $x_t$ are mapped to a higher dimensional space by the

function $\phi$, $K(x_t, x_s) = \phi(x_t)^T\phi(x_s)$ is the kernel function, $x_t$, $x_s$ is the $t$th or $s$th training

sample, $w^i$ and $b^i \in \mathbb{R}$ are the weighting factors, $\xi_t^i$ is the slack

variable and $C$ is the penalty parameter.

Minimizing $\frac{1}{2}(w^i)^T w^i$ means that we would like to maximize $2/\lVert w^i \rVert$, the

margin between two groups of data. When data are not linearly separable, there

is a penalty term $C\sum_{t=1}^{l}\xi_t^i$ which can reduce the number of training errors.

The basic concept behind SVM is to search for a balance between the

regularization term and the training errors.

After solving (3.1), there are $n$ decision functions

$$(w^1)^T\phi(x) + b^1, \ \dots, \ (w^n)^T\phi(x) + b^n \qquad (3.2)$$

We say $x$ is in the class which has the largest value of the decision

function

$$\text{class of } x \equiv \arg\max_{i=1,\dots,n}\left((w^i)^T\phi(x) + b^i\right) \qquad (3.3)$$

In practical terms, the dual problem of (3.1), whose number of variables is

the same as the number of data in (3.1), is solved. Hence $n$ $l$-variable

quadratic programming problems are solved.
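The OAA decision rule, in which the predicted health state is the one whose decision function takes the largest value, can be sketched as below. The sketch assumes that the binary models are already trained and that a linear kernel is used, so each decision function is simply w·x + b; the weights, biases and function names are hypothetical.

```python
def decision_value(w, b, x):
    """Linear decision function w.x + b (linear kernel assumed)."""
    return sum(wk * xk for wk, xk in zip(w, x)) + b

def oaa_predict(models, x):
    """Return the 1-based health state with the largest decision value.

    models: list of (w, b) pairs, one trained binary SVM per health state.
    """
    values = [decision_value(w, b, x) for w, b in models]
    return values.index(max(values)) + 1

# Hypothetical, already-trained models for three health states.
models = [
    ([1.0, 0.0], 0.0),    # state 1 against the rest
    ([0.0, 1.0], 0.0),    # state 2 against the rest
    ([-1.0, -1.0], 0.5),  # state 3 against the rest
]
state = oaa_predict(models, [0.2, 0.9])
```

With n health states, prediction requires n decision-function evaluations, one per one-against-all model.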

3.4.1.2 One-Against-One (OAO) Strategy for health state estimation

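Another major method is the one-against-one method. For the multi-

classification of $n$ health states (classes), the OAO method constructs

$n(n-1)/2$ classifiers where each one is trained on data from two classes. For

training data from the $i$th and the $j$th classes, SVM solves the following

classification problem, using the notation of (3.1) with the sum and constraints

taken over the training samples of these two classes:

$$
\begin{aligned}
\text{minimize:}\quad & \frac{1}{2}(w^{ij})^T w^{ij} + C\sum_{t}\xi_t^{ij} \qquad (3.4)\\
\text{subject to:}\quad & (w^{ij})^T\phi(x_t) + b^{ij} \ge 1 - \xi_t^{ij}, \quad \text{if } y_t = i,\\
& (w^{ij})^T\phi(x_t) + b^{ij} \le -1 + \xi_t^{ij}, \quad \text{if } y_t = j,\\
& \xi_t^{ij} \ge 0, \quad t = 1, 2, \dots, l
\end{aligned}
$$

There are different methods for doing the future testing after all

$n(n-1)/2$ classifiers are constructed. After some tests, the decision is made using

the following strategy: if $\operatorname{sign}\left((w^{ij})^T\phi(x) + b^{ij}\right)$ says $x$ is in the $i$th class,

then the vote for the $i$th class is increased by one. Otherwise, the vote for the $j$th class is

increased by one. Then $x$ is predicted to be in the class with the largest vote. The

voting approach described above is also called the Max Wins strategy [142].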
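The voting scheme can be sketched as follows. The sketch assumes pre-trained pairwise models with a linear kernel, stored under hypothetical keys (i, j) with i < j, where a positive decision value votes for state i. Breaking ties in favour of the lowest state index is a further assumption of this example.

```python
from itertools import combinations

def oao_predict(pair_models, n_states, x):
    """Max Wins voting over n(n-1)/2 pairwise classifiers.

    pair_models: dict mapping (i, j), i < j, to a (w, b) pair.
    Returns (predicted_state, votes); ties go to the lowest state index.
    """
    votes = {state: 0 for state in range(1, n_states + 1)}
    for i, j in combinations(range(1, n_states + 1), 2):
        w, b = pair_models[(i, j)]
        value = sum(wk * xk for wk, xk in zip(w, x)) + b
        votes[i if value > 0 else j] += 1
    return max(votes, key=votes.get), votes

# Hypothetical models on a single degradation feature: a larger value of x
# means a more degraded state.
pair_models = {
    (1, 2): ([-1.0], 1.0),  # votes for state 1 when x < 1
    (1, 3): ([-1.0], 2.0),  # votes for state 1 when x < 2
    (2, 3): ([-1.0], 3.0),  # votes for state 2 when x < 3
}
state, votes = oao_predict(pair_models, 3, [2.5])
```

For three health states, three pairwise classifiers are evaluated for every test sample, in line with the n(n-1)/2 count given above.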

3.4.1.3 Directed Acyclic Graph (DAG) Strategy for health state estimation

The third method for multi-classification of SVMs is the directed acyclic

graph SVM (DAGSVM) proposed in [143]. The training process of the DAG

method is similar to the OAO strategy, solving $n(n-1)/2$ binary SVMs.

However, in the testing process, it uses a rooted binary directed acyclic graph

which has $n(n-1)/2$ internal nodes and $n$ leaves. Each node is a binary

SVM of the $i$th and $j$th classes. Given a test sample $x$, starting at the root node,

the binary decision function is evaluated. Then it moves to either left or right



depending on the output value [144]. Therefore, this method can pass through

a path before reaching a leaf node which indicates the predicted class. An

advantage of using a DAG is that some analysis of generalization can be

established. In addition, its testing time is less than the OAO method.
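The DAG testing procedure can be sketched with a list of candidate states: each node compares the first and last remaining candidates and removes the one that is rejected, so a leaf is reached after n − 1 evaluations. This list-based traversal is one common way to realise the graph; the pairwise models (linear kernel, positive value favours state i) and all names are hypothetical.

```python
def dag_predict(pair_models, n_states, x):
    """Walk a decision DAG over pairwise classifiers.

    pair_models: dict mapping (i, j), i < j, to a (w, b) pair.
    Returns (predicted_state, number_of_evaluations).
    """
    candidates = list(range(1, n_states + 1))
    evaluations = 0
    while len(candidates) > 1:
        i, j = candidates[0], candidates[-1]
        w, b = pair_models[(i, j)]
        value = sum(wk * xk for wk, xk in zip(w, x)) + b
        evaluations += 1
        if value > 0:
            candidates.pop()     # node decides "not state j"
        else:
            candidates.pop(0)    # node decides "not state i"
    return candidates[0], evaluations

# Hypothetical models on a single degradation feature.
pair_models = {
    (1, 2): ([-1.0], 1.0),
    (1, 3): ([-1.0], 2.0),
    (2, 3): ([-1.0], 3.0),
}
state, evaluations = dag_predict(pair_models, 3, [2.5])
```

Only n − 1 of the n(n-1)/2 classifiers are evaluated per test sample, which is consistent with the shorter testing time noted above.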

Hsu and Lin [144] presented a comparison of these methods and pointed

out that the OAO method is more suitable for practical use than the other

methods. Consequently, in this research, the OAO method is employed to

perform the health state probability estimation of discrete failure degradation

stages.

3.4.2 Health State Probability Estimation

Accurate and precise prognosis demands good probabilistic models of failure

degradation and requires statistically sufficient samples of failure data to assist in

training, validating and fine tuning the prognostic model. In this research, the

probabilities of each health state as a discrete failure index are used for the

prediction of machine remnant life. From the above SVM multi-classification

result ($y_t$, the health state assigned to observation $x_t$), we obtain the probabilities of each health state ($S_t$) by using the

smooth window and indicator function ($I_i$) as follows:

$$
\mathrm{Prob}(S_t = i \mid x_t, \dots, x_{t+u-1}) = \frac{1}{u}\sum_{j=t}^{t+u-1} I_i(y_j),
\qquad
I_i(y_j) = \begin{cases} 1, & y_j = i \\ 0, & y_j \ne i \end{cases}
\qquad (3.5)
$$

where $S_t$ is the smoothed health state and $u$ is the width of the smooth

window.

In the given smooth window subset, the sum of the health state probabilities

is shown in Eq. (3.6), where $n$ is the number of health states:

$$\sum_{i=1}^{n} \mathrm{Prob}(S_t = i \mid x_t, \dots, x_{t+u-1}) = 1 \qquad (3.6)$$

From the result of each health probability, the probability distributions of each

health state subject to time (t) can be obtained as illustrated in Figure 3.6. Figure

3.6 shows an example of probability distribution which has a simple linear

degradation process consisting of n number of discrete health states. As the

probability of one state decreases, the probability of the next state increases. At

the point of intersection there is a region of over-lap between the two health

states, which is a natural phenomenon in linear degradation process. However,

the probability distribution of failure process is complex due to the dynamic and

stochastic degradation process in a real environment.
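The smooth-window estimate of Eq. (3.5) can be sketched as a simple count of classifier outputs. The sketch assumes that the window covers the u classifier outputs starting at index t (0-based here), and that the probability of state i is the fraction of those outputs equal to i; the function name and the label sequence are hypothetical.

```python
def health_state_probabilities(labels, t, u, n_states):
    """Fraction of classifier outputs in the window labels[t : t + u]
    that equal each health state 1..n_states."""
    window = labels[t:t + u]
    return [
        sum(1 for y in window if y == state) / len(window)
        for state in range(1, n_states + 1)
    ]

# Hypothetical sequence of multi-class SVM outputs over time.
labels = [1, 1, 1, 2, 1, 2, 2, 2, 3, 2, 3, 3]
probs = health_state_probabilities(labels, t=3, u=5, n_states=3)
```

Because every output in the window is counted for exactly one state, the probabilities sum to one, as Eq. (3.6) requires. A wider window gives smoother probability curves but reacts more slowly to a change of state.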

Figure 3.6 Illustration of health state probability distributions of simple

linear degradation process

3.4.3 Prediction of Machine Remnant Life

After the estimation of the current health state in terms of the probability

distribution of each state, the RUL prediction in the proposed model is performed

using two quantities: the health state probability at a certain time t and

the historical remaining life at each trained health state. The health state

probabilities at time t provide a real-time failure index of the machine failure

process for RUL prediction. The RUL prediction of the machine can be

expressed as Eq. (3.7).

$$\mathrm{RUL}(t) = \sum_{i=1}^{n} \Pr(S_t = i \mid x_t, \dots, x_{t+u-1}) \cdot \tau_i \qquad (3.7)$$

where $\Pr(S_t = i \mid x_t, \dots, x_{t+u-1})$ is the current probability of health state $i$ at time t,

estimated over a smooth window of width $u$, $\tau_i$

represents the historical remaining life at each trained health state and $n$ is

the number of health states.

At the end of each prognostics process, the output information is used to

update the historical knowledge for further improvement of failure analysis by

providing reliable posterior degradation characteristics for diverse failure modes

and fault types.
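The RUL prediction of Eq. (3.7) is a probability-weighted sum of the historical remaining life of each health state, which the following minimal sketch reproduces. The probabilities, the remaining-life values and their unit (hours) are hypothetical.

```python
def predict_rul(state_probabilities, historical_remaining_life):
    """Probability-weighted remaining life over all health states."""
    if len(state_probabilities) != len(historical_remaining_life):
        raise ValueError("one remaining-life value is needed per health state")
    return sum(
        p * tau
        for p, tau in zip(state_probabilities, historical_remaining_life)
    )

# Hypothetical values: three health states, remaining life in hours.
probs = [0.2, 0.8, 0.0]
remaining_life = [900.0, 400.0, 50.0]
rul = predict_rul(probs, remaining_life)  # 0.2*900 + 0.8*400 + 0.0*50
```

As the probability mass moves from earlier to later health states, the predicted RUL moves from the remaining life of the earlier state towards that of the later one.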

3.5 Summary

In this chapter, the novel approach of designing the integrated diagnostic and

prognostic system based on health state probability estimation has been presented to

address the identified current prognostics challenges derived from the literature

review on machine diagnostics and prognostics in the previous chapter.

For accurate forecasting of machine remnant life, the proposed model has closed

loop architecture in the configuration of integrated diagnostics and prognostics

system. The RUL prediction is performed based on health state probability

estimation with embedded historical knowledge for accurate long-term prediction of

the machine remnant life. Through the integrated system with fault diagnostics, a

more precise failure pattern from a number of historical degradation data in historical

knowledge can be employed in the prognostics module through the verification

(isolation) of impending faults in diagnostics. In the proposed integrated system, the

accumulated historical knowledge can then be used for system updating and

improving of the prognostics model by providing reliable posterior degradation

features for diverse failure modes and fault types. This accumulated information

provides a good guideline to solve the CM data management problems in industry.


The methodology of health state probability estimation is presented using the SVM multi-classification algorithm because its outstanding performance has been verified in the recent literature [137-140] and in the comparative study in the following chapter. The discrete failure process in machine degradation is applied in the proposed prognostic model to represent the dynamic and stochastic machine degradation process in a real environment.

The methodology of health state estimation using classification algorithms

enables this model to use sufficient condition indicators to effectively represent the

complex nature of machine degradation. Furthermore, full utilization of a range of

features can lead to a generic and scalable prognostic model for practical application

in industry.



CHAPTER 4 COMPARATIVE STUDY ON FAULT DIAGNOSTICS USING MULTI-CLASSIFIERS

This chapter presents a comparative study of intelligent fault diagnostics using five different classifiers to identify the classifiers most appropriate for the proposed model. In the proposed model, the diagnostics of impending faults and the estimation of health state probability are performed using multi-classification algorithms. Five typical classifiers that are commonly used in intelligent fault diagnostics are investigated using data for four conditions from High Pressure Liquefied Natural Gas (HP-LNG) pumps. Moreover, to better select appropriate classifiers for health state estimation, these diagnostic tests were also carried out using five different severity levels of three faults to observe the classification accuracy as the fault level progresses.

4.1 HP-LNG Pumps

The LNG receiving terminal receives liquefied natural gas from LNG carrier ships,

stores the liquid in special storage tanks, vaporizes the LNG and then delivers the

natural gas through distribution pipelines. The receiving terminal is designed to

deliver a specified gas rate into a distribution pipeline and to maintain a reserve

capacity of LNG. LNG occupies about one six-hundredth (1/600) of the volume of natural gas in its gaseous state when it is kept at or below its boiling temperature (-162℃), which makes it convenient for storage and transportation.

Figure 4.1 shows the re-gasification process in an LNG receiving terminal. As

shown in Figure 4.1, the unloaded LNG from vessels is transported to ground storage



tanks via pipeline using cargo pumps on the LNG carrier vessel. In an LNG receiving terminal, primary cryogenic pumps installed in the storage tanks supply the LNG to the HP-LNG pumps at a pressure of around 8 bar. The HP-LNG pumps boost the LNG pressure to around 80 bar for evaporation and delivery of the highly compressed natural gas via a pipeline network across the nation.

Figure 4.1 Re-gasification process in LNG receiving terminal

Figure 4.2 shows the HP-LNG pump schematic and vibration measuring points. The number of HP-LNG pumps determines the amount of LNG supplied by the receiving terminal. The HP-LNG pumps are crucial equipment in the LNG production process and should be maintained in optimal condition. Therefore, the vibration and noise of HP-LNG pumps are regularly monitored and managed using predictive maintenance techniques. As shown in Figure 4.2, HP-LNG pumps are enclosed within a suction vessel and mounted on a vessel top plate. Two ball bearings are installed to support the entire dynamic load of the integrated shaft of the pump and motor. The submerged motor and bearings are cooled and lubricated by a predetermined portion of the LNG being pumped. For condition monitoring of the pumps, three accelerometers are installed on the pump top plate, two in radial directions and one in the axial direction.



Figure 4.2 Pump schematic and vibration measuring points

Table 4.1 shows the pump specifications. These HP-LNG pumps are submerged and operate at cryogenic temperatures. They are self-lubricated at both ends of the rotor shaft using LNG. Due to the low viscosity of LNG (about 0.16 cP), the two bearings of the HP-LNG pump are poorly lubricated and must be specially designed.

Table 4.1 Pump and Vibration Measurement Specifications

Parameter                    Value
Capacity                     241.8 m3/hr
Pressure                     88.7 kg/cm2.g
Impeller Stage               9
Speed                        3,585 RPM
Voltage                      6,600 V
Rating                       746 kW
Upper Bearing No.            6314
Bottom Bearing No.           6314
No. of Pole                  2
Rotor Bar Quantity           41 EA
Diffuser Vane No.            8 EA
Current                      84.5 A
Accelerometer Sensitivity    51.5 mV/g
Sampling Frequency           8,192 Hz



It is very difficult to detect the cause of pump failure at an early stage because faults in certain bearing components can lead to rapid bearing failure under the poor lubricating conditions and high operating speed (3,600 rpm). Hence, when an abnormal problem occurs, there is not sufficient time to analyze the possible root cause of the pump failure. Furthermore, because the material properties of cryogenic pumps vary at very low temperatures and it is difficult to measure vibration signals on the submerged pump housing, there are restrictions on the diagnostics of pump health and the study of vibration behaviour. Hence, there is a need to use the historical knowledge of failure patterns for accurate estimation of remnant life.

To improve the reliability and maintenance optimization of LNG plants, long-term prediction of failures is essential for safe operation and for prolonged utilisation of the production capability.

4.2 Historical Failure Event and Data Analysis

To conduct the comparative fault diagnostic test, four years of historical condition monitoring data and maintenance records from a total of 16 HP-LNG pumps with identical specifications, as described previously, were analyzed to determine the main fault types of the pumps. The result of the historical maintenance records and data analysis is summarised in Figure 4.3. As shown in the figure, three types of fault, namely bearing fault, excessive rubbing of the pump impeller and motor rotor bar fault, led to unscheduled maintenance. These three fault types can affect the entire operating life and the maintenance schedules of HP-LNG pumps because pumps with these faults accumulate fewer operating hours than pumps taken out of service for scheduled maintenance in an LNG receiving terminal. For example, pumps with bearing failures reached less than half the operating hours (4,053 hours) of those under scheduled maintenance (9,420 hours). As a result of this analysis, these three types of pump fault (rotor bar fault, impeller rubbing and bearing fault) are used in the comparative fault diagnostic tests.



Figure 4.3 Result of historical failure event and data analysis

The vibration data collected through three accelerometers installed on the pump

top plate were used in the diagnostic tests using four conditions (three faults and a

normal condition). The characteristics of three faults in HP-LNG pump are

summarised in the following subsections.

4.2.1 Bearing Fault

The bearing fault within the pumps was confirmed by the vibration spectrum

analysis for the diagnostic tests. The four characteristic fault frequencies of the

ball bearing in the HP-LNG pumps were calculated using the following equations

[145]:

Ball Pass Frequency, Outer race: $\mathrm{BPFO\,(Hz)} = \frac{N}{2}\, S \left(1 - \frac{B}{P}\cos\Phi\right)$ (4.1)

Ball Pass Frequency, Inner race: $\mathrm{BPFI\,(Hz)} = \frac{N}{2}\, S \left(1 + \frac{B}{P}\cos\Phi\right)$ (4.2)

Ball Spin Frequency: $\mathrm{BSF\,(Hz)} = \frac{P}{2B}\, S \left(1 - \frac{B^2}{P^2}\cos^2\Phi\right)$ (4.3)

Train or Cage Frequency: $\mathrm{FTF\,(Hz)} = \frac{1}{2}\, S \left(1 - \frac{B}{P}\cos\Phi\right)$ (4.4)

where B is the ball diameter, P the pitch diameter, N the number of balls, S the shaft rotation speed in Hertz and Φ the contact angle. The four calculated characteristic frequencies of the pump bearings are summarized in Table 4.2.

Table 4.2 Bearing defect frequencies of HP-LNG pump

Parameter   Value       Frequency   Value
B           25.4 mm     BPFO        183.5 Hz
P           110 mm      BPFI        293.7 Hz
N           8 EA        BSF         244.6 Hz
S           3580 RPM    FTF         22.9 Hz
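As a check on Table 4.2, the characteristic frequencies of Eqs. (4.1) to (4.4) can be evaluated with a short sketch. The function name is hypothetical, and a contact angle of Φ = 0° is assumed because the thesis does not state the value used. Under that assumption the sketch reproduces the tabulated BPFO, BPFI and FTF; Eq. (4.3) as written gives about 122.3 Hz for BSF, which is half of the tabulated 244.6 Hz, so the table may list twice the ball spin frequency (the ball defect frequency).

```python
import math


def bearing_defect_frequencies(ball_d, pitch_d, n_balls, shaft_hz, contact_angle_deg=0.0):
    """Characteristic bearing frequencies in Hz, following Eqs. (4.1)-(4.4)."""
    ratio = ball_d / pitch_d
    c = math.cos(math.radians(contact_angle_deg))
    return {
        "BPFO": n_balls / 2 * shaft_hz * (1 - ratio * c),
        "BPFI": n_balls / 2 * shaft_hz * (1 + ratio * c),
        "BSF": pitch_d / (2 * ball_d) * shaft_hz * (1 - (ratio * c) ** 2),
        "FTF": shaft_hz / 2 * (1 - ratio * c),
    }


# Values from Table 4.2; the shaft speed is converted from RPM to Hz.
# The zero contact angle is an assumption of this sketch.
freqs = bearing_defect_frequencies(ball_d=25.4, pitch_d=110.0, n_balls=8, shaft_hz=3580 / 60)
for name, value in freqs.items():
    print(f"{name}: {value:.1f} Hz")
```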

The vibration spectrum plots of five different severities of bearing fault are presented in Figure 4.4. As shown in Figure 4.4, the bearing fault components increased over the operating hours. For example, the multi-harmonic components (2 BPFO, 3 BPFI and 5 BPFO) of the inner and outer race ball pass frequencies have increasing peak values as the pump failure progresses.

Vibration features of a bearing fault may vary depending on the location of the faulty bearing. In this thesis, identifying which bearing is faulty from the vibration signals was not considered because the upper and lower bearings of the HP-LNG pump have identical specifications and operating speed. Moreover, these two bearings fail simultaneously, and they are not reused during the maintenance of LNG pumps.



Figure 4.4 Vibration spectrum plots of five different severities of bearing fault



4.2.2 Rotor Bar Fault

The rotor bar problem has a high percentage share in unscheduled

maintenance. In the case of rotor bar fault, this problem is confirmed through the

vibration and motor current signature analysis (MCSA).

Figure 4.5 shows the time wave form of beat vibration generated by two

closely spaced frequencies between rotating speed (1X) and pole passing

frequency (FP). A beat vibration is the result of two closely spaced frequencies

going into and out of synchronization with one another. Maximum vibration will

result when the time waveform of one frequency comes into phase with the other

frequency. Minimum vibration occurs when waveforms of these two frequencies

line up 180º out of phase.

Figure 4.5 Time wave form of beat vibration generated by two closely spaced

frequencies between 1X and pole passing frequency

The wideband spectrum normally will show one peak pulsating up and down.

However, when we zoom into this peak (lower spectrum), it actually shows two

closely spaced peaks. The difference in these two peaks (1X – FP) is the beat

frequency which itself appears in the wideband spectrum. The beat frequency is

not commonly seen in normal frequency range measurements since it is an

inherently low frequency, usually ranging from only approximately 0.08 to 1.6

Hz. Figure 4.6 presents the true-zooming spectrum plot near 1X frequency (3.58



kCPM). In the true-zooming spectrum of Figure 4.6, multiple side bands with high peak values appear near 1X, and the frequency interval between 1X and the side bands was 30 cpm (0.5 Hz). These side bands originate from the FP of the induction motor, as shown in the formulas below.

FS = FL – RPM = 60 – 59.75 = 0.25 Hz (4.5)

FP = FS × P = 0.25 Hz × 2 = 0.5 Hz (4.6)

where FS is the slip frequency, FL is the line (supply) frequency, RPM is the rotating speed expressed in Hz and P is the number of poles.

In this case, the high amplitude of 1X came from the amplitude and frequency modulation between two closely spaced frequencies: 1X and the side band originating from FP.

Figure 4.6 True zooming spectrum plot of broken rotor bar

The rotor bar fault is also confirmed by the motor current signature analysis

(MCSA) method. Figure 4.7 shows the Fast Fourier Transform (FFT) of motor

current signal.

The electrical rotor asymmetry increases the current harmonic next to the

fundamental of the stator current frequency. Rotor faults in an induction motor

normally can be determined from the observation of the sidebands in the stator

current spectrum, in the neighborhood of both frequencies given by [146]



$$f_{lsb,k} = (1 - 2ks)\, f \tag{4.7}$$

$$f_{usb,k} = (1 + 2ks)\, f \tag{4.8}$$

where $f$ is the supply frequency, k is the (natural) order number, k = 1, 2, 3, …, and s represents slip. The first-order components (k = 1) are usually called the lower side band current and the upper side band current. In the diagnosis of a broken rotor bar, the side bands reveal faults more clearly at high values of slip [147]. A severity factor of a broken rotor bar can be defined as

$$S_F = \frac{A_{sb}}{A_f} \times 100 \tag{4.9}$$

where $S_F$ is the severity of the rotor fault, $A_{sb}$ is the sum of the amplitudes of the sidebands, and $A_f$ is the amplitude of the fundamental component of the stator current.
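The sideband positions and the severity factor described above can be sketched numerically. The function names are hypothetical, the per-unit slip is assumed to be the slip frequency of Eq. (4.5) divided by the supply frequency, and the current amplitudes are illustrative values, not measurements from the pumps.

```python
def rotor_bar_sidebands(supply_hz, slip, max_order=1):
    """Return (k, lower, upper) sideband frequencies at (1 -/+ 2ks) * supply."""
    return [
        (k, (1 - 2 * k * slip) * supply_hz, (1 + 2 * k * slip) * supply_hz)
        for k in range(1, max_order + 1)
    ]


def severity_factor(sideband_amplitudes, fundamental_amplitude):
    """Sum of sideband amplitudes relative to the fundamental, in percent."""
    return sum(sideband_amplitudes) / fundamental_amplitude * 100.0


# Assumption: per-unit slip s = FS / FL = 0.25 Hz / 60 Hz, using Eq. (4.5).
slip = 0.25 / 60.0
bands = rotor_bar_sidebands(supply_hz=60.0, slip=slip, max_order=2)
for k, lower, upper in bands:
    print(f"k={k}: lower {lower:.2f} Hz, upper {upper:.2f} Hz")

# Hypothetical amplitudes (A): two first-order sidebands and the fundamental.
sf = severity_factor([0.5, 0.3], 80.0)
print(f"Severity factor: {sf:.2f} %")
```

With these inputs the first-order sidebands sit 0.5 Hz either side of the 60 Hz supply frequency, the same spacing as the pole passing frequency FP obtained in Eq. (4.6).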

Figure 4.7 Frequency spectrum of motor current signal with broken rotor bars.

Figure 4.8 shows the vibration spectrum plots of five different severities of

rotor bar fault used in this diagnostic test.



Figure 4.8 Vibration spectrum plots of five different severities of rotor bar fault



4.2.3 Excessive rubbing of impeller wear-ring

Another typical fault type of HP-LNG pumps is excessive rubbing of the impeller wear-rings, caused by contact between the impeller wear-ring and the housing. Rubs are typically generated by contact between rotating and stationary elements of a machine. In the case of an HP-LNG pump, a slight rubbing condition between the impeller wear-ring and housing is a common phenomenon in newly rebuilt or modified rotors in the early stage of operation. Impeller rubs usually increase the clearances until the rub has cleared; if not corrected, they will wear away the internal clearances until the machine cannot continue its operation. Figure 4.9 shows images of the state of excessive wear on the impeller wear-ring and housing after disassembly of an HP-LNG pump for maintenance.

Figure 4.9 Excessive wear of impeller wear-ring and housing

Spectrum plots of rub conditions are characterized by distinct frequencies that occur at multiples of a fundamental frequency. A partial rotor/stator rub often causes a steady sub-harmonic at a frequency equal to half of the rotational speed [148]. The excessive rub condition of an HP-LNG pump is confirmed by an increase in the sub-harmonic components below 1X as the excessive wear between the impeller wear-ring and the impeller housing progresses.

Figure 4.10 shows the vibration spectrum plots of five different severities of impeller rubbing used in this comparative diagnostic test. As shown in Figure 4.10, the amplitudes of the sub-harmonic components increase with the severity of rubbing.



Figure 4.10 Vibration spectrum plots of five different severities of impeller rubbing



This comparative diagnostic test was carried out using five different severity

levels of the three types of faults and a normal condition to observe the accuracy

of classification performance according to the progressive fault levels. Table 4.3

shows the acquired vibration data and features for the diagnostic test.

Table 4.3 Acquired vibration data and features for diagnostic test

Machine No   Fault Type         No. of Severity Levels   No. of Samples   No. of Features   Sampling Frequency
P701C        Bearing Fault      5                        5                42                8,192 Hz
P701D        Impeller Rubbing   5                        5                42                8,192 Hz
P701A        Rotor Bar Fault    5                        5                42                8,192 Hz
P701B        Normal             5                        5                42                8,192 Hz

4.3 Feature Calculation and Selection

In this research, 10 statistical parameters were calculated using time domain data.

These feature parameters were mean, rms, shape factor, skewness, kurtosis, crest

factor, entropy estimation, entropy estimation error, histogram lower and histogram

upper. In addition to these parameters, four parameters (rms frequency, frequency

centre, root variance frequency and peak) in the frequency domain were also

calculated. A total of 42 features (14 parameters, 3 positions) were calculated as

shown in Table 4.4. The detailed characteristics of these features were described in

chapter 2.
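Several of the time-domain parameters listed above can be computed as in the following sketch. The definitions used here are the common textbook forms and are an assumption, since the thesis gives its own definitions in Chapter 2; the function name and the test signal (a unit sine at the 59.75 Hz running speed, sampled at 8,192 Hz) are illustrative only.

```python
import math


def time_domain_features(x):
    """Common statistical features of a vibration record (textbook definitions)."""
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    abs_mean = sum(abs(v) for v in x) / n
    peak = max(abs(v) for v in x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    skewness = sum((v - mean) ** 3 for v in x) / n / std ** 3
    kurtosis = sum((v - mean) ** 4 for v in x) / n / std ** 4
    return {
        "mean": mean,
        "rms": rms,
        "shape_factor": rms / abs_mean,  # RMS over mean absolute value
        "crest_factor": peak / rms,      # peak over RMS
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Illustrative signal: one second of a unit sine at 59.75 Hz sampled at 8,192 Hz.
fs = 8192
signal = [math.sin(2 * math.pi * 59.75 * i / fs) for i in range(fs)]
feats = time_domain_features(signal)
for name, value in feats.items():
    print(f"{name}: {value:.4f}")
```

Applying such a function to each of the three accelerometer channels, together with the four frequency-domain parameters, would give the 14 × 3 = 42 features per sample described above.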

Table 4.4 Statistical feature parameters and attributed label for diagnostics

Position: Acc.(A), Acc.(B) and Acc.(C) (the same 14 parameters are calculated at each position)

Time Domain Parameters: Mean{1}, RMS{2}, Shape factor{3}, Skewness{4}, Kurtosis{5}, Crest factor{6}, Entropy estimation value{7}, Entropy estimation error{8}, Histogram upper{9} and Histogram lower{10}

Frequency Domain Parameters: RMS frequency value{11}, Frequency centre value{12}, Root variance frequency{13} and Peak value{14}



To obtain a generic and scalable diagnostics and prognostics model that is applicable to different faults in identical machines, a range of conventional statistical parameters from the vibration signal is used to establish the model in this research.

For good fault classification performance and reduced computational effort, effective features were selected using the distance evaluation technique of feature effectiveness introduced by Knerr et al. [149, 150], as described below.

The average distance ($d_{i,j}$) of all the features in state i can be defined as follows:

$$d_{i,j} = \frac{1}{N(N-1)} \sum_{m,n=1}^{N} \left| p_{i,j}(m) - p_{i,j}(n) \right| \tag{4.10}$$

The average distance ($d'_{i,j}$) of all the features in different states is

$$d'_{i,j} = \frac{1}{M(M-1)} \sum_{m,n=1}^{M} \left| p_{a,i}(m) - p_{a,i}(n) \right| \tag{4.11}$$

where m, n = 1, 2, …, N, m ≠ n, $p_{i,j}$: eigen value, i: data index, j: class index, a: average, N: number of feature and M: number of class.

When the average distance ($d_{i,j}$) inside a certain class is small and the average distance ($d'_{i,j}$) between different classes is large, the features are well separated among the classes. Therefore, the distance evaluation criterion ($\alpha_i$) can be defined as

$$\alpha_i = d'_{i,j} / d_{i,j} \tag{4.12}$$

The optimal features can be selected from the original feature sets as those with a large distance evaluation criterion ($\alpha_i$).



In this work, a total of 42 variables were used to extract effective features from each signal sample measured at identical accelerometer positions. The distance evaluation criteria ($\alpha_i$) of the 42 features are shown in Figure 4.11. To select effective features, a normalized distance evaluation criterion greater than 2, $\alpha_i / \bar{\alpha} > 2$, was used as the threshold, where $\alpha_i$ is the distance evaluation criterion and $\bar{\alpha}$ is the mean value of $\alpha_i$. From the results, eight features were selected as effective features. The selected eight features were Skewness, Crest factor and Kurtosis from accelerometer A (radial direction); Kurtosis and Entropy estimation from accelerometer B (radial direction); and Crest factor, Kurtosis and Root variance frequency value from accelerometer C (axial direction). These features have distance evaluation criterion ($\alpha_i$) values greater than 14 (the threshold level), which are large compared with those of the other features. These features could minimise the training and test errors of fault multi-classification.
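The selection procedure can be sketched as follows. This is one reading of the distance evaluation technique, assuming that the within-class distance is the mean pairwise absolute difference of a feature inside each class (averaged over classes) and that the between-class distance is the mean pairwise absolute difference of the class means. The function names and the small data set are hypothetical.

```python
from itertools import combinations


def mean_pairwise_distance(values):
    """Average absolute difference over all distinct pairs of values."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def distance_evaluation(samples_by_class):
    """Return one criterion alpha per feature: between-class / within-class distance."""
    classes = list(samples_by_class.values())
    n_features = len(classes[0][0])
    alphas = []
    for f in range(n_features):
        within = [mean_pairwise_distance([row[f] for row in rows]) for rows in classes]
        d_within = sum(within) / len(within)
        class_means = [sum(row[f] for row in rows) / len(rows) for rows in classes]
        d_between = mean_pairwise_distance(class_means)
        alphas.append(d_between / d_within)
    return alphas


def select_features(alphas, ratio=2.0):
    """Keep features whose criterion exceeds `ratio` times the mean criterion."""
    mean_alpha = sum(alphas) / len(alphas)
    return [i for i, a in enumerate(alphas) if a / mean_alpha > ratio]


# Hypothetical data: three conditions, three samples each, four features.
# Only feature 0 separates the conditions clearly.
data = {
    "normal": [[1.0, 0.2, 5.0, 0.9], [1.1, 0.8, 5.5, 0.1], [0.9, 0.5, 4.5, 0.5]],
    "bearing": [[5.0, 0.3, 5.2, 0.2], [5.1, 0.7, 4.8, 0.8], [4.9, 0.5, 5.0, 0.5]],
    "rubbing": [[9.0, 0.1, 4.9, 0.6], [9.1, 0.9, 5.1, 0.4], [8.9, 0.5, 5.3, 0.5]],
}
alphas = distance_evaluation(data)
selected = select_features(alphas, ratio=2.0)
print("alpha per feature:", [round(a, 2) for a in alphas])
print("selected feature indices:", selected)
```

A feature scores highly only when its class means are far apart relative to its scatter inside each class, which is the property the threshold on the normalized criterion is meant to capture.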

Figure 4.11 Feature selection using distance evaluation criterion for diagnostics

4.4 Brief Description of Employed Multi-Classifiers

The following subsections briefly describe the three classification algorithms employed in this comparative study for fault diagnostics of an HP-LNG pump. The two SVM classifiers employed in this work were described in the previous chapter.



4.4.1 Random Forests

Random Forest (RF), first introduced by Breiman, consists of an ensemble (collection) of decision trees whose predictions are combined to make the overall prediction for the forest [151, 152]. In recent years, a number of studies have applied RFs to intelligent fault diagnostics of machines. Several techniques have been introduced for constructing an ensemble of tree-type classifiers to increase the performance of the task at hand, such as Adaboost, Bagging and Random Forests [153]. Random Forests are a combination of tree classifiers such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. An RF is constructed by the following steps:

Step 1: Take a random sample of N observations, with replacement, from the data set (this is called “bagging”). Some observations will be selected more than once and others will not be selected. On average, about two-thirds of the cases will be selected by the sampling. The remaining one-third of the cases are called the “out of bag (OOB)” cases. A new random selection of rows is performed for each tree constructed.

Step 2: Using the cases selected in step 1, construct a decision tree. Build the

tree to the maximum size and do not prune it. As the tree is built, allow only a

subset of the total set of predictor variables to be considered as possible splitters

for each node. Select the set of predictors to be considered as a random subset of

the total set of available predictors. For example, if there are ten predictors,

choose a random five as candidate splitters. Perform a new random selection for

each split. Some predictors (possibly the best one) will not be considered for each

split, but a predictor excluded from one split may be used for another split in the

same tree.

Step 3: Repeat steps 1 and 2 for a large number of times by constructing a

forest of trees.

Step 4: To “score” a case, run the case through each tree in the forest and

record the predicted value that the case ends up in.
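The sampling parts of Steps 1, 2 and 4 can be sketched without building actual trees. The function names are hypothetical; the majority vote used to combine the per-tree predictions is the usual Random Forest rule for classification and is an assumption here, since the steps above only say that each tree's prediction is recorded.

```python
import random


def bootstrap_sample(n_rows, rng):
    """Step 1: draw n_rows indices with replacement; the rest are out-of-bag."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    oob = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, oob


def candidate_splitters(n_predictors, n_candidates, rng):
    """Step 2: a fresh random subset of predictors considered at one split."""
    return sorted(rng.sample(range(n_predictors), n_candidates))


def majority_vote(tree_predictions):
    """Step 4 (assumed aggregation): the class predicted by most trees."""
    return max(set(tree_predictions), key=tree_predictions.count)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
in_bag, oob = bootstrap_sample(1000, rng)
in_bag_fraction = len(set(in_bag)) / 1000
print(f"distinct in-bag cases: {in_bag_fraction:.1%}, out-of-bag cases: {len(oob)}")

splitters = candidate_splitters(n_predictors=10, n_candidates=5, rng=rng)
print("candidate splitters at this node:", splitters)

vote = majority_vote(["bearing", "normal", "bearing"])
print("forest prediction:", vote)
```

The fraction of distinct in-bag cases settles near 63% for large samples, which matches the "about two-thirds" figure quoted in Step 1.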

The generalization error in the Random Forests depends on the strength of the individual trees within the forest and their inter-correlation. Using a random selection of features in order to split each node, the overall predictive accuracy can be improved and made more robust with respect to noise. While traditional tree algorithms spend a lot of time choosing how to split at a node, Random Forests perform this task with little computational effort.

4.4.2 Radial Basis Function Neural Networks (RBF-NNs)

The RBF neural network is a feed-forward network consisting of three layers, namely an input layer, a hidden radial basis layer and an output layer [154] as depicted in Figure 4.12.

Figure 4.12 RBF neural networks architecture

There is one neuron in the input layer for each sample. The input neurons (or processing before the input layer) standardize the range of the values by subtracting the median and dividing by the inter-quartile range. The input neurons then feed the values to each of the neurons in the hidden layer. Hidden layer has a variable number of neurons (the optimal number is determined by the training process). Each neuron consists of a radial basis function centred on a point with as many dimensions as there are predictor variables. The spread (radius) of the RBF function may be different for each dimension. The centres and spreads are determined by the training process. When presented with the x vector of input values from the input layer, a hidden neuron computes the Euclidean distance of the test case from the neuron's centre point and then applies the RBF kernel function to this distance using the spread values. The


resulting value is passed to the summation layer. In the summation layer, the

value coming out of a neuron in the hidden layer is multiplied by a weight

associated with the neuron (W1, W2, ...,Wn) and passed to the summation which

adds up the weighted values and presents this sum as the output of the network.

For classification problems, there is one output (and a separate set of weights and

summation unit) for each target category.
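The forward pass described above can be sketched for a single output. A Gaussian kernel is assumed here because the text only refers to "the RBF kernel function"; the function names, centres, spreads and weights are illustrative, whereas in the thesis they are determined by the training algorithm.

```python
import math


def rbf_activation(x, centre, spreads):
    """Hidden neuron: Gaussian of the spread-scaled Euclidean distance to the centre."""
    scaled_sq_dist = sum(((xi - ci) / si) ** 2 for xi, ci, si in zip(x, centre, spreads))
    return math.exp(-0.5 * scaled_sq_dist)


def rbf_network_output(x, centres, spreads, weights, bias=0.0):
    """Summation layer: weighted sum of the hidden-neuron activations."""
    hidden = [rbf_activation(x, c, s) for c, s in zip(centres, spreads)]
    return bias + sum(w * h for w, h in zip(weights, hidden))


# Illustrative network: two hidden neurons in a two-dimensional input space,
# each with its own spread per dimension, and one output.
centres = [[0.0, 0.0], [1.0, 1.0]]
spreads = [[0.5, 0.5], [0.5, 0.5]]
weights = [1.0, -1.0]

out_at_first = rbf_network_output([0.0, 0.0], centres, spreads, weights)
out_at_second = rbf_network_output([1.0, 1.0], centres, spreads, weights)
print(f"output at first centre:  {out_at_first:.4f}")
print(f"output at second centre: {out_at_second:.4f}")
```

For a classification problem, one such weighted sum would be evaluated per target category, each with its own set of weights, and the category with the largest output taken as the prediction.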

In this work, the training algorithm introduced by Chen et al. [155] is

employed to train the RBF networks. This algorithm uses an evolutionary

approach to determine the optimal center points and spreads for each neuron. It

also determines when to stop adding neurons to the network by monitoring the

estimated leave-one-out (LOO) error and terminating when the LOO error begins

to increase due to over-fitting. The computation of the optimal weights between

the neurons in the hidden layer and the summation layer is done using ridge

regression.

4.4.3 Linear Regression

Linear regression is the oldest and most widely used predictive model. The

method of minimizing the sum of the squared errors to fit a straight line to a set

of data points was published by Legendre in 1805 and by Gauss in 1809. A linear

regression model fits a linear function to a set of data points. The form of the

function is:

$$Y = \beta_0 + \beta_1 \cdot X_1 + \beta_2 \cdot X_2 + \cdots + \beta_n \cdot X_n \tag{4.13}$$

where Y is the target variable, X1, X2, …, Xn are the predictor variables, β1, …, βn are coefficients that multiply the predictor variables, β0 is a constant and n is the number of variables.

If a perfect fit existed between the function and the actual data, the actual

value of the target value for each record in the data file would exactly equal the

predicted value. In general, an estimation error between the actual value of the target variable and its predicted value exists for a particular observation and is known as the “deviation” or “residual”. Therefore, the goal of regression



analysis is to determine the values of the β parameters that minimize the sum of

the squared residual values for the set of observations. This is known as least

squares regression fit. Since linear regression is restricted to fitting linear functions to data, it rarely works as well on real data as more general non-linear techniques, but it has a number of strengths, as follows:

• Linear regression is the most widely used method and it is well understood.

• Training a linear regression model is usually much faster than methods such as neural networks.

• Linear regression models are simple and require minimum memory to implement, so they work well on embedded controllers that have limited memory space.

To use linear regression to fit functions with non-linear variables, the transformed variables are used as predictor variables for the function. For example, if a new variable, X2, is generated using the transformation X2 = X · X and both X and X2 are included as predictor variables, then the fitted function will be:

$$Y = \beta_0 + \beta_1 \cdot X + \beta_2 \cdot X2 \tag{4.14}$$

which is equivalent to

$$Y = \beta_0 + \beta_1 \cdot X + \beta_2 \cdot X^2 \tag{4.15}$$
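The transformed-variable idea can be illustrated with a small least-squares fit. For brevity this sketch solves the normal equations by Gaussian elimination rather than using the SVD approach employed in the thesis; the function names and data are hypothetical.

```python
def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of Y = b0 + b1*X + b2*X2 with the transformed variable X2 = X*X."""
    rows = [[1.0, x, x * x] for x in xs]  # columns: constant, X, X2
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear_system(ata, aty)


# Hypothetical noise-free data generated from Y = 1 + 2X + 3X^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
beta = fit_quadratic(xs, ys)
print("fitted coefficients:", [round(b, 6) for b in beta])
```

The model remains linear in the coefficients even though it is quadratic in X, which is why an ordinary linear least-squares solver is sufficient.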

Linear regression is best suited for analysis with continuous variables.

However, it can also perform classification with multi-target classes. If the target

variable has more than two classes, a separate linear regression function for each

class is created and trained to generate “1” if the class it is modeling is true and

“0” for any other classes. Several computational algorithms can be used to

perform linear regression. In this work, the Singular Value Decomposition (SVD) algorithm introduced by Mandel [156] was employed because it is robust and less sensitive to variables that are nearly collinear.



4.5 Result of Fault Classification Performance

Using the eight selected features at five progressive severity levels for the four conditions, the comparative fault diagnostic test was conducted with the five classification algorithms. The test results of the five classifiers’ performance are summarized in Figure 4.13. As shown in Figure 4.13, most classifiers showed lower classification rates at the first and second fault levels than at the higher levels (levels 3, 4 and 5). This poor classification performance may be due to the over-fitting of features for the four conditions at the initial levels. In addition, after the third level of fault, most classifiers reach accuracies of 100.0%, except for random forest. This result indicates that the fault classification accuracy varies with the severity of the machine fault and the type of classifier.

Through this comparative test, it is verified that SVMs and RBF networks show relatively outstanding performance for intelligent fault classification over the range of fault propagation. In particular, the SVMs show better accuracies than the RBF networks at the initial fault condition (level 1).

Figure 4.13 Comparison test results of five classifiers’ performance



4.6 Summary

This chapter presented a comparative study of intelligent fault diagnostics using

five different classifiers to perform fault diagnostics in the proposed system. In

addition, to select an appropriate classifier for health state probability estimation,

these tests were also carried out using five different severity levels of three faults

from HP-LNG pumps.

From the analysis of the historical condition monitoring data and maintenance

records from HP-LNG pumps, three types of fault are used for comparative fault

diagnostic test. For the better performance of five classifiers, effective features were

selected using the distance evaluation technique.

The result of the test shows that the fault classification accuracy is variable

depending on the severity of faults and the type of classifiers. The SVMs and RBF

networks show relatively outstanding performance for intelligent fault classification

for the range of fault propagation. Furthermore, the two SVM classifiers show better

classification performance than the RBF networks at initial fault condition.

Through this confirmation of the classification ability of SVMs for progressive fault propagation data, the SVMs are employed for health state probability estimation in the proposed prognostic model for prediction of machine remnant life. The proposed

prognostic model based on health state probability estimation using SVM technique

is validated through a number of case studies in the following chapters.



CHAPTER 5 MODEL VALIDATION USING SIMULATED AND EXPERIMENTAL BEARING FAILURE DATA

This chapter presents two case studies for model validation of the proposed

prognostic model using simulation data of progressive bearing failure and

experimental bearing run-to-failure data. Section 5.1 describes the bearing fault

simulation method and model validation using these simulated bearing failure data.

Section 5.2 presents the designed experimental test rig for an accelerated bearing

failure test, and how these experimental data were used for validating the prognostic

model. In addition, the proposed model is also validated through the model

comparison test with the PHM model by using identical experimental data in Section

5.3. Finally, the chapter is then summarised and concluded in Section 5.4.

5.1 Model Validation Using Simulated Bearing Fault Data

5.1.1 Simulation of Progressive Bearing Fault Data

In general, a prognostic model requires numerous sets of failure data for training and testing. Unfortunately, it takes a long time to fail a bearing, even in accelerated run-to-failure tests. To resolve this dilemma, a simulation of progressive bearing degradation data was developed by a former PhD candidate [157] as a substitute for real-life test data. The simulation provides numerous sets of data, and truncations were randomly imposed on a portion of the datasets for the validation of the prognostic model.

In this research, a vibration waveform generated by a rolling element bearing

under constant radial load with a single point defect is first modelled using the

MATLAB software and then repeatedly generated while increasing the defect

severity exponentially with some added discontinuities. To describe the

waveform generated by a rolling element bearing under constant radial load with

a single localised defect, the vibration signature can be expressed as

· · · · (5.1)

where is a series of impulses at the bearing fault frequency, the

bearing radial load distribution, the bearing-induced resonant frequency

and the exponential decay due to damping [158, 159]. The last component,

, represents the noise added to corrupt the signal.

Repetitious impulses at bearing fault frequency

An impulse is produced due to the rollover of a rolling element at the race

defect zones, which can be represented by the impulse function, . As the

shaft rotates, this impulse occurs periodically at the inner race, outer race and ball

element passing frequency, . The four characteristic fault frequencies of the

ball bearing were presented in Chapter 4. The period between the impulses will

be denoted as 1/ . With amplitude constant denoting the severity of

the defect, the series of impulses can be represented mathematically by the

equation

∑ (5.2)

Bearing Radial Load Distribution

Figure 5.1 shows the load distribution around the circumference of a rolling

element bearing. The instantaneous load at the contact point of the inner race

defect can be determined approximately by using the Stribeck equation [160] for

all | | and valued 0 everywhere else:



1 10

for | | (5.3)

This amplitude modulation affects the amplitude of impulses generated by the

defect. The impulse amplitude is assumed to be directly proportional to the

instantaneous load on the rolling element when it rolls over the defect.

Figure 5.1 Load distribution of a rolling element bearing

Bearing-induced resonant frequency

The impulses excite the natural frequencies of the bearing’s elements and its

supporting structures. Under idealised conditions, vibration induced by the

bearing at its natural frequency can be represented by a sinusoidal wave:

∑ 2 (5.4)

where denotes the resonant frequency of the bearing.

Exponential decay due to the damping

The resonant vibration is then attenuated exponentially to zero, with a

transient duration that depends on the bearing’s damping factor α. The decay

function can be defined by the equation

(5.5)

(Figure 5.1 label: ψmax, angular extent of the load zone)



Noise

The last component, , represents the noise added to corrupt the signal.

To be used for prognostic model training and validation, white Gaussian noise

with zero mean and 0.12 standard deviation was added to the signal to

simulate real life situations. It is widely known that training data that is assumed

sufficiently rich should be generated by a broadband signal, such as white

Gaussian noise [161]. White Gaussian noise can be readily generated in the

MATLAB program.

To simulate progressive degradation data, the defective bearing signal derived

in Eq. (5.1) was repeatedly generated using the “for” looping function in

MATLAB. Each repetition i represents a measurement recorded at one data

collection point. However, just as each real-life degradation data collection

would give varied vibration signal with increasing severity, the defect severity

parameter of the simulated signal should also be increased at each recording

i. It has been observed in bearing life tests that bearing degradation signals

possess an inherent exponential growth [162]. Therefore, the defect severity

parameter was increased exponentially throughout the loop.
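The looping procedure described above can be sketched in a few lines. This is a minimal illustration written in Python rather than MATLAB, and it is not the simulation code of [157]. It assumes that each impulse excites a single exponentially decaying sinusoid that dies out before the next impulse arrives, and it omits the load distribution term. The fault frequency, resonant frequency, damping factor, initial severity and growth rate are hypothetical values chosen only for illustration; the sampling frequency, the number of records and the noise standard deviation follow the text.

```python
import math
import random

def simulate_record(severity, n_samples=2000, fs=20000.0, fault_hz=35.0,
                    resonance_hz=3000.0, damping=800.0, noise_std=0.12, rng=None):
    """One vibration record: decaying resonant bursts at the fault rate plus noise."""
    rng = rng or random.Random(0)
    period = 1.0 / fault_hz                      # time between defect impulses
    record = []
    for k in range(n_samples):
        tau = (k / fs) % period                  # time since the latest impulse
        burst = (severity * math.exp(-damping * tau)
                 * math.sin(2.0 * math.pi * resonance_hz * tau))
        record.append(burst + rng.gauss(0.0, noise_std))  # white Gaussian noise
    return record

def simulate_degradation(n_records=100, d0=0.05, growth=0.04, **kwargs):
    """Repeat the record while the defect severity grows exponentially."""
    rng = random.Random(42)                      # fixed seed for repeatability
    return [simulate_record(d0 * math.exp(growth * i), rng=rng, **kwargs)
            for i in range(n_records)]

def rms(x):
    """Root mean square of one record."""
    return math.sqrt(sum(v * v for v in x) / len(x))
```

Calling `simulate_degradation()` returns 100 records whose burst amplitude grows from record to record, which is the behaviour the severity loop is meant to reproduce. Discontinuities of the kind mentioned in the text are not modelled in this sketch.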

According to the above method, outer race fault, inner race fault, ball fault and

a combination of multiple faults were simulated in this case study. To simulate

random degradation data, these simulated signals had defect impulses that

increase at different rates and discontinuities. Figure 5.2 shows the simulated

time domain signal of a defective bearing with an outer race, an inner race and a

ball defect with the shaft speed set at 600 rpm. For the training and testing of the

proposed prognostic model, two random progressive degradation data sets were

simulated, as shown in Table 5.1.

Table 5.1 Simulated progressive bearing degradation data set

Data No   Number of samples   RPM   Sampling frequency   Applied bearing faults
1         100                 600   20,000               BPFI, BPFO, BSF
2         100                 600   20,000               BPFI, BPFO, BSF



Figure 5.2 Simulated time domain signal with increasing defect impulse

5.1.2 Feature calculation and selection

In this case study, ten statistical features from the time domain data and four

parameters in the frequency domain were calculated, giving a total of 14 features

shown in Table 5.2.

Table 5.2. Statistical feature parameters and attributed label from simulated data

Time Domain Parameters: Mean{1}, RMS{2}, Shape factor{3}, Skewness{4}, Kurtosis{5}, Crest factor{6}, Entropy estimation value{7}, Entropy estimation error{8}, Histogram upper{9} and Histogram lower{10}

Frequency Domain Parameters: RMS frequency value{11}, Frequency centre value{12}, Root variance frequency{13} and Peak value{14}
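As an illustration of the time domain parameters listed in Table 5.2, the following sketch computes six of them from one vibration record. It uses the common textbook definitions (shape factor as RMS over mean absolute value, crest factor as peak over RMS, skewness and kurtosis as normalised third and fourth central moments); these definitions are assumptions here, since the exact formulas used in this work are given elsewhere in the thesis. The entropy, histogram and frequency domain parameters are not covered.

```python
import math

def time_domain_features(x):
    """Return six common time domain features of a vibration record x."""
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    abs_mean = sum(abs(v) for v in x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    skewness = sum((v - mean) ** 3 for v in x) / (n * std ** 3)
    kurtosis = sum((v - mean) ** 4 for v in x) / (n * std ** 4)
    peak = max(abs(v) for v in x)
    return {
        "mean": mean,
        "rms": rms,
        "shape_factor": rms / abs_mean,   # RMS over mean absolute value
        "skewness": skewness,
        "kurtosis": kurtosis,
        "crest_factor": peak / rms,       # peak over RMS
    }
```

For a pure sine wave these features take well-known values (kurtosis of 1.5 and crest factor of about 1.414), which gives a quick sanity check before applying the function to bearing data.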

For the better performance of SVM and the reduction of computational effort,

effective features were also selected using the evaluation method of feature

effectiveness introduced by Knerr et al. [149, 150], as described in Chapter 4.

The distance evaluation criteria ($\alpha$) of the 14 features in this work are shown

in Figure 5.3. To select the effective features, a threshold of 1.1 on the

normalized distance evaluation criterion was used, $\alpha / \bar{\alpha} > 1.1$, where

$\alpha$ is the distance evaluation criterion of a feature and $\bar{\alpha}$ is the mean value of $\alpha$. From the results,

six features were selected as effective features because their normalized criterion

exceeded the threshold level; that is, their distance evaluation criterion

($\alpha$) is large compared with the other features. The selected six features were

Skewness, Kurtosis, Entropy estimation error, RMS frequency value, Frequency

centre value and Root variance frequency value. These features have low

dispersion within the same state and high dispersion among different states.

Therefore, they can minimize the classification training error in each bearing

health state.
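The selection rule can be sketched as follows. The exact criterion is defined in Chapter 4 and is not reproduced in this section, so the sketch assumes one common form: for each feature, the mean distance between the state means divided by the mean pairwise distance within each state. The function names and the data layout are hypothetical.

```python
from itertools import combinations

def distance_criterion(samples_by_state):
    """Between-state over within-state distance for ONE feature.

    samples_by_state is a list with one entry per health state; each entry
    is a list of values of that feature (at least two values per state).
    """
    within = []
    for values in samples_by_state:
        pairs = list(combinations(values, 2))
        within.append(sum(abs(a - b) for a, b in pairs) / len(pairs))
    d_within = sum(within) / len(within)

    means = [sum(v) / len(v) for v in samples_by_state]
    mean_pairs = list(combinations(means, 2))
    d_between = sum(abs(a - b) for a, b in mean_pairs) / len(mean_pairs)
    return d_between / d_within

def select_features(features, threshold=1.1):
    """Keep features whose criterion, normalized by the mean, exceeds the threshold."""
    alphas = {name: distance_criterion(s) for name, s in features.items()}
    mean_alpha = sum(alphas.values()) / len(alphas)
    return [name for name, a in alphas.items() if a / mean_alpha > threshold]
```

A feature with tight clusters and well-separated state means obtains a large criterion and is kept, while a feature whose states overlap is rejected, which matches the low within-state and high between-state dispersion described above.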

Figure 5.4 presents the trends of the selected features for health state

estimation of bearing failure. As shown in Figure 5.4, most of the selected

features follow the gradual progression of bearing degradation well. The

Skewness, Kurtosis and Entropy estimation error values increased as time

passed, whereas the other features decreased.

Figure 5.3 Feature selection using distance evaluation criterion (Simulation test)



Figure 5.4 Trends of selected features for simulation test

5.1.3 Health State Estimation and Prediction of RUL

In this simulation test, the degradation steps of bearing failure were simply

divided into ten health stages for health state estimation without prior analysis of

failure pattern because the trends of selected features are not highly fluctuating

but are observed to be growing exponentially as shown in Figure 5.4.

The polynomial function was used as the basic kernel function of SVM. As a

multi-class classification method of SVM, the one-against-one (OAO) method

was applied to perform the health state probability estimation of bearing

degradation, as described in Chapter 3. Sequential minimal optimization (SMO)

proposed by Platt [163] was used to solve the SVM classification problem. For



selection of the optimal kernel parameters (C, γ, d), cross-validation was used to

obtain effective classification performance, following the procedure suggested by Hsu

et al. [164] so as to avoid over-fitting or under-fitting.
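The one-against-one scheme can be illustrated with a small sketch. A binary decision is made for every pair of health states and the share of pairwise wins is reported per state. Two simplifications are assumed here and are not taken from this work: a nearest-centroid rule stands in for each trained binary SVM, and the vote share is used as a simple stand-in for the state probability (the estimation actually used is described in Chapter 3). All function names are hypothetical.

```python
from itertools import combinations

def centroids_by_state(train):
    """train maps a state label to a list of feature vectors."""
    return {state: [sum(col) / len(col) for col in zip(*rows)]
            for state, rows in train.items()}

def oao_state_probabilities(centroids, x):
    """Share of pairwise (one-against-one) wins for each state, given sample x."""
    votes = {state: 0 for state in centroids}
    for a, b in combinations(centroids, 2):
        # Stand-in binary classifier: the closer centroid wins the pair.
        da = sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[a]))
        db = sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[b]))
        votes[a if da <= db else b] += 1
    n_pairs = len(centroids) * (len(centroids) - 1) // 2
    return {state: v / n_pairs for state, v in votes.items()}
```

With ten health states there are 45 pairwise classifiers, so the cost of this scheme grows quadratically with the number of states.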

In this work, simulated bearing degradation data were divided into ten

degradation stages for the estimation of health state probability and prediction of

remnant life using the six selected features. In this RUL prediction of bearing

failure, closed and open tests were conducted. The closed test was conducted

using identical data sets for model training and test. On the other hand, different

test data sets were applied in the open test using identical training data sets which

were used in the closed test.

Closed Test of Simulation Data

In the closed test, once the ten states were trained using the six selected

features from Data1, the full data set of Data1 (100 samples) was tested to

obtain the probability of each health state from the results of the SVM multi-class

classification, as described in Chapter 3.

Figure 5.5 shows the probability distribution of each health state of simulated

data1 that was also used for training of the ten degradation states. The first stage

probability started at 100% and decreased as the next state probability

increased.

Figure 5.5 Probability distribution of each health state

(Closed Test Using Simulation Data 1)



Although there were some overlaps in the middle zone of the display, the

probabilities of each health state well explain the sequence of ten degradation

states over the entire sample. In particular, the initial and final states are distinctly

separated.

For the prediction of RUL, the expected life was calculated using the time of

each training data set ( ) and their probabilities of each health state as expressed

in Eq. (3.7). Figure 5.6 shows the result of remnant life prediction and the

comparison between actual remaining life and estimated life. As shown in Figure

5.6, the overall trend of the estimated life follows the real remaining life of the

bearing failure. The average prediction value was 95.05% over the entire

range of the data set. The average prediction value was calculated using the

following equation.

$\text{Average prediction value}\ (\%) = \frac{1}{N}\sum_{i=1}^{N}\left(1 - |\mu_i - \mu'_i|\right) \times 100$ (5.6)

where $N$ is the number of samples, $\mu_i$ is the actual RUL (%) and $\mu'_i$ is the estimated

RUL (%).

Figure 5.6 Comparison of actual RUL and estimated RUL

(Closed Test Using Simulation Data 1)

Open Test of Simulation Data

The open test on the second set of simulated bearing failure data (Data2),

which consists of 100 samples, was conducted using identical training data (Data1).

Figure 5.7 shows the probabilities of each health state of Data2. Compared with



the closed test result from Data1, the first state probability shows a long-deferred

interval, and the final state probability does not reach a higher probability than

the preceding state.


Figure 5.7 Probability distribution of each health state

(Open Test Using Simulation Data 2)

The RUL is also estimated by using the time of each training data set and their

probabilities of each health state as depicted in Eq. (3.7). Figure 5.8 shows

comparison result between the estimated RUL and the actual RUL. Although

there are some margins of error in initial states, the estimated life in the latter half

of samples matches closely with the real remaining life of bearing failure. The

average prediction value was also calculated using Eq. (5.6). The average

prediction value was 92.5% over the entire range of the data set.
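The two calculations used in these tests can be sketched as follows. Eq. (3.7) is not reproduced in this chapter, so the expected life is assumed here to be the probability-weighted sum of the remaining life associated with each trained health state. The average prediction value is assumed to be the mean of one minus the absolute difference between actual and estimated RUL, with both expressed as fractions of the total life and the result reported in percent. Function and argument names are hypothetical.

```python
def expected_rul(state_probabilities, state_rul):
    """Probability-weighted remaining life over the trained health states."""
    return sum(p * r for p, r in zip(state_probabilities, state_rul))

def average_prediction_value(actual, estimated):
    """Mean of (1 - |actual - estimated|) in percent; inputs are fractions in [0, 1]."""
    n = len(actual)
    return 100.0 * sum(1.0 - abs(a - e) for a, e in zip(actual, estimated)) / n
```

For example, a sample assigned equally to two states with remaining lives of 1.0 and 0.5 has an expected remaining life of 0.75 under this assumption.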


Figure 5.8 Comparison of actual RUL and estimated RUL

(Open Test Using Simulation Data 2)



5.2 Model Validation Using Experimental Bearing Failure Data

5.2.1 Design and Setup of Experimental Test Rig for Accelerated

Bearing Failure Test

In order to study the capabilities of the proposed prognostic model in a timely

manner, a test rig was designed to facilitate accelerated bearing life tests. The

schematic of the test rig is depicted in Figure 5.9.

Figure 5.9 Schematic of the bearing test rig

Figure 5.10 The test rig after assembly of all components

(Figure 5.9 and Figure 5.10 labels: Motor, Coupling, Bearings 1 to 4, Radial load, Spring load system, Accelerometers, AE sensors, Thermocouple)



This test rig has four test bearings on a shaft driven by an AC motor.

Couplings were used so that when a bearing fails, it can be extracted and replaced

easily without having to move the other bearings on the shaft. Figure 5.10 shows

the test rig after assembly of all components. As shown in Figure 5.10, a spring

was designed to apply a spring load on the two middle bearings (Bearings 2 and

3). The load can be adjusted accordingly by tightening or loosening the screw on

the spring mechanism. The two bearings at each shaft end will undergo the same

amount of load as the middle bearings due to the reaction force at the support.

Another advantage of being able to run four bearings at once is the option to

run bearings from brand new, defect-free condition to failure in a timely manner.

In this way, when a bearing is failing, the degradation of the other three is also

accumulating. Therefore the test will take a shorter time than one that runs from

brand new to failure one by one. Two accelerometers, two acoustic emission

(AE) sensors and a thermocouple were attached on each bearing housing

(Bearings 2 and 3) for measurement reading. Figure 5.11 shows the close-view of

the middle bearing assembly.

Figure 5.11 Close view of the middle bearing assembly



5.2.2 Accelerated Bearing Run-to-Failure Test

Prognostic experiments with test bearings that are induced with a prominent

crack or hole are less likely to develop natural defect propagation in the early

stages. Therefore, the accelerated bearing run-to-failure tests were conducted

with defect-free condition of bearings and excessive overloading conditions. In

this experimental test, SMT 61806 single row deep groove ball bearings were

used for the run-to-failure test at constant 1300 rpm of rotation speed. Table 5.3

summarizes the bearing specifications.

Table 5.3 Test bearing specifications for experiment

Parameter                Value
Inner Diameter           30 mm
Outer Diameter           42 mm
Width                    7 mm
Dynamic Load Rating      4.29 kN
Static Load Rating       2.9 kN
Fatigue Load Limit       0.146 kN
Reference Speed Rating   3200 rpm

Ball bearings were selected because of their lower load capacity and

premature failure with an over-load of the bearing. The 61806 bearings were

chosen because they have small balls but relatively large bore diameter. This

feature will ensure that the high load will be able to degrade the bearings without

bending and damage of the shaft.

Figure 5.12 shows the failed bearing after the run-to-failure test. In this

bearing run-to-failure test, two sets of bearing failure data were collected with

identical condition for the proposed model validation. The data sampling rate was

250 kHz and data collections were conducted by a National Instruments

LabVIEW program. The two collected vibration data sets are summarised in

Table 5.4.

Table 5.4 Experimental bearing failure data set

Test No   Number of samples   Bearing position   RPM    Sampling frequency   Total operation time
1         912                 3                  1300   250K                 683 Min
2         810                 3                  1300   250K                 579 Min



Figure 5.12 The picture of failed bearing after run-to-failure test

5.2.3 Feature Calculation and Selection

Using vibration and AE data from the experimental test, a total of 28 features

were calculated from the time domain and the frequency domain. The same

features evaluation method as depicted in Chapter 4 was used for the selection of

effective features for the estimation of health state probability.

Figure 5.13 shows the distance evaluation criterion (α ) of 28 features in this

work. To select the effective features, a threshold of 1.9 on the normalized

distance evaluation criterion was used, $\alpha / \bar{\alpha} > 1.9$, where $\bar{\alpha}$ is the mean value of $\alpha$.

From the results, four features were selected as effective features compared with

the other features. The four selected features were RMS, entropy estimation value,

histogram upper value from vibration data and peak value from AE data. The

selected features are described in detail in Chapter 2.3. Figure

5.14 presents the trends of each of the selected four features. The trends of the

four selected features show the dynamic and stochastic process of the real

bearing failure.



Figure 5.13 Feature selection using distance evaluation criterion (Experimental Test)

Figure 5.14 Trends of selected features for experimental test

5.2.4 Health State Estimation and Prediction of RUL

Through the prior analysis of failure patterns, six discrete degradation stages

were determined as the number of health states of bearing failure in this

experimental test because they indicated discrete health states relating to bearing

failure over the time of test. The prediction tests of bearing failure were



performed using the four selected features above. The training data sets for health

state estimation are summarized in Table 5.5.

Table 5.5 Training data sets for health state probability estimation of experimental test

State No.   Sample numbers   Average operation time (min)   RUL (%)   No. of features
1           1 ~ 10           9                              98.7%     4
2           301 ~ 310        571                            16.4%     4
3           501 ~ 510        608                            11.0%     4
4           701 ~ 710        645                            5.6%      4
5           801 ~ 810        663                            2.9%      4
6           903 ~ 912        682                            0.1%      4
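The RUL percentages in Table 5.5 are consistent with a simple calculation from the average operation time of each state and the total operation time of the first test (683 minutes, Table 5.4). The sketch below assumes that this is how the column was obtained; the names are hypothetical.

```python
def rul_percent(operation_time, total_life):
    """Remaining useful life as a percentage of the total life."""
    return 100.0 * (total_life - operation_time) / total_life

TOTAL_LIFE_MIN = 683                               # Test 1, Table 5.4
AVG_OPERATION_TIME_MIN = [9, 571, 608, 645, 663, 682]

rul_table = [round(rul_percent(t, TOTAL_LIFE_MIN), 1)
             for t in AVG_OPERATION_TIME_MIN]
```

Under this assumption `rul_table` reproduces the six tabulated values, which also shows how strongly the training states are concentrated in the last part of the bearing life.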

The polynomial function was used as the basic kernel function of SVM. In

multi-class classification method using SVMs, the OAO method was applied to

perform the health state probability estimation of bearing failure as described in

Chapter 3. In this experimental test of bearing failure, closed and open tests were

also conducted.

Closed Test of Experimental Data

Once the six health states were trained using the four selected features from

experimental data1 as depicted in Table 5.5, the full data sets of data1 (912

samples) were tested to obtain each health state’s probabilities.

Figure 5.15 shows the probabilities of each state of the experimental data1 that

was also used for training of the health states. The probability variation of health

state was perceived after 278 samples because an abnormal condition of bearing

was detected at this point of time. In general, the abnormal condition of the

bearings suddenly occurred at the early stage of defect development and

degraded rapidly. The probability distribution of the bearing health state

effectively presented the transition of bearing conditions as shown in Figure 5.14.

The entire probabilities of each stage explain the sequence of six degradation

states after starting at the abnormal condition, and are distinctly separated as

shown in Figure 5.15. The training error value was about 1.7% for the six health



states.


Figure 5.15 Probability distribution of each health state

(Closed Test Using Experimental Data 1)

The expected life was also calculated by using the time of each training data

set ( ) and their probabilities of each health state as expressed in Eq. (3.7).



Figure 5.16 shows the closed test result with comparison between actual RUL

and estimated RUL. As shown in Figure 5.16, there were high margins of error

between the actual remaining useful life and the estimated life in the initial state

because of the long duration time of the normal condition. However, the

estimated life closely followed the actual remaining life after the beginning of

abnormal condition (540 minutes). The accuracy of prediction was also gauged

using Eq. (5.6). The average prediction value was 86.32% over the entire



range of data set.

Figure 5.17 Close view of the period of bearing fault condition


Figure 5.17 shows the close view of the period of bearing fault condition with

comparison between actual remaining useful life and estimated life. After the

start of the bearing fault, the estimated RUL was closely matched with the actual

remaining life. The average prediction value after the beginning of the abnormal

condition (from 540 minutes) was 97.67%.

Open Test of Experimental Data

The second experimental test data consisted of the 810 sample sets employed

for the open test using identical training data as depicted in Table 5.5. Figure 5.18

shows the test results of probabilities of each health state.

As shown in Figure 5.18, the probability variations began after around 600

samples because an abnormal condition started at the time of about 600 samples

in the case of the second bearing test. Compared with the former result (Closed

Test), the probabilities of five states showed relatively low values and were hard

to discern in the probability distribution. However, the probability distribution of

each health state effectively represented the dynamic degradation process of the

bearing health state after the beginning of the abnormal condition.




Figure 5.18 Probability distribution of each health state

(Open Test Using Experimental Data 2)

Figure 5.19 shows the comparison between the actual RUL and the estimated

RUL. The estimated life of open test (data 2) also started to follow the actual

remaining life after the beginning of the abnormal bearing condition. The average

prediction value was 38.93% over the entire range of data set.





Figure 5.20 Close view of the period of bearing fault condition


Figure 5.20 shows the close view of the period of bearing fault condition for

the open test. Compared with the result of the closed test as shown in Figure 5.17,

the prediction result showed low accuracy until after the start of the

abnormal condition at 540 minutes. Furthermore, the difference between the

actual RUL time and the estimated RUL time at initial health state originated

from the different life time between training data (First Data Set, 683 minutes)

and test data (Second Data Set, 579 minutes) as described in Table 5.4. These

results indicate that accurate estimation of health states is achievable for

prediction of machine remnant life. Moreover, the proposed model also has the

capability to indicate abnormal machine conditions.

5.3 Model Comparison Using PHM

5.3.1 Proportional Hazard Model (PHM)

The proportional hazard model (PHM), which was originally proposed in the

medical research field, can model the uncertain relationships between multiple

indicators and time dependent failure rate. Cox's PH model [165] is a widely

accepted semi-parametric model for analysis of failures with covariates. It has

been successfully used for survival analyses in medical areas and reliability

predictions in accelerated life testing. In this case study, to compare the

performance of the proposed model, a model comparison was conducted using

the commonly used PHM because this model is also performed based on



historical failure data.

PHM is developed based on the hazard rate function and assumes that the hazard rate $h(t; Z)$ under the covariate $Z$ is the product of an unspecified baseline hazard rate $h_0(t)$ and a relative risk ratio, $\exp(\gamma' Z)$, where $\gamma$ is the regression coefficient vector. The model can be generally expressed as:

$h(t; Z) = h_0(t) \exp(\gamma' Z)$ (5.7)

The significant flexibility of PHM is that the regression coefficients can be estimated by maximizing the corresponding partial likelihood function without specifying the baseline $h_0(t)$. On the other hand, if the baseline hazard is specified, the usual maximum likelihood approach can be carried out to estimate the parameters in the model.

Considering the hard failure by the baseline function and degradation simultaneously, the hazard rate in the form of PHM can be expressed as:

$h(t; Z(t)) = h_0(t) \exp(\gamma' Z(t))$ (5.8)

where $Z(t) = (z_1(t), z_2(t), \ldots, z_n(t))'$ consists of n degradation features at given time $t$. Note that the conditional hazard rate in Eq. (5.7) is a function of time only. The corresponding reliability function conditional on the history of degradation features up to time $t$ is:

$R(t \mid Z(s): 0 \le s \le t) = \exp\left(-\int_0^t h(s; Z(s))\, ds\right)$ (5.9)

For failure time distribution, the Weibull distribution is widely used. In the special case where the baseline hazard has the two-parameter Weibull form, this yields:

$h(t; Z(t)) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta - 1} \exp(\gamma' Z(t))$ (5.10)

where $\beta > 0$ and $\eta > 0$ are the shape and scale parameters of Weibull respectively. The model is referred to as the Weibull PH model. This model is utilized in this case study.

In order to estimate the parameters in the PHM, it is necessary to have the

historical data collected under the given operating conditions. The data consist of

aging times, feature sample paths and indicators of events (failure versus

censored). Then the likelihood function of the collected data is given by:

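$L(\beta, \eta, \gamma) = \prod_{i \in F} h(t_i; Z_i(t_i)) \prod_{j \in F \cup C} R(t_j \mid Z_j(s): 0 \le s \le t_j)$ (5.11)

where $h$ is the hazard rate of Eq. (5.10) with Weibull shape $\beta$, scale $\eta$ and regression coefficients $\gamma$, $R$ is the reliability function of Eq. (5.9), $F$ is the set of failure times, $C$ is the set of surviving times, $t_i$ is the failure time of the $i$th unit and $t_j$ is either the failure time or the surviving time of the $j$th unit. The loglikelihood function can be expressed as:

$l(\beta, \eta, \gamma) = \sum_{i \in F} \ln h(t_i; Z_i(t_i)) - \sum_{j \in F \cup C} \int_0^{t_j} h(s; Z_j(s))\, ds$ (5.12)

where $\ln h(t_i; Z_i(t_i))$ is the log-hazard rate and the integration $\int_0^{t_j} h(s; Z_j(s))\, ds$ is implemented using the adaptive Simpson quadrature rule. The maximum likelihood estimates (MLEs) $\hat{\beta}$, $\hat{\eta}$ and $\hat{\gamma}$ can be obtained by maximizing the loglikelihood function using the Nelder-Mead algorithm. Then, the MLEs of the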

reliability indices of interest can be obtained by substituting the MLEs of the

model parameters.
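The Weibull PH model and its reliability function can be sketched numerically. The sketch assumes the standard two-parameter Weibull PH form, with `beta` the shape, `eta` the scale and `gamma` the regression coefficients; these names are chosen here for illustration. It uses a fixed-step composite Simpson rule rather than the adaptive Simpson rule mentioned above, and it assumes `beta >= 1` so that the hazard is finite at time zero. The parameter fitting itself (Nelder-Mead maximization of the loglikelihood) is not shown.

```python
import math

def weibull_ph_hazard(t, z, beta, eta, gamma):
    """Weibull baseline hazard multiplied by the relative risk exp(gamma . z)."""
    baseline = (beta / eta) * (t / eta) ** (beta - 1.0)
    return baseline * math.exp(sum(g * zi for g, zi in zip(gamma, z)))

def reliability(t, covariates, beta, eta, gamma, n_intervals=200):
    """R(t) = exp(-integral of the hazard from 0 to t), composite Simpson rule.

    covariates is a function s -> list of feature values z(s);
    n_intervals must be even.
    """
    if t <= 0.0:
        return 1.0
    step = t / n_intervals
    def hazard(s):
        return weibull_ph_hazard(s, covariates(s), beta, eta, gamma)
    total = hazard(0.0) + hazard(t)
    for k in range(1, n_intervals):
        total += (4.0 if k % 2 else 2.0) * hazard(k * step)
    return math.exp(-total * step / 3.0)
```

With all covariates equal to zero the model reduces to the plain Weibull reliability exp(-(t/eta)^beta), which is a convenient check of the integration.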

5.3.2 Prediction of Remnant Life Using PHM

This comparative study was conducted using the PHM algorithm developed in

[166]. Two vibration and AE data sets collected from the bearing test rig as

shown in Table 5.4 were also used for the model comparison. For the comparison

under identical conditions, the four selected features in the above section such as

RMS, entropy estimation value, histogram upper value from vibration data and

peak value from AE data were also used for the prediction of RUL using PHM.



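The parameters of the PHM model were identified using the likelihood function given by Equation (5.12). In order to obtain a better fit, the features were transformed by taking the natural logarithm and denoted by $z_1(t) = \ln(\text{RMS})$, $z_2(t) = \ln(\text{Entropy Estimation})$, $z_3(t) = \ln(\text{Histogram Upper})$ and $z_4(t) = \ln(\text{Peak})$ respectively. For the PHM:

$h(t; Z(t)) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta - 1} \exp\left(\gamma_1 z_1(t) + \gamma_2 z_2(t) + \gamma_3 z_3(t) + \gamma_4 z_4(t)\right)$ (5.13)

where $\beta$ and $\eta$ are the Weibull shape and scale parameters and $\gamma_1, \ldots, \gamma_4$ are the regression coefficients. The MLEs of the parameters of the PHM are presented in Table 5.6.

Table 5.6 Estimated parameters of PHM using experimental data 1

$\hat{\beta}$    $\hat{\eta}$   $\hat{\gamma}_1$   $\hat{\gamma}_2$   $\hat{\gamma}_3$   $\hat{\gamma}_4$
2.3988   541.7   0.4717   0.5874   0.0025   3.678e-014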

Using these parameters, the RULs of the bearing failure were estimated

respectively. In the closed test, Table 5.7 presents the prediction results both for

the PHM and the proposed model including comparison with the actual RUL

after the start of the abnormal bearing condition (570 min).

Table 5.7 Comparison of RUL prediction between PHM and proposed model (Closed Test using experimental data 1)

Time (minute)   Actual RUL   Estimated RUL-PHM   Estimation Error-PHM   Estimated RUL-Proposed Model   Estimation Error-Proposed Model
570             113          357                 244                    112                            1
580             103          234                 131                    97                             6
590             93           232                 139                    101                            8
600             83           210                 127                    71                             12
610             73           184                 111                    68                             5
620             63           152                 89                     38                             25
630             53           120                 67                     45                             8
640             43           110                 67                     68                             25
650             33           89                  56                     42                             9
660             23           66                  43                     20                             3
670             13           51                  38                     25                             12
680             3            46                  43                     8                              5
683             0            47                  46                     3                              3

In this Table, it can be seen that the estimated RUL from the proposed model

are in accordance with the actual remaining life of bearing, and outperform the

ones from the PHM model. Although the estimated RUL from PHM approached

the actual RUL closely according to the degradation of the bearing, the prediction

of the RUL still has significant difference between the actual RUL and the

estimated RUL compared to the results of the proposed model.
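The comparison in Table 5.7 can be condensed into one figure per model. The sketch below recomputes the absolute errors from the actual and estimated RUL rows and averages them; the mean absolute error is a summary chosen here for illustration and is not a metric reported in this work. Recomputed errors may differ from the tabulated error rows by a minute, presumably because the tabulated estimates are rounded.

```python
ACTUAL_RUL = [113, 103, 93, 83, 73, 63, 53, 43, 33, 23, 13, 3, 0]
PHM_RUL = [357, 234, 232, 210, 184, 152, 120, 110, 89, 66, 51, 46, 47]
PROPOSED_RUL = [112, 97, 101, 71, 68, 38, 45, 68, 42, 20, 25, 8, 3]

def mean_abs_error(actual, estimated):
    """Mean absolute difference between actual and estimated RUL (minutes)."""
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)

mae_phm = mean_abs_error(ACTUAL_RUL, PHM_RUL)
mae_proposed = mean_abs_error(ACTUAL_RUL, PROPOSED_RUL)
```

Over the 13 time points of the closed test this gives a mean absolute error of roughly 92 minutes for the PHM and roughly 9 minutes for the proposed model.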



Table 5.8 shows the open test result of the second experimental data using

identical training data (Data 1) after the start of the bearing fault condition. In the

case of the second experimental test, it had different bearing degradation pattern

with long duration of normal condition (around 540 minutes) and rapid failure

after the start of the faulty condition compared with the first experimental data.

As shown in Table 5.8, the PHM model cannot provide accurate prediction

results compared with the actual RUL and the results of the proposed model.

Although the estimated RULs of PHM move closer to the actual RULs as the final

bearing failure approaches, the prediction of the RUL still has a significant

difference between the actual remaining life and the estimated life shown in

Table 5.8. For instance, the PHM still has high estimation error value (108

minutes) compared with the estimation error of the proposed model (32 minutes)

at the final bearing failure stage (578 minutes). In this case study, it can be seen

that the proposed model provides a more accurate prediction capability than the

PHM model in these bearing failure cases.

The above prediction result of PHM originates from insufficient historical

events in this case study. For better prediction using PHM, extensive data on a

substantial number of failures are required. However, in this case study, only one failure data set

was available to be used for the prediction test. Moreover, test data with a

considerably different lifetime from that of the training data can result in a large

estimation error in the prediction of RUL.

Table 5.8 Comparison of RUL prediction between PHM and proposed model (Open Test using experimental data 2)

Time (minute)   Actual RUL   Estimated RUL-PHM   Estimation Error-PHM   Estimated RUL-Proposed Model   Estimation Error-Proposed Model
549             29           540                 511                    337                            308
552             26           366                 340                    82                             56
555             23           280                 257                    90                             67
558             20           238                 218                    79                             59
561             17           204                 187                    53                             36
564             14           178                 164                    12                             2
567             11           157                 146                    12                             1
570             8            140                 132                    27                             19
573             5            126                 121                    19                             15
576             2            114                 113                    34                             32
578             0            109                 108                    33                             32

The estimation of survival life using the PHM is based on the prediction of

degradation indicators. It is required to predict the degradation features and



consider the degradation characteristics [166]. For better prediction result of

PHM, stochastic process fitting methods are required for the dynamic and

stochastic degradations of machine failure. In this comparison study, each

bearing degradation feature showed a long and flat region of normal bearing

condition before the degradation initiation, as shown in Figure 5.14. Therefore, it

is inappropriate to fit the degradation features globally using nonlinear functions

of time. The local fitting of degradation features after degradation initiation

appears to be more appropriate for nonlinear model fitting in this case.

5.4 Summary

In this chapter, the proposed prognostics model was validated using two sets of

bearing failure data; the bearing fault simulation data and experimental bearing run-

to-failure data. In addition, model comparison study was also conducted using the

commonly applied PHM model.

Two vibration waveform data sets, each combining multiple faults (outer race

fault, inner race fault and ball fault), were simulated with exponentially

increasing defect severity and some added discontinuities for the

validation of the proposed model. For the experimental validation of the proposed

model, the accelerated bearing failure test rig was designed and developed. Then two

bearing run-to-failure tests were conducted to obtain the progressive bearing failure

data for prognostics. To increase the performance of the SVM classifier and the

selection of sensitive degradation features for the health state estimation, effective

features were selected using an evaluation method of feature effectiveness.

The results from two actual case studies indicate that accurate estimation of health

states is achievable and also provides long-term prediction of machine remnant life.

In addition, the results of the experimental test show that the proposed model also

has the capability to provide early warning of abnormal machine conditions.

Through the comparison study using PHM, it was verified that the proposed

prognostic model based on health state probability estimation can provide a more



accurate prediction capability than the commonly used PHM in this bearing failure

case study.


CHAPTER 6 MODEL VALIDATION THROUGH INDUSTRY CASE STUDY

To verify the applicability of the proposed model in real industry, the model was evaluated through two industry case studies using HP-LNG pump failure data. Section 6.1 presents the prognostics of impeller rubbing failure of an HP-LNG pump. In this case study, two sets of impeller-rubbing data were analysed and used to predict the remnant life of a pump based on estimation of health state probability using the SVM classifier as described in Chapter 3. In Section 6.2, the second case study was conducted using two sets of bearing failure data from another HP-LNG pump. In addition, the optimal number of health states of bearing failure was investigated by comparing a range of numbers of health states. The assessment results of each case study are summarised in Section 6.3.

6.1 Prognostics of Impeller Rubbing Failure in HP-LNG Pump

6.1.1 Data Acquisition of Excessive Impeller Rub in HP-LNG Pump

The identical HP-LNG pumps presented in Chapter 3 were used in the case study of impeller rubbing failure prognostics. Two sets of progressive impeller rubbing failure data were applied to predict the RUL of the pump. The excessive impeller rub fault was confirmed through the historical CM data and maintenance record analysis as described in Chapter 3.

The acquired vibration data from the pumps are summarized in Table 6.1. As shown in Table 6.1, a total of 50 vibration samples from the P701 B pump and 55 vibration samples from the P701 D pump were collected over the full pump life. The P701 D data were used for training and the closed test, and the P701 B data for the open test of the proposed prognostics model. Although these two impeller-rub cases had different fault severities due to the impeller and housing wear, the faults showed similar failure patterns over the total operational time.

Table 6.1 Acquired impeller rubbing data from the HP-LNG pump

Machine No.   Total operation hours   Reason for removal & root cause                        No. of sample data   Sampling frequency
P701 B        2,488 hrs               High vibration & excessive wear of impellers (#1-7)    50                   8,192 Hz
P701 D        2,218 hrs               High vibration & excessive wear of impellers (#1-9)    55                   8,192 Hz


A total of 14 features (14 parameters, 1 position) from the vibration data were used for the prognostics of impeller rubbing failure. To select the effective features, the distance evaluation criterion (α) was calculated for each of the 14 features according to the method described in Chapter 3. The resulting values are shown in Figure 6.1. Features whose normalized distance evaluation criterion exceeded 1.5, that is $\alpha / \bar{\alpha} > 1.5$ where $\bar{\alpha}$ is the mean value of α, were selected. As shown in Figure 6.1, five features met this threshold in this prognostics test: RMS, Entropy estimation, Peak, Histogram upper and Histogram lower values.
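The thresholding step described above can be sketched as follows. This is a minimal illustration that assumes the criterion values α have already been computed with the Chapter 3 method; the function name `select_effective_features` and the numeric values are hypothetical and are not the values plotted in Figure 6.1.

```python
def select_effective_features(alpha, threshold=1.5):
    """Return the features whose normalized criterion alpha / mean(alpha) exceeds the threshold."""
    mean_alpha = sum(alpha.values()) / len(alpha)
    return [name for name, value in alpha.items() if value / mean_alpha > threshold]


# Hypothetical criterion values, for illustration only.
alpha = {"Mean": 0.2, "RMS": 2.0, "Kurtosis": 0.4, "Peak": 1.8, "Crest factor": 0.6}
selected = select_effective_features(alpha, threshold=1.5)
print(selected)
```

Because the threshold is applied to the ratio against the mean, the number of selected features depends on how unevenly the criterion is spread across the candidate features, not on its absolute scale.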



Figure 6.1 Feature selection using distance evaluation criterion for prognostics

6.1.3 Health State Estimation

Based on historical data analysis of the impeller rubbing fault cases, six discrete degradation stages were adopted as the health states of impeller rubbing failure, because they represent progressive stages of fault severity over the operation of the pumps. The training sets were chosen through historical failure pattern analysis so that they effectively represent each discrete health state of the impeller rubbing failure. Table 6.2 shows the training data sets of the six health states used to obtain the probability distribution of each health state.

Table 6.2 Training data sets for the health state probability estimation (P701 D)

State No.   No. of samples   Average operation hours   RUL (%)   No. of features
1           1 ~ 5            232                       89.5%     5
2           6 ~ 10           819                       63.1%     5
3           16 ~ 20          1,395                     37.1%     5
4           31 ~ 35          1,611                     27.4%     5
5           36 ~ 40          1,734                     21.8%     5
6           51 ~ 55          2,081                     6.2%      5
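The RUL (%) column is numerically consistent with one minus the ratio of the average operation hours to the total operation hours of P701 D (2,218 hours, Table 6.1). The check below is a sketch under that assumption; the function name `rul_percent` is illustrative and does not come from the thesis.

```python
TOTAL_HOURS_P701D = 2218  # total operation hours of P701 D (Table 6.1)


def rul_percent(avg_hours, total_hours):
    """Remaining useful life as a percentage of the total operating life."""
    return 100.0 * (1.0 - avg_hours / total_hours)


# Average operation hours of each health state (Table 6.2).
avg_hours_by_state = {1: 232, 2: 819, 3: 1395, 4: 1611, 5: 1734, 6: 2081}
rul_by_state = {
    state: round(rul_percent(hours, TOTAL_HOURS_P701D), 1)
    for state, hours in avg_hours_by_state.items()
}
print(rul_by_state)  # {1: 89.5, 2: 63.1, 3: 37.1, 4: 27.4, 5: 21.8, 6: 6.2}
```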

A polynomial function was used as the basic kernel function in this case study. The one-against-one (OAO) method for multi-class classification was applied to estimate the health state probabilities as described in Chapter 3. Sequential minimal optimization (SMO), as proposed by Platt [163], was used to solve the SVM classification problem. The optimal kernel parameters (C, γ, d) were selected by cross-validation to avoid over-fitting and under-fitting problems, as suggested by Hsu [164].

In this RUL prediction of impeller rubbing failure, closed and open tests were

also conducted. In the closed test, the six health states were trained using the

listed training data sets shown in Table 6.2, and full data sets from P701 D (55

data sets) were tested to obtain the probabilities of the six degradation states

using Eq. (3.5).

Figure 6.2 shows the probabilities of each state of P701 D. The first state probability started at 100% and decreased as the next state probability increased. For example, the probability of the first state (solid line) drops to zero while that of the second state (dotted line) simultaneously reaches 100%. Some overlap between the states and the non-uniformity of the distribution could be due to the complex nature of machine degradation and the uncertainty of machine health condition in a real environment. Overall, the state probabilities follow a non-linear degradation process and are distinctly separated.

Figure 6.2 Probability distribution of each health state (Closed Test, P701 D)

In the open test, similar impeller rubbing failure data (P701 B), consisting of 50 sample sets, were tested to obtain the probability distribution of each health state of P701 B using the identical training data sets shown in Table 6.2. Figure 6.3 shows the probability distribution of each health state of P701 B. A similar non-linear probability distribution and overlap between states were observed, for the reasons explained above. The fourth state has relatively low probabilities of about 20%, concentrated in the middle zone. The final (sixth) state remains at 100% probability for a longer period before final failure than in the case of P701 D.

Figure 6.3 Probability distribution of each health state (Open Test, P701 B)

6.1.4 RUL Prediction

The machine remnant life for rubbing failure was estimated using the historical operation hours ( ) of each training data set (see Table 6.2) and their probabilities evaluated using Eq. (3.5). Figure 6.4 shows the closed test result of the estimated remnant life and the comparison between actual RUL and estimated RUL. As shown in Figure 6.4, although there are some discrepancies in the middle zone of the plot, the overall trend of the estimated RUL follows the gradient of the actual RUL of the machine. The average prediction accuracy was 95.6%, calculated using Eq. (5.6) over the entire range of the data set. Furthermore, the estimated RUL at the final state matched closely with the actual RUL.
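The way state probabilities and historical operation hours combine into a remnant life estimate can be sketched as a probability-weighted sum. This is an assumption made for illustration: Eq. (3.5) and the thesis's RUL equation are not reproduced in this chapter, so the exact form used there may differ. The remaining hours per state are derived here from Table 6.2 and the 2,218 total hours of P701 D, and the probability vector is hypothetical.

```python
TOTAL_HOURS = 2218  # total operation hours of P701 D (Table 6.1)
AVG_HOURS = [232, 819, 1395, 1611, 1734, 2081]  # per health state (Table 6.2)
REMAINING_HOURS = [TOTAL_HOURS - h for h in AVG_HOURS]


def estimate_rul(state_probs, remaining_hours=REMAINING_HOURS):
    """Probability-weighted remaining life in hours (assumed form, see lead-in)."""
    if len(state_probs) != len(remaining_hours):
        raise ValueError("one probability is required per health state")
    if abs(sum(state_probs) - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    return sum(p * r for p, r in zip(state_probs, remaining_hours))


# Hypothetical sample lying halfway between health states 2 and 3.
rul_hours = estimate_rul([0.0, 0.5, 0.5, 0.0, 0.0, 0.0])
print(rul_hours)  # 1111.0
```

Under this form, overlapping state probabilities produce a smooth transition of the estimate between the remaining hours of neighbouring states instead of a step change.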




In the open test, the remnant life of the P701 B pump was estimated using the historical operation hours ( ) of the identical training data sets and the probabilities shown in Figure 6.3.

Figure 6.5 shows the open test result of the estimated remnant life and the comparison between actual RUL and estimated RUL. There is a large difference in remnant life at the initial degradation states, as shown in Figure 6.5. This is because the estimated time was calculated from the training data sets (P701 D), which had 2,218 hours of total operation as listed in Table 6.1, whereas P701 B operated for 2,488 hours. This causes the discrepancy between actual RUL and estimated RUL at the beginning of the test. However, as the pump approached final failure, the estimated RUL matched the actual RUL more closely than in the initial and middle states.
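The size of this initial offset can be illustrated with the total operation hours in Table 6.1. The sketch assumes that any estimate built from the P701 D training hours cannot exceed the total life of P701 D, so a sample taken at the very start of P701 B operation is underestimated by at least the difference in total life; the variable names are illustrative.

```python
TRAIN_TOTAL_HOURS = 2218  # P701 D, training pump (Table 6.1)
TEST_TOTAL_HOURS = 2488   # P701 B, open-test pump (Table 6.1)

# Minimum shortfall of the estimate for a sample taken at the start of the test pump's life.
life_gap_hours = TEST_TOTAL_HOURS - TRAIN_TOTAL_HOURS
life_gap_percent = 100.0 * life_gap_hours / TEST_TOTAL_HOURS
print(life_gap_hours, round(life_gap_percent, 1))  # 270 10.9
```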

Figure 6.5 Comparison of actual RUL and estimated RUL (Open Test, P701 B)



6.2 Prognostics of Bearing Failure in HP-LNG Pump

In this industry case study, RUL prediction tests were also conducted using bearing failure data of HP-LNG pumps to validate the feasibility of using health state probability estimation with historical knowledge for accurate long-term failure prediction.

6.2.1 HP-LNG Pump

A different type of HP-LNG pump was used in this prognostic test of bearing failure. Figure 6.6 shows the pump schematic and the vibration measuring points applied in this case study. Compared with the HP-LNG pump described in Chapter 3, this pump has three ball bearings to support the entire dynamic load of the integrated pump and motor shaft. These pumps also have a different impeller diffuser type and a different rotor and shaft assembly, with a short shaft. The submerged motor and bearings are cooled and lubricated by a predetermined portion of the LNG being pumped. For condition monitoring of the pump, two accelerometers are installed on the housing near the bottom bearing assembly, in two radial directions, as shown in Figure 6.6.

Table 6.3 shows the pump specifications. These high-pressure LNG pumps are submerged and operate at super-cooled temperatures and high speed (3,600 rpm). Poor lubricating conditions and high operating speed can result in rapid bearing failure when certain bearing component faults occur. Hence, accurate long-term prediction of bearing failures is essential for safe operation and for optimising the pump maintenance schedule.

Table 6.3 Pump specifications of the different type of HP-LNG pump

Capacity            Pressure             Impeller Stage     Speed                Voltage             Rating
241.8 m³/hr         88.7 kg/cm²·g        9                  3,585 RPM            6,600 V             746 kW

Upper Bearing No.   Bottom Bearing No.   Tail Bearing No.   Rotor Bar Quantity   Diffuser Vane No.   Current
6314                6317                 6311               41 EA                8 EA                84.5 A



Figure 6.6 Pump schematic and vibration measurement points of different type

of HP-LNG pump

6.2.2 Data Acquisition of Bearing Failure

For machinery fault diagnosis and prognosis, signals such as vibration, temperature and pressure are commonly used. In this research, vibration data were used because they are readily available in industry and the trends of vibration features are closely related to the bearing degradation process. Figure 6.7 shows the frequency spectrum plots of the P301 D pump. The bearing resonance component increased over the period of operation. The first symptom of bearing failure was detected as early as 14 months before the final bearing failure. Other bearing fault components appeared progressively until the final bearing failure, as shown in plots (a) to (d) of Figure 6.7.



Figure 6.7 Spectrum plots of P301D pump bearing failure



Since bearing defects generate vibration in the form of impacts, the fundamental bearing defect frequencies are often accompanied by multiple harmonically related frequencies. In general, as wear progresses, inner raceway defect frequencies are sometimes accompanied by other bearing defect frequencies in a real environment. Although the major defect of P301 D was an inner race defect, some harmonics of the ball and outer race defect frequencies also appeared alongside the inner race defect frequencies as the bearing defect progressed, as shown in plot (d) of Figure 6.7.
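For reference, the fundamental defect frequencies mentioned above can be computed from bearing geometry with the standard kinematic formulas for a bearing with a rotating inner race, a fixed outer race and no slip. The sketch below uses the pump speed from Table 6.3; the ball count, ball diameter and pitch diameter are hypothetical values chosen for illustration and are not the dimensions of the bearings fitted to this pump.

```python
import math


def bearing_defect_frequencies(shaft_hz, n_balls, ball_d, pitch_d, contact_deg=0.0):
    """Kinematic defect frequencies in Hz (rotating inner race, fixed outer race, no slip)."""
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_deg))
    return {
        "FTF": 0.5 * shaft_hz * (1.0 - ratio),              # cage (fundamental train) frequency
        "BPFO": 0.5 * n_balls * shaft_hz * (1.0 - ratio),   # ball pass frequency, outer race
        "BPFI": 0.5 * n_balls * shaft_hz * (1.0 + ratio),   # ball pass frequency, inner race
        "BSF": 0.5 * (pitch_d / ball_d) * shaft_hz * (1.0 - ratio ** 2),  # ball spin frequency
    }


shaft_hz = 3585 / 60.0  # pump speed of 3,585 RPM (Table 6.3)
# Hypothetical geometry in millimetres, for illustration only.
freqs = bearing_defect_frequencies(shaft_hz, n_balls=8, ball_d=30.0, pitch_d=130.0)
for name, value in freqs.items():
    print(f"{name}: {value:.1f} Hz")
```

In a measured spectrum, harmonics appear at integer multiples of these values, which is why an inner race defect can be accompanied by families of peaks related to the other defect frequencies as damage spreads.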

Vibration data were collected through two accelerometers installed on the

pump housing as shown in Figure 6.6. The vibration data from two LNG pumps

of identical specifications were used for prediction of the remaining useful life.

Due to the random operation of the pumps to meet the total production target of LNG supply, there were some restrictions on collecting data over the entire life of the pump. The acquired vibration data are summarized in Table 6.4. As shown in Table 6.4, a total of 136 vibration samples for P301 C and 120 vibration samples for P301 D were collected over the life of the pumps for training and testing of the proposed prognosis model.

Table 6.4 Acquired vibration data of bearing failure

Machine No.   Total operation hours   Reason for removal & root cause            No. of sample data   Sampling frequency
P301 C        4,698                   High vibration & outer raceway spalling    136                  12,800 Hz
P301 D        3,511                   High vibration & inner raceway flaking     120                  12,800 Hz

Figure 6.8 shows (a) the outer raceway spalling of P301 C and (b) the inner raceway flaking of P301 D. Although these two bearing faults had different fault severities on the inner race and the outer race, they occurred on similar bearings at the same location of the pump.



(a) Outer raceway spalling of P301 C (b) Inner raceway flaking of P301 D

Figure 6.8 Outer and inner race bearing failures


Although bearing faults are the primary causes of machine breakdown, a number of other component faults can also be embedded in bearing fault signals, which complicates bearing diagnosis and prognosis. A number of physical model-based prognosis methods have been reported that focus on identifying appropriate features of damage or faults. However, current prognostics research concentrates on specific component degradations and does not include other types of fault. In this research, the candidate aims to establish a generic and scalable prognosis model that is applicable to different faults in the same machine. To that end, conventional statistical parameters from the vibration signals are used for the prognostic tests in this study. In this case study, a total of 28 features (14 parameters, 2 positions) were calculated for health state probability estimation of bearing failure. The features calculated from the two sets of vibration data of the HP-LNG pumps are summarised in Table 6.5.

Table 6.5 Statistical feature parameters and attributed label from bearing failure data

Position                      Acc.(A) and Acc.(B)
Time domain parameters        Mean{1}, RMS{2}, Shape factor{3}, Skewness{4}, Kurtosis{5}, Crest factor{6}, Entropy estimation value{7}, Entropy estimation error{8}, Histogram upper{9} and Histogram lower{10}
Frequency domain parameters   RMS frequency value{11}, Frequency centre value{12}, Root variance frequency{13} and Peak value{14}
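Several of the time domain parameters in Table 6.5 can be computed directly from a vibration sample, as in the sketch below. It uses common textbook definitions (population moments, non-excess kurtosis), which are an assumption here; the exact definitions used in the thesis, and the entropy, histogram and frequency domain features, are not reproduced.

```python
import math


def time_domain_features(x):
    """Common statistical features of a vibration sample x (a sequence of floats)."""
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    abs_mean = sum(abs(v) for v in x) / n
    peak = max(abs(v) for v in x)
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment (variance)
    m3 = sum((v - mean) ** 3 for v in x) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    return {
        "mean": mean,
        "rms": rms,
        "shape_factor": rms / abs_mean,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "crest_factor": peak / rms,
    }


# Toy square-wave sample, used only to make the values easy to check by hand.
features = time_domain_features([1.0, -1.0, 1.0, -1.0])
print(features)
```

Impulsive signals, such as those produced by a developing bearing defect, raise the kurtosis and crest factor well above the values of this toy sample.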



To select the optimal parameters that can fully represent failure degradation,

effective features were selected using a feature selection method based on the

distance evaluation technique as discussed in Chapter 4. The reduction of feature

dimension leads to better performance of SVM and reduction in computational

effort.

In this work, a total of 14 features were evaluated for each signal sample measured at identical accelerometer positions. The distance evaluation criterion (α) of the 14 features is shown in Figure 6.9, with an almost zero value for histogram upper (No. 9). To select the effective degradation features, the candidate required a normalized distance evaluation criterion greater than 1.3, that is $\alpha / \bar{\alpha} > 1.3$, where α is the distance evaluation criterion and $\bar{\alpha}$ is the mean value of α. The ratio of 1.3 was selected based on past historical records for this particular bearing/pump. From the results, three features were selected for health state probability estimation, namely Kurtosis {5}, Entropy estimation value {7} and Entropy estimation error value {8}, which have large distance evaluation criterion (α) values compared with the other features. These features could minimize the classification training and test error of each health state.

Figure 6.9 Distance evaluation criterion of features.



Figure 6.10 shows the selected feature trends of kurtosis, entropy estimation

and entropy estimation error value, respectively. All the selected features show

increasing trends which indicate the failure degradation process of the machine

over time as shown in the plots.

Figure 6.10 Feature trends of selected features

6.2.4 Selection of Number of Health States for Training

In this case study, to select the optimal number of health states of bearing degradation, several numbers of health stages were investigated using the data sets of P301 D for training and prediction tests. A polynomial function was used as the basic kernel function of the SVM. Multi-class classification using the OAO method was applied to classify the bearing degradation as described in Chapter 3. Sequential minimal optimization (SMO) was used to solve the SVM classification problem. The optimal kernel parameters (C, γ, d) were again selected by cross-validation to avoid over-fitting or under-fitting. The results of the investigation are plotted in Figure 6.11. The average prediction value was estimated using Eq. (5.6).

Figure 6.11 Result of investigation to determine optimal number of health states.

A total of nine configurations were investigated, ranging from two to ten health states. As shown in Figure 6.11, configurations with few health states have low training error values but high prediction error values compared with configurations with more health states. Conversely, configurations with many health states have high training error values but relatively low prediction error values. From this result, five health states were selected as the optimal number because, beyond five states, the training error increased rapidly without a significant decrease in the prediction error. The training error and prediction error values for five states were 10% and 5.6%, respectively.
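The selection rule described above (stop adding health states once the prediction error no longer decreases significantly) can be sketched as a simple scan. This is one possible formalisation and not the procedure used in the thesis, which read the choice from Figure 6.11. Only the prediction error for five states (5.6%) comes from the text; the other error values, the `min_gain` tolerance and the function name are hypothetical.

```python
def pick_num_states(pred_error, min_gain=1.0):
    """Return the smallest number of states after which adding one more state
    reduces the prediction error by less than min_gain percentage points."""
    counts = sorted(pred_error)
    for current, following in zip(counts, counts[1:]):
        if pred_error[current] - pred_error[following] < min_gain:
            return current
    return counts[-1]


# Prediction error (%) per number of health states. Hypothetical except for 5 states.
pred_error = {2: 30.0, 3: 20.0, 4: 12.0, 5: 5.6, 6: 5.4, 7: 5.3, 8: 5.2, 9: 5.2, 10: 5.1}
best = pick_num_states(pred_error, min_gain=1.0)
print(best)  # 5
```

Because the training error rises with the number of states, stopping at the first plateau of the prediction error also keeps the training error as low as the plateau allows.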

Table 6.6 shows the training data sets of the five selected degradation states used in this work, with eight samples in each state and the three selected features. Initially (Stage 1) the percentage of RUL was almost 100% (99.89%); it progressively reduced to 28.77% at Stage 4. At Stage 5, the remaining bearing life was about 3.02%.



Table 6.6 Training data sets for the health state probability estimation (P301 D)

Stage No.   No. of samples   Average operation hours   RUL (%)   No. of features
1           1 ~ 8            4                         99.89%    3
2           25 ~ 32          503                       85.67%    3
3           41 ~ 48          843                       75.99%    3
4           81 ~ 88          2,501                     28.77%    3
5           121 ~ 128        3,405                     3.02%     3

6.2.5 RUL Prediction of Bearing Failure

In the RUL prediction of bearing failure, closed and open tests were conducted. In the closed test, the five states were trained using the training data sets listed in Table 6.6, and the full data set from P301 D (136 data sets) was tested to obtain the probabilities of the five degradation states. Figure 6.12 shows the probabilities of each state of P301 D. The first state probability started at 100% and decreased as the next state probability increased. For example, the probability of the first state (solid line) dropped, rose again to about 90% and eventually dropped to zero (at sample 30), while the probability of the second state (dotted line) simultaneously reached 100%. Some overlap between the states and the non-uniformity of the distribution could be explained by the dynamic and stochastic degradation process, the uncertainty of machine health condition or inappropriate data acquisition in a real environment. Overall, the state probabilities follow a non-linear degradation process and are distinctly separated.

In the open test, similar bearing fault data (P301 C), consisting of 120 sample sets, were tested to obtain the probability distribution of each health state of P301 C using the identical training data sets shown in Table 6.6. Figure 6.13 shows the probability distribution of each health state of P301 C. A similar non-linear probability distribution and overlaps between states were observed, for the reasons explained above.



Figure 6.12 Probability distribution of each health state (Closed Test, P301 D)

Figure 6.13 Probability distribution of each health state (Open Test, P301 C)

The machine remnant life for bearing failure was estimated using the historical operation hours ( ) of each training data set described in Table 6.6 and their probabilities evaluated using Eq. (3.5). Figure 6.14 shows the closed test result of the estimated remnant life and the comparison between actual RUL and estimated RUL. As shown in Figure 6.14, although there are some discrepancies in the middle zone of the plot, the overall trend of the estimated RUL follows the gradient of the actual remaining useful life of the machine. The average prediction accuracy was 94.4%, calculated using Eq. (5.6) over the entire range of the data set. Furthermore, the estimated RUL at the final state matched closely with the actual RUL, with less than 1% of remaining life.




Figure 6.15 shows the open test result of the estimated remnant life and the comparison between actual RUL and estimated RUL. There is a large difference in remnant life at the initial degradation states, as shown in Figure 6.15. For the open test, the estimated RUL was obtained from the training data sets (P301 D), which had 3,511 hours of total operation, whereas P301 C operated for 4,698 hours (Table 6.4). This caused the discrepancy between actual RUL and estimated RUL at the beginning of the test. However, as the pump approached final bearing failure, the estimated RUL matched the actual remaining useful life more closely than in the initial and middle states.


Figure 6.15 Comparison of actual RUL and estimated RUL (Open Test, P301 C)



6.2.6 Verification of Optimum Number of Health States

In this case study, several tests with different numbers of health states, ranging from two to ten, were also conducted using the same test data (P301 C) to verify the optimum number of health states.

Figure 6.16 shows the training and prediction errors for these numbers of health states. The prediction error was high for small numbers of states and settled at about 7.45% with five states, while the training error increased with the number of states and stabilised between four and five states. Beyond five states, however, the training error increased rapidly while the average prediction error remained relatively constant. Although four and five states have almost the same training error, the prediction error with five states was much lower than with four. Therefore, five health states were verified as the optimal number for the estimation of health state probability in this case study. It should be noted that the number of health stages needs to be evaluated separately for each case study.

Figure 6.16 Training and prediction values of several health states (P301 C)



6.3 Summary

The proposed prognostic model was successfully validated through two industry case studies. Through prior analysis of historical data, used as historical knowledge, discrete failure degradation stages were employed to estimate discrete health state probabilities for long-term machine prognosis. In both case studies, the prominent features were selected using the distance evaluation method for optimum performance of the classifier. The health state probability estimation was carried out over the full failure degradation process of the machine, from new to final failure stages.

In the proposed model, the number of health states chosen for the machine failure process plays a significant role in accurate estimation of machine remnant life. Therefore, in the second case study of bearing failure prediction, the optimal number of health states was selected by investigating several numbers of health states. The selected number kept the training error of health state estimation low, while using more states would not have significantly decreased the prediction error in this case study.

The results from two industrial case studies indicate that the proposed model has

the capability to provide accurate estimation of machine health condition for long-

term prediction of machine remnant life.



CHAPTER 7 CONCLUSION AND FUTURE WORK

7.1 Conclusion

The ability to accurately predict the RUL of a machine is critical for its operation, and can also be used to extend production capability and to enhance the system’s reliability. Effective diagnostics and prognostics are important aspects of CBM for maintenance engineers to schedule a repair and to acquire replacement components before the components eventually fail. Through an extensive literature review on machine diagnostics and prognostics, this thesis addresses four critical challenges in machine fault prognostics: accurate long-term prediction; sufficient usage of effective features; generality and scalability; and the systematic incorporation of diagnostic information and historical knowledge.

In consideration of these challenges, a novel approach to designing integrated diagnostic and prognostic systems based on health state probability estimation has been presented in this thesis. This work concludes that:

The integration of fault diagnostic and prognostic systems is confirmed to be effective for accurate prediction of machine remnant life. The proposed model has a closed loop architecture that integrates diagnostics and prognostics based on health state probability estimation, with embedded historical knowledge. Through integration with fault diagnostics, a more precise failure pattern drawn from historical degradation data can be employed in prognostics after prior verification (isolation) of impending faults. With this scheme, a generic and scalable model is also established for application to different failing components. The proposed prognostic model has been successfully tested and validated on a number of cases, from a simulation test to industry applications of HP-LNG pump failures. The results from the case studies indicate that accurate estimation of health states is achievable, which in turn provides accurate prediction of machine remnant life. In addition, the results of experimental tests show that the proposed model can provide early warning of abnormal bearing conditions by effectively indicating the transitional health state of machine failure.

The novel methodology of machine health state estimation applied to the discrete degradation process of machine failure enables accurate long-term prediction of machine remnant life. None of the current prognostic models have considered using discrete health state probability, which can effectively represent the dynamic and stochastic degradation of machine failure. Current prognostic techniques only consider specific component degradations and are mainly applied in the laboratory environment for model validation. In this research, the outcome of health state estimation provides an accurate real time failure index for the prediction of machine remnant life. The proposed model also enables sufficient usage of a range of condition indicators to effectively represent the complex nature of machine degradation, by exploiting the ability of classification algorithms in health state probability estimation. In the case studies, a number of effective features (up to eight features) were used for health state estimation. Furthermore, this full utilization of a range of features leads to a generic and scalable prognostic model for practical application in industry.

A systematic approach incorporating diagnostic information and historical knowledge for accurate RUL prediction. The proposed prognostics model integrates effective feature extraction and fault diagnostics to obtain the best possible RUL prediction and to minimise the uncertainty in interpreting machine degradation. This scheme guides how the prognostic system manages historical knowledge in conjunction with machine fault diagnostics and prognostics. In this thesis, the embedded historical knowledge provides key references for real time fault diagnostics and health state estimation. The outcomes of integrated diagnostics and prognostics can then be used to update and improve the prognostics model by providing reliable posterior degradation features for diverse failure modes and fault types. The accumulated information also provides a good guideline for solving the CM data management problems of the many industries that struggle with very large stores of CM data. This scheme leads to improved model scalability for applications to various faults and failure patterns. The proposed prognostic model has been successfully validated using two different sets of industrial fault data to demonstrate model scalability. The results from the two industrial case studies also indicate that the proposed model can provide accurate estimation of health condition for accurate prediction of machine remnant life.

The comparative study of intelligent diagnostics using five different classification algorithms. The comparative diagnostic tests were conducted using five different classifiers applied to progressive fault levels of three fault types in the HP-LNG pump. Although many intelligent fault diagnostic models have been validated using machine fault data, none of them consider different severity levels in fault propagation when estimating fault diagnostic performance. The result of the comparison test shows that the fault classification accuracy varies with the severity of the machine fault and the type of classifier. Among the commonly used classifiers, the SVMs show relatively outstanding performance for intelligent fault classification across the range of fault propagation. Therefore, the SVM technique is employed in health state probability estimation for prediction of machine failure in this research.

Investigation of the optimal number of health states for better prediction of machine remnant life. In the industrial case study of bearing failure, the health state probability estimation was carried out over the full failure degradation process of the machine, from new to final failure stages. The optimal number of health states was validated by investigating several numbers of health states in the case study of bearing failure prediction of HP-LNG pumps. It has been confirmed that the selected number of health states keeps the training error of health state estimation low, and that using more states would not significantly decrease the prediction error in this case study.

The results of model comparison indicate that the proposed model has a more accurate prediction capability. The model comparison study with the Proportional Hazards Model (PHM) was conducted under identical conditions using experimental bearing failure data. By comparing the proposed model and the PHM against the actual remaining life of the bearing, it is verified that the proposed prognostic model based on health state probability estimation provides a more accurate RUL prediction than the commonly used PHM for a dynamic and stochastic machine degradation process.



7.2 Future Work

Through model validation using simulated and experimental data, and industry case studies, several new research issues have been identified and are described as follows:

Although the proposed prognostic model is shown to be effective through

several case studies from simulation tests to industrial applications, further

validations of different machine system failures such as gear box, tool wear,

structural corrosion, motor and engines still remain as an area of future work

to establish a generic and scalable asset health management system.

The signal processing and feature extraction techniques are fundamental to

the development of a robust diagnostics and prognostics model for certain

fault types and failure patterns. In this thesis, the proposed model mainly

used conventional standard features from vibration CM data. Therefore,

other feature extraction methods from different CM data need to be explored

to extract appropriate health indicators.

One novelty of the proposed model is health state probability estimation for accurate long-term prediction of the remaining useful life of a machine. The selection of the optimal number of health states for component failure is vital in order to avoid high training error while retaining high prediction accuracy. Although the optimum health degradation stages were determined in this work by comparing several numbers of health states in the industry case study, new approaches that use currently available optimization algorithms and pattern recognition techniques to optimise the health degradation stages still need to be developed. This work shows that the number of health states plays a significant role in providing accurate machine failure prognosis.

Although the proposed model makes use of sufficient health indicators in the prediction of machine remnant life, the use of many features is limited by the dimensionality problem in the classification process, which may cause excessive computational load and over-fitting of the training data. In a supervised learning setting with many input features, over-fitting is a potential problem unless there is ample training data.

To avoid this dimensionality problem when using many health indicators in the proposed model, a tensor-based method for health state probability estimation can be used as an alternative to traditional classification techniques. Most traditional learning and classification algorithms are based on the Vector Space Model (VSM); that is, the data are represented as vectors $\mathbf{x} \in \mathbb{R}^n$. The learning algorithms aim at finding a linear (or nonlinear) function $f(\mathbf{x}) = \mathbf{w}^T\mathbf{x}$ according to some pre-defined criteria, where $\mathbf{w} = (w_1, \ldots, w_n)^T$ are the parameters to estimate. In the Tensor Space Model (TSM), however, a data sample is represented as a tensor [167], and each element in the tensor corresponds to a feature. A data sample $\mathbf{x} \in \mathbb{R}^n$ can be converted into a second-order tensor (or matrix) $\mathbf{X} \in \mathbb{R}^{n_1 \times n_2}$, where $n_1 \times n_2 \approx n$. Tensor-based approaches can perform data analysis in high-dimensional spaces. Therefore, the use of TSM for health state probability estimation is suggested as possible future work to make full use of the input parameters.
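
As a minimal sketch of the vector-to-tensor conversion described above (not of a tensor-based classifier itself), the following Python code arranges an n-dimensional feature vector into a second-order tensor. The function name `to_second_order_tensor`, the zero padding used when n1 x n2 exceeds n, and the sample values are assumptions made for illustration; they are not taken from this thesis or from [167].

```python
from typing import List, Sequence


def to_second_order_tensor(x: Sequence[float], n1: int, n2: int) -> List[List[float]]:
    """Arrange a feature vector row by row into an n1-by-n2 matrix.

    Assumption: if n1 * n2 is larger than len(x), the tail is zero-padded.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("n1 and n2 must be positive")
    if n1 * n2 < len(x):
        raise ValueError("n1 * n2 must be at least len(x)")
    padded = list(x) + [0.0] * (n1 * n2 - len(x))
    return [padded[r * n2:(r + 1) * n2] for r in range(n1)]


# Hypothetical sample with six health indicators (values are made up).
features = [0.8, 1.2, 3.1, 0.4, 2.2, 5.0]
tensor = to_second_order_tensor(features, n1=2, n2=3)
print(tensor)
```

A bilinear function of the form u^T X v acting on such a matrix needs only n1 + n2 weights instead of n1 x n2, which is one reason a tensor representation may ease the dimensionality problem; whether this holds for the health indicators used here would need to be tested.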

Finally, for real application and convenient implementation of the model in

industry, it is necessary to develop an integrated health management

software tool based on health state probability estimation which can be used

in fault detection, diagnostics and prognostics of machine components.


APPENDIX

Basic binary classification theory of SVMs

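Given a set of input data $\mathbf{x}_i$ ($i = 1, 2, \ldots, M$), where $M$ is the number of samples, the $i$th sample in an $n$-dimensional input space belongs to one of two classes labelled by $y_i \in \{1, -1\}$, namely the positive class and the negative class. For linear data, it is possible to determine the hyperplane $f(\mathbf{x}) = 0$ that separates the given input data:

$$f(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b = \sum_{j=1}^{n} w_j x_j + b = 0 \tag{A.1}$$

where $\mathbf{w}$ is the coefficient vector and $b$ is the bias of the hyperplane. The vector $\mathbf{w}$ and scalar $b$ define the position of the separating hyperplane. The decision function uses $\operatorname{sign} f(\mathbf{x})$ to create a separating hyperplane that classifies the input data into either the positive class or the negative class. A distinctly separating hyperplane should satisfy the constraints

$$f(\mathbf{x}_i) \ge 1 \ \text{if } y_i = 1, \qquad f(\mathbf{x}_i) \le -1 \ \text{if } y_i = -1 \tag{A.2}$$

or, written as a single inequality,

$$y_i f(\mathbf{x}_i) = y_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1 \quad \text{for } i = 1, 2, \ldots, M \tag{A.3}$$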

The separating hyperplane that creates the maximum distance between the plane

and the nearest data, i.e., the maximum margin, is called the optimal separating

hyperplane (OSH). An example of the optimal hyperplane of two data sets is

presented in Figure A.1.
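
As an illustrative sketch only, the following Python code checks the separating constraint of Eq. (A.3) for a given hyperplane and computes the margin width 2/||w||, which is the quantity the OSH maximises. The toy data, the candidate hyperplanes and the function names are assumptions made for this example and are not part of the case studies.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Return f(x) = w . x + b for a linear classifier."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def satisfies_margin(w: Sequence[float], b: float, samples: List[Sample]) -> bool:
    """Check y_i * f(x_i) >= 1 for every sample, as in Eq. (A.3)."""
    return all(y * decision_value(w, b, x) >= 1.0 for x, y in samples)


def margin_width(w: Sequence[float]) -> float:
    """Distance between the planes f(x) = +1 and f(x) = -1, equal to 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wj * wj for wj in w))


# Toy data (made up): two positive and two negative points in the plane.
samples = [((2.0, 2.0), 1), ((3.0, 1.0), 1), ((0.0, 0.0), -1), ((-1.0, 1.0), -1)]

# Two candidate descriptions of the same plane x1 + x2 = 2 with different scaling.
wide = ((0.5, 0.5), -1.0)    # constraints hold with equality at the nearest points
narrow = ((1.0, 1.0), -2.0)  # also feasible, but ||w|| is larger so the margin is smaller

for name, (w, b) in (("wide", wide), ("narrow", narrow)):
    print(name, satisfies_margin(w, b, samples), margin_width(w))
```

Both candidates satisfy Eq. (A.3), but only the first attains the larger margin, which is why the optimisation in the following equations minimises the norm of w.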


Figure A.1. Binary classification using SVMs [168]

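By taking into account the noise with slack variables $\xi_i$ and error penalty $C$, the optimal hyperplane separating the data can be obtained as a solution to the following optimization problem

$$\text{minimise} \quad \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_{i=1}^{M}\xi_i \tag{A.4}$$

$$\text{subject to} \quad y_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, M \tag{A.5}$$

where $\xi_i$ measures the distance between the margin and the samples $\mathbf{x}_i$ that lie on the wrong side of the margin. The calculation can be simplified by using the Kuhn-Tucker conditions to convert the problem into the equivalent Lagrangian dual problem, which is

$$\text{minimise} \quad L(\mathbf{w}, b, \boldsymbol{\alpha}) = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{M}\alpha_i y_i(\mathbf{w}^T\mathbf{x}_i + b) + \sum_{i=1}^{M}\alpha_i \tag{A.6}$$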

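[Figure A.1 annotations: positive class; negative class; margin; hyperplanes $H_1: \{\mathbf{x} \mid (\mathbf{w}\cdot\mathbf{x}) + b = +1\}$, $H: \{\mathbf{x} \mid (\mathbf{w}\cdot\mathbf{x}) + b = 0\}$ and $H_2: \{\mathbf{x} \mid (\mathbf{w}\cdot\mathbf{x}) + b = -1\}$.]

The task is to minimise Eq. (A.6) with respect to $\mathbf{w}$ and $b$, while requiring the derivatives of $L$ with respect to $\boldsymbol{\alpha}$ to vanish. At the optimal point, the following saddle point equations apply:

$$\frac{\partial L}{\partial \mathbf{w}} = 0, \qquad \frac{\partial L}{\partial b} = 0 \tag{A.7}$$

which can be replaced by

$$\mathbf{w} = \sum_{i=1}^{M}\alpha_i y_i \mathbf{x}_i, \qquad \sum_{i=1}^{M}\alpha_i y_i = 0 \tag{A.8}$$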

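From Eq. (A.8), $\mathbf{w}$ is contained in the subspace spanned by the $\mathbf{x}_i$. By substituting Eq. (A.8) into Eq. (A.6), the dual quadratic optimization problem is obtained:

$$\text{maximise} \quad L(\boldsymbol{\alpha}) = \sum_{i=1}^{M}\alpha_i - \frac{1}{2}\sum_{i,j=1}^{M}\alpha_i\alpha_j y_i y_j\,\mathbf{x}_i^T\mathbf{x}_j \tag{A.9}$$

$$\text{subject to} \quad \alpha_i \ge 0, \quad i = 1, 2, \ldots, M, \qquad \sum_{i=1}^{M}\alpha_i y_i = 0 \tag{A.10}$$

Thus, by solving the dual optimization problem, one obtains the coefficients $\alpha_i$, which are required to express $\mathbf{w}$ so as to solve Eq. (A.4). This leads to the non-linear decision function,

$$f(\mathbf{x}) = \operatorname{sign}\left(\sum_{i=1}^{M}\alpha_i y_i\,\mathbf{x}_i^T\mathbf{x} + b\right) \tag{A.11}$$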
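
The relationship between the dual coefficients, the weight vector of Eq. (A.8) and the decision function of Eq. (A.11) can be sketched in Python as follows. The two-sample data set, the hand-chosen multipliers (picked so that the margin constraints hold with equality) and all function names are assumptions made for illustration, not results from this work.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def weight_vector(alphas, ys, xs) -> List[float]:
    """w = sum_i alpha_i * y_i * x_i, as in Eq. (A.8)."""
    n = len(xs[0])
    return [sum(a * y * x[j] for a, y, x in zip(alphas, ys, xs)) for j in range(n)]


def classify(alphas, ys, xs, b: float, x_new: Sequence[float]) -> int:
    """Sign of sum_i alpha_i * y_i * (x_i . x_new) + b."""
    value = sum(a * y * dot(x, x_new) for a, y, x in zip(alphas, ys, xs)) + b
    return 1 if value >= 0 else -1


# Made-up training set: one positive and one negative sample.
xs = [(1.0, 1.0), (-1.0, -1.0)]
ys = [1, -1]
alphas = [0.25, 0.25]  # hand-chosen; satisfies sum_i alpha_i * y_i = 0

w = weight_vector(alphas, ys, xs)
b = ys[0] - dot(w, xs[0])  # from y_1 * (w . x_1 + b) = 1 at a support vector

print(w, b)
print(classify(alphas, ys, xs, b, (2.0, 0.5)), classify(alphas, ys, xs, b, (-3.0, 1.0)))
```

Note that `classify` never forms w explicitly; it only needs dot products between training samples and the new point, which is what makes the kernel substitution described next possible.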

SVMs can also be used in non-linear classification tasks with the application of Kernel functions. The data to be classified are mapped onto a high-dimensional feature space, where linear classification can be applied. The non-linear vector function of Eq. (A.12) maps the $n$-dimensional input vector $\mathbf{x}$ onto an $l$-dimensional feature space:

$$\Phi(\mathbf{x}) = (\phi_1(\mathbf{x}), \ldots, \phi_l(\mathbf{x})) \tag{A.12}$$

The linear decision function in dual form is then given by

$$f(\mathbf{x}) = \operatorname{sign}\left(\sum_{i=1}^{M}\alpha_i y_i\,\Phi^T(\mathbf{x}_i)\Phi(\mathbf{x}) + b\right) \tag{A.13}$$

Working in a high-dimensional feature space enables the expression of complex functions, but it can also generate other problems. Computational problems can occur due to the large vectors, and overfitting can also occur due to the high dimensionality. The computational problem can be avoided by using Kernel functions. A Kernel is a function that returns the dot product of the feature space mappings of the original data points, as stated below:

$$K(\mathbf{x}_i, \mathbf{x}_j) = \Phi^T(\mathbf{x}_i)\Phi(\mathbf{x}_j) \tag{A.14}$$

When a Kernel function is applied, learning in the feature space does not require explicit evaluation of $\Phi$, and the decision function becomes

$$f(\mathbf{x}) = \operatorname{sign}\left(\sum_{i=1}^{M}\alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b\right) \tag{A.15}$$

Any function that satisfies Mercer's theorem [169] can be used as a Kernel function to compute a dot product in feature space. Different Kernel functions are used in SVMs, such as linear, polynomial and Gaussian RBF. The selection of an appropriate Kernel function is very important, since the Kernel defines the feature space in which the training set examples will be classified. The definition of a legitimate Kernel function is given by Mercer's theorem, which states that the function must be continuous and positive definite. Table A.1 shows the formulation of the linear, polynomial and Gaussian RBF functions respectively.

Table A.1 Formulation of Kernel functions

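Kernel          Formulation $K(\mathbf{x}, \mathbf{x}_j)$
Linear          $\mathbf{x}^T \cdot \mathbf{x}_j$
Polynomial      $(\gamma\,\mathbf{x}^T \cdot \mathbf{x}_j + r)^d$, $\gamma > 0$
Gaussian RBF    $\exp(-\|\mathbf{x} - \mathbf{x}_j\|^2 / 2\gamma^2)$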
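
The three Kernel functions named in Table A.1 can be sketched in Python as below. The parameter names `gamma`, `r` and `d`, their default values and the function names follow the commonly used forms of these kernels and are assumptions here; they are not the settings used in the case studies.

```python
import math
from typing import Sequence


def linear_kernel(x: Sequence[float], z: Sequence[float]) -> float:
    """Dot product of the two input vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x: Sequence[float], z: Sequence[float],
                      gamma: float = 1.0, r: float = 1.0, d: int = 2) -> float:
    """(gamma * x . z + r) ** d, with gamma > 0."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return (gamma * linear_kernel(x, z) + r) ** d


def gaussian_rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float = 1.0) -> float:
    """exp(-||x - z||^2 / (2 * gamma^2)), with gamma acting as a width parameter."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * gamma ** 2))


# Made-up feature vectors.
x, z = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, z), polynomial_kernel(x, z), gaussian_rbf_kernel(x, z))
```

Any of these functions can be substituted for the dot product in the dual decision function without changing the rest of the computation.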

SVMs Quadratic Programming (QP) problem

Vapnik [170] presented a method, known as chunking, which uses the projected conjugate gradient algorithm to solve the SVM-QP problem. The chunking algorithm uses the fact that the value of the quadratic form is unchanged if the rows and columns of the matrix that correspond to zero Lagrange multipliers are removed. Chunking therefore substantially reduces the size of the matrix, from the number of training examples squared to approximately the number of non-zero Lagrange multipliers squared. However, chunking still cannot handle large-scale training problems, since even this reduced matrix cannot fit into memory. Osuna, Freund and Girosi [171] presented an improved training algorithm, which suggests a whole new set of QP algorithms for SVMs. Their theorem proves that the large QP problem can be broken down into a series of smaller QP sub-problems.
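
The observation that chunking relies on can be checked numerically. The sketch below assumes a small made-up symmetric matrix `Q` standing in for the matrix of the quadratic form, a made-up multiplier vector and hypothetical helper names; it only demonstrates that dropping the rows and columns of zero multipliers leaves the quadratic form unchanged, and it is not an implementation of chunking itself.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quadratic_form(alphas: List[float], Q: Matrix) -> float:
    """Return sum_i sum_j alpha_i * alpha_j * Q[i][j]."""
    n = len(alphas)
    return sum(alphas[i] * alphas[j] * Q[i][j] for i in range(n) for j in range(n))


def drop_zero_multipliers(alphas: List[float], Q: Matrix) -> Tuple[List[float], Matrix]:
    """Keep only the rows and columns that belong to non-zero multipliers."""
    keep = [i for i, a in enumerate(alphas) if a != 0.0]
    return [alphas[i] for i in keep], [[Q[i][j] for j in keep] for i in keep]


# Made-up 4 x 4 symmetric matrix and multipliers (two of them zero).
Q = [
    [2.0, 0.5, 1.0, 0.3],
    [0.5, 3.0, 0.2, 0.1],
    [1.0, 0.2, 4.0, 0.7],
    [0.3, 0.1, 0.7, 5.0],
]
alphas = [0.5, 0.0, 0.25, 0.0]

full_value = quadratic_form(alphas, Q)
alphas_small, Q_small = drop_zero_multipliers(alphas, Q)
reduced_value = quadratic_form(alphas_small, Q_small)
print(full_value, reduced_value, len(Q_small))
```

Here the 4 x 4 matrix shrinks to 2 x 2, which mirrors the reduction from the number of training examples squared to the number of non-zero multipliers squared.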

Sequential Minimal Optimization (SMO) for SVM-QP Problem

Sequential minimal optimization (SMO) proposed by Platt [163] is a simple

algorithm that can be used to solve the SVM-QP problem without any additional

matrix storage and without using the numerical QP optimization steps. This method

decomposes the overall QP problem into QP sub-problems, using Osuna's theorem to ensure convergence. In this dissertation, SMO is used as the solver; the details of SMO are available in reference [163].

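In order to solve for the two Lagrange multipliers $\alpha_1$, $\alpha_2$, SMO first computes the constraints on these multipliers and then solves for the constrained minimum. For convenience, all quantities that refer to the first multiplier have a subscript 1, while all quantities that refer to the second multiplier have a subscript 2. The new values of these multipliers must lie on a line in $(\alpha_1, \alpha_2)$ space, and in the box defined by $0 \le \alpha_1, \alpha_2 \le C$.

$$y_1\alpha_1^{new} + y_2\alpha_2^{new} = y_1\alpha_1 + y_2\alpha_2 \tag{A.16}$$

Without loss of generality, the algorithm first computes the second Lagrange multiplier $\alpha_2^{new}$ and then uses it to obtain $\alpha_1^{new}$. The box constraint $0 \le \alpha_1, \alpha_2 \le C$, together with the linear equality constraint $\sum_i \alpha_i y_i = 0$, provides a more restrictive constraint on the feasible values for $\alpha_2^{new}$. The boundaries of the feasible region for $\alpha_2$ are as follows:

$$\text{if } y_1 \ne y_2: \quad L = \max(0,\ \alpha_2 - \alpha_1), \quad H = \min(C,\ C + \alpha_2 - \alpha_1) \tag{A.17}$$

$$\text{if } y_1 = y_2: \quad L = \max(0,\ \alpha_1 + \alpha_2 - C), \quad H = \min(C,\ \alpha_1 + \alpha_2) \tag{A.18}$$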


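The second derivative of the objective function along the diagonal line can be expressed as:

$$\eta = K(\mathbf{x}_1, \mathbf{x}_1) + K(\mathbf{x}_2, \mathbf{x}_2) - 2K(\mathbf{x}_1, \mathbf{x}_2) \tag{A.19}$$

Under normal circumstances, the objective function will be positive definite, there will be a minimum along the direction of the linear equality constraint, and $\eta$ will be greater than zero. In this case, SMO computes the minimum along the direction of the constraint:

$$\alpha_2^{new} = \alpha_2 + \frac{y_2(E_1 - E_2)}{\eta} \tag{A.20}$$

where $E_i$ is the prediction error on the $i$th training example, that is, the current SVM output for $\mathbf{x}_i$ minus the label $y_i$. As a next step, the constrained minimum is found by clipping the unconstrained minimum to the ends of the line segment:

$$\alpha_2^{new,clipped} = \begin{cases} H & \text{if } \alpha_2^{new} \ge H; \\ \alpha_2^{new} & \text{if } L < \alpha_2^{new} < H; \\ L & \text{if } \alpha_2^{new} \le L. \end{cases} \tag{A.21}$$

Now, let $s = y_1 y_2$. The value of $\alpha_1^{new}$ is computed from the new $\alpha_2^{new,clipped}$:

$$\alpha_1^{new} = \alpha_1 + s(\alpha_2 - \alpha_2^{new,clipped}) \tag{A.22}$$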
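
The analytic update of one pair of multipliers, Eqs. (A.17) to (A.22), can be sketched in Python as follows. The sketch follows the published form of SMO in [163]; the function name, the argument order and the decision to leave the pair unchanged in the degenerate cases (non-positive second derivative or an empty interval) are assumptions made to keep the example short.

```python
from typing import Tuple


def smo_pair_update(alpha1: float, alpha2: float, y1: int, y2: int,
                    E1: float, E2: float,
                    k11: float, k12: float, k22: float,
                    C: float) -> Tuple[float, float]:
    """One analytic SMO step; returns (alpha1_new, alpha2_new_clipped)."""
    # Ends of the feasible line segment for alpha2.
    if y1 != y2:
        low = max(0.0, alpha2 - alpha1)
        high = min(C, C + alpha2 - alpha1)
    else:
        low = max(0.0, alpha1 + alpha2 - C)
        high = min(C, alpha1 + alpha2)

    # Second derivative of the objective along the constraint line.
    eta = k11 + k22 - 2.0 * k12
    if eta <= 0.0 or low == high:
        # Degenerate cases are treated separately in [163]; this sketch skips them.
        return alpha1, alpha2

    # Unconstrained minimum, then clipping to [low, high].
    alpha2_new = alpha2 + y2 * (E1 - E2) / eta
    alpha2_new = min(high, max(low, alpha2_new))

    # alpha1 follows from the linear equality constraint.
    s = y1 * y2
    alpha1_new = alpha1 + s * (alpha2 - alpha2_new)
    return alpha1_new, alpha2_new


# Made-up pair: x1 = (1, 1) with y1 = +1 and x2 = (-1, -1) with y2 = -1, linear kernel,
# starting from alpha = 0 and a zero output, so E1 = -1 and E2 = +1.
a1, a2 = smo_pair_update(alpha1=0.0, alpha2=0.0, y1=1, y2=-1,
                         E1=-1.0, E2=1.0, k11=2.0, k12=-2.0, k22=2.0, C=1.0)
print(a1, a2)
```

In a full solver this step would be repeated over many pairs chosen by a selection heuristic, with the errors refreshed after each update; that outer loop is not shown here.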

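Solving Eq. (A.9) for the Lagrange multipliers does not determine the threshold $b$ of the SVM, so $b$ must be computed separately. The following thresholds $b_1$, $b_2$ are valid when the new $\alpha_1$, $\alpha_2$ are not at their bounds, because they force the output of the SVM to be $y_1$, $y_2$ when the input is $\mathbf{x}_1$, $\mathbf{x}_2$ respectively:

$$b_1 = E_1 + y_1(\alpha_1^{new} - \alpha_1)K(\mathbf{x}_1, \mathbf{x}_1) + y_2(\alpha_2^{new,clipped} - \alpha_2)K(\mathbf{x}_1, \mathbf{x}_2) + b \tag{A.23}$$

$$b_2 = E_2 + y_1(\alpha_1^{new} - \alpha_1)K(\mathbf{x}_1, \mathbf{x}_2) + y_2(\alpha_2^{new,clipped} - \alpha_2)K(\mathbf{x}_2, \mathbf{x}_2) + b \tag{A.24}$$

Equations (A.23) and (A.24) follow the sign convention of [163], in which the threshold is subtracted from the kernel expansion in the SVM output. When both $b_1$ and $b_2$ are valid, they are equal. When both new Lagrange multipliers are at bound and $L$ is not equal to $H$, all thresholds in the interval between $b_1$ and $b_2$ are consistent with the Karush-Kuhn-Tucker conditions, which are necessary and sufficient conditions for an optimal point of a positive definite QP problem. In this case, SMO chooses the threshold to be halfway between $b_1$ and $b_2$ [163].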
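
The threshold update of Eqs. (A.23) and (A.24) can be sketched in Python as follows. The formulas are written in the convention of [163], where the SVM output is the kernel expansion minus the threshold; the function name, the argument order and the example values are assumptions made for illustration.

```python
def smo_threshold(b: float, E1: float, E2: float, y1: int, y2: int,
                  alpha1_old: float, alpha2_old: float,
                  alpha1_new: float, alpha2_new: float,
                  k11: float, k12: float, k22: float, C: float) -> float:
    """Return the updated threshold after one SMO pair update."""
    d1 = y1 * (alpha1_new - alpha1_old)
    d2 = y2 * (alpha2_new - alpha2_old)
    b1 = E1 + d1 * k11 + d2 * k12 + b
    b2 = E2 + d1 * k12 + d2 * k22 + b
    if 0.0 < alpha1_new < C:
        return b1  # valid because alpha1 is not at a bound
    if 0.0 < alpha2_new < C:
        return b2  # valid because alpha2 is not at a bound
    return 0.5 * (b1 + b2)  # both at bound: take the midpoint of the interval


# Made-up pair x1 = (1, 1), y1 = +1 and x2 = (-1, -1), y2 = -1 with a linear kernel,
# updated from alpha = (0, 0) to alpha = (0.25, 0.25) with C = 1.
b_interior = smo_threshold(b=0.0, E1=-1.0, E2=1.0, y1=1, y2=-1,
                           alpha1_old=0.0, alpha2_old=0.0,
                           alpha1_new=0.25, alpha2_new=0.25,
                           k11=2.0, k12=-2.0, k22=2.0, C=1.0)

# Same pair with C = 0.1, where both new multipliers end at the bound.
b_at_bound = smo_threshold(b=0.0, E1=-1.0, E2=1.0, y1=1, y2=-1,
                           alpha1_old=0.0, alpha2_old=0.0,
                           alpha1_new=0.1, alpha2_new=0.1,
                           k11=2.0, k12=-2.0, k22=2.0, C=0.1)
print(b_interior, b_at_bound)
```

In the first call both candidate thresholds coincide, as the text states for multipliers that are not at bound; in the second call they differ and the midpoint is returned.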


REFERENCES

[1] K. Holmberg, A. Helle, and J. Halme, "Prognostics for Industrial Machinery Availability," Maintenance, Condition Monitoring and Diagnostics, POHTO 2005 International Seminar, 2005.

[2] C. P. Henry, "Turbomachinery Condition Monitoring and Failure Prognosis," Sound and Vibration, vol. 41, p. 10, 2007.

[3] K. Komonen, "A cost model of industrial maintenance for profitability analysis and benchmarking," International Journal of Production Economics, vol. 79, pp. 5-31, 2002.

[4] R. H. P. M. Arts, G. M. Knapp, and L. J. Mann, "Some aspects of measuring maintenance performance in the process industry," Journal of Quality in Maintenance Engineering, vol. 4, pp. 6-11, 1998.

[5] P. De Groote, "Maintenance performance analysis: a practical approach," Journal of Quality in Maintenance Engineering, vol. 1, pp. 4-24, 1995.

[6] R. Kothamasu, S. H. Huang, and W. H. VerDuin, "System health monitoring and prognostics - a review of current paradigms and practices," International Journal of Advanced Manufacturing Technology, vol. 28, pp. 1012-1024, 2006.

[7] J. Moubray, Reliability centered maintenance. New York: Industrial Press Inc., 1997.

[8] R. C. M. Yam, P. W. Tse, L. Li, and P. Tu, "Intelligent Predictive Decision Support System for Condition-Based Maintenance," International Journal of Advanced Manufacturing Technology, vol. 17, pp. 383-391, 2001.


[9] "An Operations and Maintenance Information Open System Alliance" Available at: http://www.mimosa.org/.

[10] A. K. S. Jardine, D. Lin, and D. Banjevic, "A review on machinery diagnostics and prognostics implementing condition-based maintenance," Mechanical Systems and Signal Processing, vol. 20, pp. 1483-1510, 2006.

[11] M. Milfelner, F. Cus, and J. Balic, "An overview of data acquisition system for cutting force measuring and optimization in milling," Journal of Materials Processing Technology AMPT/AMME05 Part 2, pp. 1281-1288, 2005.

[12] C. D. O'Donoghue and J. G. Prendergast, "Implementation and benefits of introducing a computerised maintenance management system into a textile manufacturing company," Journal of Materials Processing Technology, pp. 226-232, 2004.

[13] A. Godot, P. Villard, and A. Savournin, "Implementation of a computerized maintenance management system," Computer Standards & Interfaces, vol. 20, p. 427, 1999.

[14] A. C. C. Tan and J. Mathew, "The Adaptive Noise Cancellation and Blind Deconvolution Techniques for Detection of Rolling Elements Bearing Faults - A Comparison," in ACSIM Proceedings, 2002.

[15] R. Xu and C. Kwan, "Robust Isolation of Sensor Failures," Asian Journal of Control, vol. 5, pp. 12-23, 2003.

[16] T. Burgess and L. Shimbel, "What is the prognosis on your maintenance program?," E&MJ: Engineering & Mining Journal, vol. 196, p. 32, 1995.

[17] S. Poyhonen, P. Jover, and H. Hyotyniemi, "Signal processing of vibrations for condition monitoring of an induction motor," in ISCCSP: 2004 First International Symposium on Control, Communications and Signal Processing, New York, 2004, pp. 499–502.

[18] Y. M. Zhan and A. K. S. Jardine, "Adaptive autoregressive modeling of non-stationary vibration signals under distinct gear states. Part 1: modeling," Journal of Sound and Vibration, vol. 286, pp. 429-450, 2005.


[19] D. C. Baillie and J. Mathew, "A comparison of autoregressive modeling techniques for fault diagnosis of rolling element bearings," Mechanical Systems and Signal Processing, vol. 10, pp. 1-17, 1996.

[20] M. J. E. Salami and S. N. Sidek, "Parameter estimation of multicomponent transient signals using deconvolution and arma modelling techniques," Mechanical Systems and Signal Processing, vol. 17, pp. 1201-1218, 2003.

[21] Y. Li, S. Billington, C. Zhang, T. Kurfess, S. Danyluk, and S. Liang, "Adaptive prognostics for rolling element bearing condition," Mechanical Systems and Signal Processing, vol. 13, pp. 103-113, 1999.

[22] D. Kocur and R. Stanko, "Order bispectrum: A new tool for reciprocated machine condition monitoring," Mechanical Systems and Signal Processing, vol. 14, pp. 871-890, 2000.

[23] J. T. Kim and R. H. Lyon, "Cepstral analysis as a tool for robust processing, deverberation and detection of transients," Mechanical Systems and Signal Processing, vol. 6, pp. 1-15, 1992.

[24] Y. Li, J. Shiroishi, S. Danyluk, T. Kurfess, and S. Y. Liang, "Bearing fault detection via high frequency resonance technique and adaptive line enhancer," in Biennial Conference on Reliability, Stress Analysis and Failure Prevention (RSAFP), 1997.

[25] W. J. Wang and P. D. McFadden, "Application of wavelets to gearbox vibration signals for fault detection," Journal of Sound and Vibration, vol. 192, pp. 927-939, 1996.

[26] G. Y. Luo, D. Osypiw, and M. Irle, "On-line vibration analysis with fast continuous wavelet algorithm for condition monitoring of bearing," Journal of Vibration and Control, vol. 9, pp. 931-947, 2003.

[27] G. O. Chandroth and W. J. Staszewski, "Fault detection in internal combustion engines using wavelet analysis," in COMADEM '99, Chipping Norton, pp. 7-15, 1999.

[28] P. Tse, Y. H. Peng, and R. Yam, "Wavelet analysis and envelope detection for rolling element bearing fault diagnosis-their effectiveness and flexibility," Transactions of the American Society of Mechanical Engineers, Journal of Vibration and Acoustics, vol. 123, pp. 303–310, 2001.


[29] G. Dalpiaz and A. Rivola, "Condition monitoring and diagnostics in automatic machines: Comparison of vibration analysis techniques," Mechanical Systems and Signal Processing, vol. 11, pp. 53-73, 1997.

[30] H. G. Park and M. Zak, "Gray-box approach for fault detection of dynamical systems," Journal of Dynamic Systems, Measurement and Control, vol. 125, pp. 451-454, 2003.

[31] J. Ma and C. J. Li, "Detection of localized defect in rolling element bearing via composite hypothesis test," Mechanical Systems and Signal Processing, vol. 9, pp. 63-75, 1995.

[32] M. Nyberg, "A General framework for fault diagnosis based on statistical hypothesis testing," in Twelfth International Workshop on Principles of Diagnosis (DX2001), Via Lattea, Italian Alps, pp. 135-142, 2001.

[33] D. L. Iverson, "Inductive System Health Monitoring," in International Conference on Artificial Intelligence, IC-AI 2004, Las Vegas, Nevada, USA, pp. 21-24, 2004.

[34] V. A. Skormin, L. J. Popyack, V. I. Gorodetski, M. L. Araiza, and J. D. Michel, "Application of cluster analysis in diagnostic related problems," in IEEE Aerospace Conference, Snowmass at Aspen, USA, pp. 161-168, 1999.

[35] M. Artes, L. D. Castillo, and J. Perez, "Failure prevention and diagnosis in machine elements using cluster," in Tenth International Congress on Sound and Vibration, Stockholm, Sweden, pp. 1197-1203, 2003.

[36] J. Schurmann, Pattern Recognition: A Unified View of Statistical and Neural Approaches. New York: Wiley, 1996.

[37] H. Ding, X. Gui, and S. Yang, "An approach to state recognition and knowledge based diagnosis for engines," Mechanical Systems and Signal Processing, vol. 5, pp. 257–266, 1991.

[38] X. Lou and K. A. Loparo, "Bearing fault diagnosis based on wavelet transform and fuzzy inference," Mechanical Systems and Signal Processing, vol. 18, pp. 1077-1095, 2004.

[39] M.-C. Pan, P. Sas, and H. V. Brussel, "Machine condition monitoring using signal classification techniques," Journal of Vibration and Control, vol. 9, pp. 1103-1120, 2003.


[40] W. J. Staszewski, K. Worden, and G. R. Tomlinson, "Time–frequency analysis in gearbox fault detection using the Wigner–Ville distribution and pattern recognition," Mechanical Systems and Signal Processing, vol. 11, pp. 673–692, 1997.

[41] C. K. Mechefske and J. Mathew, "Fault detection and diagnosis in low speed rolling element bearing. Part II: The use of nearest neighbour classification," Mechanical Systems and Signal Processing, vol. 6, pp. 309–316, 1992.

[42] Q. Sun, P. Chen, D. Zhang, and F. Xi, "Pattern recognition for automatic machinery fault diagnosis," Journal of Vibration and Acoustics, Transactions of the ASME, vol. 126, pp. 307–316.

[43] L. B. Jack and A. K. Nandi, "Fault detection using support vector machine and artificial neural network, augmented by genetic algorithm," Mechanical Systems and Signal Processing, vol. 16, pp. 373-390, 2002.

[44] S. Pöyhönen, M. Negrea, A. Arkkio, H. Hyotyniemi, and H. Koivo, "Coupling pairwise support vector machines for fault classification," Control Engineering Practice, vol. 13, pp. 759-769, 2005.

[45] J. Sun, G. S. Hong, M. Rahman, and Y. S. Wong, "The application of nonstandard support vector machine in tool condition monitoring system," in 2nd IEEE International Workshop on Electronic Design, Test and Applications, pp. 1-6, 2004.

[46] J. Sun, M. Rahman, Y. S. Wong, and G. S. Hong, "Multiclassification of tool wear with support vector machine by manufacturing loss consideration," International Journal Machine Tools & Manufacture vol. 44, pp. 1179-1187, 2004.

[47] D. M. J. Tax, A. Ypma, and R. P. W. Duin, "Pump failure determination using support vector data description," Lecture Notes in Computer Science, pp. 415-425, 1999.

[48] F. He and W. Shi, "WPT-SVMs Based Approach for Fault Detection of Valves in Reciprocating Pumps," in Proceedings of the American Control Conference, 2002.

[49] S. M. Namburu, S. Chigusa, D. Prokhorov, Q. Liu, C. Kihoon, and K. Pattipati, "Application of an Effective Data-Driven Approach to Real-time Fault Diagnosis in Automotive Engines," in Aerospace Conference, 2007 IEEE, pp. 1-9, 2007.

[50] M. L. Fugate, H. Sohn, and C. R. Farrar, "Vibration-based damage detection using statistical process control," Mechanical Systems and Signal Processing, vol. 15, pp. 707-721, 2001.

[51] A. Pryor, M. Mosher, and D. Lewicki, "The Application of Time-Frequency Methods to HUMS," in American Helicopter Society, Washington, D.C., 2001.

[52] I. Y. Tumer and E. M. Huff, "Analysis of Triaxial Vibration Data for Health Monitoring of Helicopter Gearboxes," Journal of Vibration and Acoustics, vol. 125, pp. 120-128, 2003.

[53] B. Li, M.-Y. Chow, Y. Tipsuwan, and J. C. Hung, "Neural-network-based motor rolling bearing fault diagnosis," IEEE Transactions on Industrial Electronics, vol. 47, pp. 1060–1069, 2000.

[54] Y. Fan and C. J. Li, "Diagnostic rule extraction from trained feedforward neural networks," Mechanical Systems and Signal Processing, vol. 16, pp. 1073–1081, 2002.

[55] E. C. Larson, D. P. Wipf, and B. E. Parker, "Gear and bearing diagnostics using neural network-based amplitude and phase demodulation," in the 51st Meeting of the Society for Machinery Failure Prevention Technology, Virginia Beach, VA, pp. 511–521, 1997.

[56] M. J. Roemer, C. Hong, and S. H. Hesler, "Machine health monitoring and life management using finite element-based neural networks," Journal of Engineering for Gas Turbines and Power, Transactions of the ASME, vol. 118, pp. 830–835, 1996.

[57] B. A. Paya, I. I. Esat, and M. N. M. Badi, "Artificial neural network based fault diagnostics of rotating machinery using wavelet transforms as a preprocessor," Mechanical Systems and Signal Processing, vol. 11, pp. 751–765, 1997.

[58] B. Samanta and K. R. Al-Balushi, "Artificial neural network based fault diagnostics of rolling element bearings using time-domain features," Mechanical Systems and Signal Processing, pp. 317–328, 2003.


[59] J. K. Spoerre, "Application of the cascade correlation algorithm (CCA) to bearing fault classification problems," Computers in Industry, vol. 32, pp. 295–304, 1997.

[60] D. W. Dong, J. J. Hopfield, and K. P. Unnikrishnan, "Neural networks for engine fault diagnostics," in Neural Networks for Signal Processing VII, New York, pp. 636–644, 1997.

[61] C. J. Li and T. Y. Huang, "Automatic structure and parameter training methods for modeling of mechanical systems by recurrent neural networks," Applied Mathematical Modelling, vol. 23, pp. 933–944, 1999.

[62] P. Deuszkiewicz and S. Radkowski, "On-line condition monitoring of a power transmission unit of a rail vehicle," Mechanical Systems and Signal Processing, vol. 17, pp. 1321–1334, 2003.

[63] C.-C. Wang and G.-P. J. Too, "Rotating machine fault detection based on HOS and artificial neural networks," Journal of Intelligent Manufacturing, vol. 13, pp. 283–293, 2002.

[64] R. M. Tallam, T. G. Habetler, and R. G. Harley, "Self-commissioning training algorithms for neural networks with applications to electric machine fault diagnostics," IEEE Transactions on Power Electronics vol. 17, pp. 1089–1095, 2002.

[65] H. Sohn, K. Worden, and C. R. Farrar, "Statistical damage classification under changing environmental and operation conditions," Journal of Intelligent Material Systems and Structures, vol. 13, pp. 561-574, 2002.

[66] M. Schwabacher, "Machine Learning for Rocket Propulsion Health Monitoring," in Proceedings of the SAE World Aerospace Congress, Dallas, TX, USA, 2005.

[67] N. Oza, K. Tumer, I. Tumer, and E. Huff, "Classification of Aircraft Maneuvers for Fault Detection," Lecture Notes in Computer Science, vol. 2709, pp. 375-384, 2003.

[68] D. L. Hall, R. J. Hansen, and D. C. Lang, "The negative information problem in mechanical diagnostics," Transactions of the ASME Journal of Engineering for Gas Turbines and Power, vol. 119, pp. 370–377, 1997.


[69] M. Stanek, M. Morari, and K. Frohlich, "Model-aided diagnosis: An inexpensive combination of model-based and case-based condition assessment," IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, pp. 137–145, 2001.

[70] H. Park, R. Mackey, M. James, M. Zak, M. Kynard, J. Sebghati, and W. Greene, "Analysis of Space Shuttle Main Engine Data Using Beacon-based Exception Analysis for Multi-Missions," in Proceedings of the IEEE Aerospace Conference, New York, USA, pp. 2835-2844, 2002.

[71] A. Srivastava, "Discovering System Health Anomalies Using Data Mining Techniques," in Proceedings of the Joint Army Navy NASA Air Force Conference on Propulsion, Charleston, SC, USA, 2005.

[72] A. Bakhtazad, A. Palazoglu, and J. A. Romagnoli, "Detection and classification of abnormal process situations using multidimensional wavelet domain hidden Markov trees," Computers and Chemical Engineering, vol. 24, p. 769, 2000.

[73] P. Smyth, "Hidden Markov Models for fault detection in dynamic systems," Pattern Recognition, vol. 27, 1994.

[74] B. C. Williams and P. P. Nayak, "A Model-based Approach to Reactive Self-Configuring Systems," in Proceedings of the National Conference on Artificial Intelligence, Menlo Park, CA, USA, 1996.

[75] G. Aaseng, K. Cavanaugh, and S. Deb, "An Intelligent Remote Monitoring Solution for the International Space Station," in Proceedings of the IEEE Aerospace Conference, New York, USA, 2003.

[76] I. Howard, S. Jia, and J. Wang, "The dynamic modelling of a spur gear in mesh including friction and a crack," Mechanical Systems and Signal Processing, vol. 15, pp. 831-838, 2001.

[77] W. Y. Wang, "Towards dynamic model-based prognostics for transmission gears," in Component and Systems Diagnostics, Prognostics, and Health Management II, Bellingham, pp. 157-167, 2002.

[78] D. C. Baillie and J. Mathew, "Nonlinear model-based fault diagnosis of bearings," in Proceedings of an International Conference on Condition Monitoring, Swansea, UK, pp. 241-252, 1994.


[79] K. A. Loparo, A. H. Falah, and M. L. Adams, "Model-based fault detection and diagnosis in rotating machinery," in Proceedings of the Tenth International Congress on Sound and Vibration, Stockholm, Sweden, pp. 1299-1306, 2003.

[80] C. H. Oppenheimer and K. A. Loparo, "Physically based diagnosis and prognosis of cracked rotor shafts," Component and Systems Diagnostics, Prognostics, and Health Management II, vol. 4733, pp. 122-132, 2002.

[81] A. S. Sekhar, "Model-based identification of two cracks in a rotor system," Mechanical Systems and Signal Processing, vol. 18, pp. 977-983, 2004.

[82] D. J. Inman, C. R. Farrar, V. L. Junior, and V. S. Junior, Damage Prognosis: For Aerospace, Civil and Mechanical Systems: John Wiley and Sons, 2005.

[83] ISO 13381-1, "Condition monitoring and diagnostics of machines - Prognostics - Part 1: General guidelines," International Organization for Standardization, 2004.

[84] J. Neter, M. H. Kutner, C. J. Nachtsheim, and W. Wasserman, Applied Linear Statistical Models: Irwin, 1996.

[85] Y. Li, C. Zhang, T. R. Kurfess, S. Danyluk, and S. Y. Liang, "Diagnostics and prognostics of a single surface defect on roller bearings," in Proceedings of the Institution of Mechanical Engineers, pp. 1173-1185, 2000.

[86] J. Yan, M. Koc, and J. Lee, "A prognostic algorithm for machine performance assessment and its application," Production Planning and Control Engineering Practice, vol. 15, pp. 796-801, 2004.

[87] R. Jardim-Goncalves, M. Martins-Barata, J. Assis-Lopes, and A. Steiger-Garcao, "Application of stochastic modelling to support predictive maintenance for industrial environments," IEEE International Conference on Systems Man and Cybernetics, vol. 1, pp. 117-122, 1996.

[88] R. Patankar and A. Ray, "State-space modeling of fatigue crack growth in ductile alloys," Engineering Fracture Mechanics, vol. 66, pp. 129-151, 2000.

[89] W. Wang and A. Wong, "Autoregressive model based gear fault diagnosis," Journal of Vibration and Acoustics, vol. 124, pp. 172-179, 2002.

[90] Q. Zhang, M. Basseville, and A. Benveniste, "Early warning of slight changes in systems," Automatica, vol. 30, pp. 95-114, 1994.


[91] C. J. Lu and W. Q. Meeker, "Using degradation measures to estimate a time-to-failure distribution," Technometrics, vol. 35, pp. 161-174, 1993.

[92] V. Chan and W. Q. Meeker, "Estimation of degradation-based reliability in out-door environments," Iowa State University.

[93] C. Li and A. Ray, "Neural network representation of fatigue damage dynamics," Journal of Fatigue Damage Dynamics, pp. 126-133, 1995.

[94] Y. Shao and K. Nezu, "Prognosis of remaining bearing life using neural networks," in Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering, pp. 217-230, 2000.

[95] P. Wang and G. Vachtsevanos, "Fault prognosis using dynamic wavelet neural networks," Maintenance and Reliability Conference. MARCON, 1999.

[96] N. Gebraeel, M. Lawley, R. Liu, and V. Parmeshwaran, "Residual life predictions from vibration-based degradation signals: a neural network approach," IEEE Transactions on Industrial Electronics, vol. 51, pp. 694-700, 2004.

[97] R. Huang, L. Xi, X. Li, C. Richard Liu, H. Qiu, and J. Lee, "Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods," Mechanical Systems and Signal Processing, vol. In Press, Corrected Proof.

[98] P. Wang and G. Vachtsevanos, "Fault prognostics using dynamic wavelet neural networks," AI EDAM, vol. 15, pp. 349-365, 2001.

[99] P. Tse and D. Atherton, "Prediction of machine deterioration using vibration based fault trends and recurrent neural networks," Transactions of the ASME: Journal of Vibration and Acoustics, vol. 121, pp. 355–362, 1999.

[100] R. C. M. Yam, P. W. Tse, L. Li, and P. Tu, "Intelligent Predictive Decision Support System for CBM," The International Journal of Advanced Manufacturing Technology, vol. 17, pp. 383-391, 2001.

[101] Y.-L. Dong, G. Yu-Jiong, and K. Yang, "Research on the Condition Based Maintenance Decision of Equipment in Power Plant," in Machine Learning and Cybernetics, 2004.


[102] R. Ramesh, M. A. Mannan, A. N. Poo, and S. S. Keerthi, "Thermal error measurement and modeling in machine tools, Part II. Hybrid Bayesian network-support vector machine model," International Journal of Machine Tools & Manufacture, vol. 43, pp. 405-419, 2003.

[103] C. S. Byington, M. Watson, and D. Edwards, "Dynamic Signal Analysis and Neural Network Modeling for Life Prediction of Flight Control Actuators," in Proceedings of the American Helicopter Society 60th Annual Forum, Alexandria, VA, USA.

[104] C. S. Byington, M. Watson, and D. Edwards, "Data-Driven Neural Network Methodology to Remaining Life Predictions for Aircraft Actuator Components," in IEEE Aerospace Conference, Big Sky, MT, USA, 2004.

[105] W. Q. Wang, M. F. Golnaraghi, and F. Ismail, "Prognosis of machine health condition using neuro-fuzzy systems," Mechanical Systems and Signal Processing, vol. 18, pp. 813-831, 2004.

[106] J. R. Bock, T. Brotherton, P. Grabill, D. Gass, and J. A. Keller, "On False Alarm Mitigation," in Proceedings of the IEEE Aerospace Conference, New York, USA.

[107] D. Clifton, "Condition Monitoring of Gas-Turbine Engines," Department of Engineering Science, University of Oxford, 2006.

[108] L. Déchamp, A. Dutech, T. Montroig, X. Qian, D. Racoceanu, I. Rasovska, B. Brézillon, F. Charpillet, J. Y. Jaffray, N. Moine, B. S. Morello, Müller, G. Nguengang, N. Palluat, and L. Pelissier, "On the Use of Artificial Intelligence for Prognosis and Diagnosis in the PROTEUS E-maintenance platform," 2005.

[109] C. Cempel, H. G. Natke, and M. Tabaszewski, "A passive diagnostic experiment with ergodic properties," Mechanical Systems and Signal Processing, vol. 11, pp. 107-117, 1997.

[110] G. Gottlieb, "Failure Distributions of Shock Models," Journal of Applied Probability, vol. 17, pp. 745-752, 1980.

[111] J. G. Shanthikumar and U. Sumita, "General Shock Models Associated with Correlated Renewal Sequences," Journal of Applied Probability, vol. 20, pp. 600-614, 1983.


[112] J. Qiu, B. B. Seth, S. Y. Liang, and C. Zhang, "Damage mechanics approach for bearing lifetime prognostics," Mechanical Systems and Signal Processing, vol. 16, pp. 817-829, 2002.

[113] D. E. Adams, "Nonlinear damage models for diagnosis and prognosis in structural dynamic systems," SPIE Conference Proceeding, vol. 4733, pp. 180-191, 2002.

[114] D. Chelidze, "Multimode damage tracking and failure prognosis in electromechanical system," SPIE Conference Proceeding, vol. 4733, pp. 1-12, 2002.

[115] D. Banjevic and A. K. S. Jardine, "Calculation of reliability function and remaining useful life for a Markov failure time process," IMA Journal of Management Mathematics, vol. 17, pp. 115-130, 2006.

[116] R. B. Chinnam and P. Baruah, "Autonomous diagnostics and prognostics through competitive learning driven HMM-based clustering," in Proceedings of the International Joint Conference on Neural Networks, pp. 2466-2471, 2003.

[117] D. Swanson, "A general prognostic tracking algorithm for predictive maintenance," in Proceedings of the IEEE Aerospace Conference, vol. 6, pp. 2971-2969, 2001.

[118] A. Ray and S. Tangirala, "Stochastic modeling of fatigue crack dynamics for on-line failure prognostics," IEEE Transactions on Control Systems Technology, vol. 4, pp. 443-451, 1996.

[119] A. Ray and S. Tangirala, "A nonlinear stochastic model of fatigue crack dynamics," Probabilistic Engineering Mechanics, vol. 12, pp. 33-40, 1997.

[120] M. Chao, "Degradation analysis and related topics: Some thoughts and a review," in Proceedings of the National Science Council ROC(A), pp. 555-565, 1999.

[121] D. Kumar and U. Westberg, "Maintenance scheduling under age replacement policy using proportional hazards model and TTT-plotting," European Journal of Operational Research, vol. 99, pp. 507-515, 1997.


[122] S. Gasmi, C. E. Love, and W. Kahle, "A general repair, proportional-hazards, framework to model complex repairable systems," IEEE Transactions on Reliability, vol. 52, pp. 26-32, 2003.

[123] T. E. Tallian, "Data fitted bearing life prediction model for variable operating conditions," Tribology Transactions, vol. 42, p. 241, 1999.

[124] A. K. S. Jardine, D. Lin, and D. Banjevic, "A review on machinery diagnostics and prognostics implementing condition-based maintenance," Mechanical Systems and Signal Processing, vol. 20, pp. 1483-1510, 2006.

[125] P.-J. Vlok, M. Wnek, and M. Zygmunt, "Utilising statistical residual life estimates of bearings to quantify the influence of preventive maintenance actions," Mechanical Systems and Signal Processing, vol. 18, pp. 833-847, 2004.

[126] D. Banjevic and A. K. S. Jardine, "Calculation of reliability function and remaining useful life for a Markov failure time process," IMA Journal of Management Mathematics, 2005.

[127] "Prognostics Center of Excellence Data Repository web site," NASA Ames Research Center.

[128] C. Kwan, X. Zhang, R. Xu, and L. Haynes, "A Novel Approach to Fault Diagnostics and Prognostics," in IEEE International Conference on Robotics & Automation, Taipei, Taiwan, pp. 14-19, 2003.

[129] G. Vachtsevanos, F. Lewis, M. Roemer, A. Hess, and B. Wu, Intelligent fault diagnosis and prognosis for engineering systems. New Jersey: John Wiley & Sons, Inc., 2006.

[130] H. Qiu, J. Lee, D. Djurdjanovic, and J. Ni, "Advances on prognostics for intelligent maintenance systems," 2005.

[131] J. B. Leger, B. Iung, and G. Morel, "Integrated design of prognosis, diagnosis and monitoring processes for proactive maintenance of manufacturing systems," in Systems, Man, and Cybernetics, 1999. IEEE SMC '99 Conference Proceedings. 1999 IEEE International Conference, vol.4, pp. 492-498, 1999.

[132] J. Lee, J. Ni, D. Djurdjanovic, H. Qiu, and H. Liao, "Intelligent prognostics tools and e-maintenance," Computers in Industry, vol. 57, pp. 476-489, 2006.


[133] C. S. Byington, M. J. Roemer, and M. J. Watson, "Prognostic enhancements to diagnostic systems (PEDS) applied to shipboard power generation systems," in Proceedings of ASME Turbo Expo 2004 Power for Land, Sea, and Air, Vienna, Austria, 2004.

[134] J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, and V. Vapnik, "Feature selection for SVMs," in Advances in Neural Information Processing Systems, pp. 526-532, 2000.

[135] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157-1182, 2003.

[136] D. Djurdjanovic, J. Lee, and J. Ni, "Watchdog Agent-an infotronics-based prognostics approach for product performance degradation assessment and prediction," Advanced Engineering Informatics, vol. 17, pp. 109-125, 2003.

[137] M. Pal and P. M. Mather, "Assessment of the effectiveness of support vector machines for hyperspectral data," Future Generation Computer Systems, vol. 20, pp. 1215-1225, 2004.

[138] G. Niu, J. D. Son, A. Widodo, B. S. Yang, D. H. Hwang, and D. S. Kang, "A comparison of classifier performance for fault diagnosis of induction motor using multi-type signals," Technical Note of Structural Health Monitoring, vol. 6, pp. 215-229, 2007.

[139] Y. Weizhong and X. Feng, "Jet engine gas path fault diagnosis using dynamic fusion of multiple classifiers," in Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on, pp. 1585-1591, 2008.

[140] G. Niu, T. Han, B. S. Yang, and A. C. C. Tan, "Multi-agent decision fusion for motor fault diagnosis," Mechanical Systems and Signal Processing, vol. 21, 2007.

[141] V. N. Vapnik, The Nature of Statistical Learning Theory. New York: Springer, 1995.

[142] L. M. He, F. S. Kong, and Z. Q. Shen, "Multiclass SVM based on land cover classification with multisource data," in Fourth International Conference on Machine Learning and Cybernetics, pp. 3541-3545, 2005.


[143] J. C. Platt, N. Cristianini, and J. Shawe-Taylor, "Large margin DAG's for multiclass classification," Advances in Neural Information Processing Systems, vol. 12, pp. 547-553, 2000.

[144] C. W. Hsu and C. J. Lin, "A comparison of methods for multiclass support vector machines," IEEE Transactions on Neural Networks, vol. 13, pp. 415-425, 2002.

[145] J. I. Taylor, The vibration analysis handbook. Tampa, FL: Vibration Consultants Inc, 1994.

[146] C. Kral, T. G. Habetler, R. G. Harley, F. Pirker, G. Pascoli, H. Oberguggenberger, and C. J. M. Fenz, "A Comparison of Rotor Fault Detection Techniques with Respect to the Assessment of Fault Severity," in SDEMPED 2003, Symposium on Diagnostics for Electric Machines, Power Electronics and Drives, Atlanta, GA, USA, pp. 265-270, 2003.

[147] G. G. Acosta, C. J. Verucchi, and E. R. Gelso, "A current monitoring system for diagnosing electrical failures in induction motors," Mechanical Systems and Signal Processing, vol. 20, pp. 953-965, 2006.

[148] M. Abuzaid, M. Eleshaky, and M. Zedan, "Effect of partial rotor-to-stator rub on shaft vibration," Journal of Mechanical Science and Technology, vol. 23, pp. 170-182, 2009.

[149] W. W. Hwang and B. S. Yang, "Fault diagnosis of rotating machinery using multi-class support vector machines," Korea Society for Noise and Vibration Engineering, vol. 14, pp. 1233-1240, 2004.

[150] S. Knerr, L. Personnaz, and G. Dreyfus, Single-layer learning revisited: a stepwise procedure for building and training a neural network. New York: Springer-Verlag, 1990.

[151] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, pp. 123-140, 1996.

[152] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and regression trees. Belmont, California: Wadsworth, 1984.

[153] T. K. Ho, "The random subspace method for constructing decision forests," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 832-844, 1998.


[154] C. M. Bishop, Neural Networks for Pattern Recognition. Oxford: Clarendon Press, 1995.

[155] S. Chen, X. Hong, and C. J. Harris, "Orthogonal Forward Selection for Constructing the Radial Basis Function Network with Tunable Nodes," in Advances in Intelligent Computing, pp. 777-786, 2005.

[156] J. Mandel, "Use of the Singular Value Decomposition in Regression Analysis," The American Statistician, vol. 36, pp. 15-24, 1982.

[157] A. Heng, A. C. C. Tan, J. Mathew, and B. S. Yang, "Machine prognosis with full utilization of truncated lifetime data," in 2nd World Congress on Engineering Asset Management and the 4th International Conference on Condition Monitoring, Harrogate, UK, pp. 775-784, 2007.

[158] S. Braun and B. Datner, "Analysis of roller/ball bearing vibrations," Transactions of the ASME, vol. 101, pp. 118-125, 1979.

[159] P. D. McFadden and J. D. Smith, "Model for the vibration produced by a single point defect in a rolling element bearing," Journal of Sound and Vibration, vol. 96, pp. 69-82, 1984.

[160] T. A. Harris, Rolling Bearing Analysis. New York: John Wiley, 1966.

[161] D. Mandic and J. Chambers, Recurrent Neural Networks for Prediction: Learning Algorithms, Architectures and Stability: John Wiley and Sons Ltd., 2001.

[162] N. Gebraeel, M. Lawley, R. Liu, and V. Parmeshwaran, "Residual life predictions from vibration-based degradation signals: a neural network approach," IEEE Transactions on Industrial Electronics, vol. 51, pp. 694-700, 2004.

[163] J. Platt, Fast training of support vector machines using sequential minimal optimization. Cambridge: MIT Press, 1999.

[164] C. W. Hsu, C. C. Chang, and C. J. Lin, "A practical guide to support vector classification," Technical Report, Department of Computer Science and Information Engineering, National Taiwan University, 2005.

[165] D. R. Cox, "Regression Models and Life Tables (with Discussion)," Journal of the Royal Statistical Society, vol. 34, pp. 187-220, 1972.


[166] H. Liao, W. Zhao, and H. Guo, "Predicting remaining useful life of an individual unit using proportional hazards model and logistic regression model," in Reliability and Maintainability Symposium, 2006. RAMS '06. Annual, 2006.

[167] D. Cai, X. He, and J. Han, "Learning with Tensor Representation," University of Illinois, 2006.

[168] A. Widodo and B.-S. Yang, "Support vector machine in machine condition monitoring and fault diagnosis," Mechanical Systems and Signal Processing, vol. 21, pp. 2560-2574, 2007.

[169] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines. Cambridge: Cambridge University Press, 2000.

[170] V. N. Vapnik, Estimation of Dependences Based on Empirical Data. Berlin: Springer Verlag, 1982.

[171] E. Osuna, R. Freund, and F. Girosi, "An improved training algorithm for support vector machines," in Proceedings of IEEE Neural Networks for Signal Processing, pp. 276-285, 1997.
