10/02/2015

20150130 - Unit 50 Condition Monitoring and Fault Diagnosis - Part 02


Scheme of work and assessment plan - update

Date  | Topic (17:45h - 19:15h)                     | Topic (19:30h - 21:00h)       | Assessment
03/02 | Introduction to Unit                        | Failure and breakdown         |
10/02 | Failure and breakdown                       | Monitoring                    |
24/02 | Monitoring                                  | Unit 24 Assignment Review     | Out
03/03 | Data analysis                               | Data analysis                 |
10/03 | Condition monitoring                        | Vibration                     |
17/03 | Leak detection                              | Corrosion and Crack detection |
24/03 | Temperature                                 | Assignment Review             |
31/03 | Easter break                                |                               |
07/04 | Easter break                                |                               |
12/04 | Assessment submission deadline (via Moodle) |                               | In
14/04 | Fault Analysis starts                       |                               |
16/04 | Feedback                                    |                               |

Failure and breakdown

Degradation due to:

• Corrosion

• Cracking

• Fouling

• Wear

• Ageing

• Maloperation

• Environmental effects

• Operational and maintenance considerations

Statistical analysis of failure rates on plant and equipment

Monitoring

Arrangements and measured parameters:

• ‘online’ and ‘offline’ monitoring

• fixed and portable monitoring equipment

• continuous and semi-continuous data recording

• stress analysis

Statistical analysis of failure rates on plant and equipment

A failure is a permanent interruption of a system’s ability to perform a required function under specified operating conditions.

Malfunctions may not result in failures.

Failure rate is the frequency with which an engineered system or component fails, expressed, for example, in failures per hour. It is often denoted by the Greek letter λ (lambda) and is important in reliability engineering.

The failure rate of a system usually depends on time, with the rate varying over the life cycle of the system.

For example, an automobile's failure rate in its fifth year of service may be many times greater than its failure rate during its first year of service. One does not expect to replace an exhaust pipe, overhaul the brakes, or have major transmission problems in a new vehicle.
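As a rough illustration of the definition above, the following sketch estimates an average failure rate as the number of failures divided by the accumulated operating hours. The fleet size, annual hours and failure counts are hypothetical figures chosen only to show a rate that rises between the first and fifth year of service; they are not taken from any data set.

```python
def failure_rate(failures, operating_hours):
    """Average failure rate (failures per hour) over an observation window."""
    if operating_hours <= 0:
        raise ValueError("operating_hours must be positive")
    return failures / operating_hours


# Hypothetical fleet: 100 vehicles, each driven 500 hours per year.
fleet_hours_per_year = 100 * 500.0

# Hypothetical failure counts recorded in year 1 and year 5 of service.
rate_year_1 = failure_rate(failures=2, operating_hours=fleet_hours_per_year)
rate_year_5 = failure_rate(failures=15, operating_hours=fleet_hours_per_year)

print(f"year 1: {rate_year_1:.2e} failures/h")
print(f"year 5: {rate_year_5:.2e} failures/h")
print(f"ratio : {rate_year_5 / rate_year_1:.1f}")
```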

Spectrum of Predictive Maintenance

Equipment Category | Equipment Types | Failure Mode | Failure Cause | Detection Method
Rotating Machinery | Pumps, Motors, Compressors, Blowers | Premature Bearing Loss | Excessive Force | Vibration and Lube Analysis
Rotating Machinery | Pumps, Motors, Compressors, Blowers | Lubrication Failure | Over, Under or Improper Lube; Heat and Moisture | Spectrographic & Ferrographic Analysis
Electrical Equipment | Motors, Cable, Starters, Transformers | Insulation Failure | Heat, Moisture | Time/Resistance Tests, I/R Scans, Oil Analysis
Electrical Equipment | Motors, Cable, Starters, Transformers | Corona Discharge | Moisture, Splice Methods | Ultrasound
Heat Transfer Equipment | Exchangers, Condensers | Fouling | Sediment/Material Buildup | Heat Transfer Calculations
Containment and Transfer Equipment | Tanks, Piping, Reactors | Corrosion | Chemical Attack | Corrosion Meters, Thickness Checks
Containment and Transfer Equipment | Tanks, Piping, Reactors | Stress cracks | Metal Fatigue | Acoustic Emission

Failure rate with time: [figures not reproduced]

Infant Mortality Period:

Initially there are a large number of failures, called initial failures or infant mortality. These failures are primarily due to manufacturing defects, such as weak parts, poor soldering, bad assembly, poor fits, etc. Defective units are detected during the initial failure period, which is characterized by a decreasing failure rate. Many manufacturers provide a "debugging" or "burn-in" period for their product, prior to delivery, which helps to eliminate a large proportion of the initial failures and assists in establishing a high level of operational reliability.

Useful Life Period:

After initial failures, for a long period of time of operation, fewer failures are reported but it is difficult to determine their cause. They occur primarily due to changes in the working stresses or environment conditions. It is difficult to predict the amplitude of stress variations and their time of occurrence; thus, the failures during this period of normal operation are classified as random failures. This period of normal operation is characterized by a constant failure rate (constant number of failures per unit time).

Wear-out Period:

As time passes, the units begin to deteriorate due to ageing. A gradual change in the performance of the unit is the result. When the performance goes beyond the permissible limit, the unit fails. This region is called the wear-out region. The changes are irreversible and physico-chemical in nature, and the prediction of wear-out failures is very difficult. In this period, the failure rate increases.

Wear-out failures are primarily due to deterioration of the design strength of the device as a consequence of operation and exposure to environment fluctuations.

Deterioration (degradation) results from a number of common chemical and physical phenomena:

• Corrosion or oxidation

• Insulation breakdown or leakage

• Ionic migration of metal in vacuum or on surface

• Frictional wear or fatigue
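The three periods described above (decreasing, constant and increasing failure rate) can be sketched with a Weibull failure rate, whose shape parameter beta controls the trend. This is one common modelling choice and an assumption here, not the method used in the slides; the scale parameter and the sample times are hypothetical.

```python
def weibull_hazard(t, beta, eta):
    """Weibull failure rate h(t) = (beta / eta) * (t / eta) ** (beta - 1), for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)


ETA = 1000.0                  # hypothetical scale parameter, hours
TIMES = [100.0, 500.0, 900.0]  # hypothetical operating times, hours

# beta < 1: decreasing rate, beta = 1: constant rate, beta > 1: increasing rate.
REGIMES = {
    "infant mortality (beta = 0.5)": 0.5,
    "useful life      (beta = 1.0)": 1.0,
    "wear-out         (beta = 3.0)": 3.0,
}

for label, beta in REGIMES.items():
    rates = [weibull_hazard(t, beta, ETA) for t in TIMES]
    print(label, [f"{r:.2e}" for r in rates])
```

Adding the three rates together gives one simple way to draw a full bathtub-shaped curve.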

In practice, the Mean Time Between Failures (MTBF, 1/λ) is often reported instead of the failure rate.

This is valid and useful if the failure rate may be assumed constant, as is often done for complex units and systems and for electronics, and is a general agreement in some reliability standards (Military and Aerospace).

In this case it relates only to the flat region of the bathtub curve, also called the "useful life period". Because of this, it is incorrect to extrapolate MTBF to give an estimate of the service lifetime of a component, which will typically be much less than suggested by the MTBF due to the much higher failure rates in the "end-of-life wear-out" part of the bathtub curve.

The reason MTBF numbers are preferred is that large positive numbers (such as 2000 hours) are more intuitive and easier to remember than very small numbers (such as 0.0005 per hour).

The MTBF is an important system parameter in systems where the failure rate needs to be managed, in particular for safety systems. The MTBF appears frequently in engineering design requirements, and governs the frequency of required system maintenance and inspections.

In special processes called renewal processes, where the time to recover from failure can be neglected and the likelihood of failure remains constant with respect to time, the failure rate is simply the multiplicative inverse of the MTBF (λ = 1/MTBF).
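The conversion between failure rate and MTBF can be sketched as below, using the 0.0005 per hour and 2000 hour figures quoted above. The second function assumes a constant failure rate, for which the survival probability is R(t) = exp(-t / MTBF); the function names are illustrative only. Under that assumption only about 37% of units are still working at t = MTBF, which is one reason the MTBF should not be read as a service life.

```python
import math


def mtbf_from_rate(failure_rate_per_hour):
    """MTBF in hours as the inverse of a constant failure rate."""
    return 1.0 / failure_rate_per_hour


def reliability_constant_rate(t_hours, mtbf_hours):
    """R(t) = exp(-t / MTBF); valid only while the failure rate is constant."""
    return math.exp(-t_hours / mtbf_hours)


mtbf = mtbf_from_rate(0.0005)                      # about 2000 hours
surviving_at_mtbf = reliability_constant_rate(mtbf, mtbf)

print(f"MTBF: {mtbf:.0f} h")
print(f"fraction surviving to t = MTBF: {surviving_at_mtbf:.3f}")
```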

Another time of interest is the Mean Time To Failure (MTTF). It is the time a device or other product is expected to last in operation. MTTF is one of many ways to evaluate the reliability of pieces of hardware or other technology.

MTTF is extremely similar to MTBF. The difference between them is that while MTBF is used for products that can be repaired and returned to use, MTTF is used for non-repairable products. When MTTF is used as a measure, repair is not an option.

As a metric, MTTF represents how long a product can reasonably be expected to perform in the field based on specific testing. It is important to note, however, that the mean time to failure metrics provided by companies regarding specific products or components may not have been collected by running one unit continuously until failure. Instead, MTTF data is often collected by running many units, even many thousands of units, for a specific number of hours.
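A minimal sketch of how such a fleet test might be turned into an MTTF figure: total accumulated unit-hours divided by the number of failures observed. This point estimate assumes a constant failure rate over the test, and the unit counts, test duration and failure times are hypothetical.

```python
def estimate_mttf(unit_hours, failures):
    """Point estimate of MTTF: total accumulated operating hours / failures.

    unit_hours holds the hours each unit actually ran; a failed unit
    contributes only its hours up to the moment of failure.
    """
    if failures <= 0:
        raise ValueError("at least one failure is needed for a point estimate")
    return sum(unit_hours) / failures


# Hypothetical test: 1000 units on test for 500 h; 4 of them fail early.
survivors = [500.0] * 996
failed = [100.0, 200.0, 300.0, 400.0]

mttf = estimate_mttf(survivors + failed, failures=len(failed))
print(f"estimated MTTF: {mttf:.0f} h")
```

No single unit ran for anything close to the estimated MTTF, which is the point made in the paragraph above.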

One of the main situations where terms like MTTF are extremely important is when hardware pieces or other products are used in mission-critical systems. Here it becomes valuable to know about general reliability for these items. For non-repairable items, MTTF is a statistic that is of great interest to engineers and others assessing these pieces as parts of larger systems.

Random Variables

In general, most problems in reliability engineering deal with quantitative measures, such as the time-to-failure of a component, or qualitative measures, such as whether a component is defective or non-defective. We can then use a random variable X to denote these possible measures.

In the case of times-to-failure, our random variable X is the time-to-failure of the component and can take on an infinite number of possible values in a range from 0 to infinity (since we do not know the exact time a priori). Our component can be found failed at any time after time 0 (e.g., at 12 hours or at 100 hours and so forth), thus X can take on any value in this range. In this case, our random variable X is said to be a continuous random variable. In this reference, we will deal almost exclusively with continuous random variables.

In judging a component to be defective or non-defective, only two outcomes are possible. That is, X is a random variable that can take on one of only two values (let's say defective = 0 and non-defective = 1). In this case, the variable is said to be a discrete random variable.

The Probability Density Function and the Cumulative Distribution Function

The probability density function (PDF) and cumulative distribution function (CDF) are two of the most important statistical functions in reliability and are very closely related. When these functions are known, almost any other reliability measure of interest can be derived or obtained. We will now take a closer look at these functions and how they relate to other reliability measures, such as the reliability function and failure rate.

From probability and statistics, given a continuous random variable X we denote: The probability density function, PDF, as f(x). The cumulative distribution function, CDF, as F(x).

The PDF and CDF give a complete description of the probability distribution of a random variable. The following figure illustrates a PDF:

The PDF-CDF relationship:

If X is a continuous random variable, then the PDF of X is a function, f(x), such that for any two numbers, a and b with a ≤ b:

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

That is, the probability that X takes on a value in the interval [a,b] is the area under the density function from a to b as shown above. The PDF represents the relative frequency of failure times as a function of time.

The CDF is a function, F(x), of a random variable X, and is defined for a number x by:

$$F(x) = P(X \le x) = \int_0^x f(s)\,ds$$

That is, for a number x, F(x) is the probability that the observed value of X will be at most x.

The CDF represents the cumulative values of the PDF. That is, the value of a point on the curve of the CDF represents the area under the curve to the left of that point on the PDF. In reliability, the CDF is used to measure the probability that the item in question will fail before the associated time value, t, and is also called unreliability.
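The two definitions above can be checked numerically. The sketch below assumes an exponential density with a hypothetical rate of 0.001 per hour and uses a simple trapezoidal rule; it shows that the interval probability equals the difference of two CDF values, and that the CDF approaches 1 for large x.

```python
import math

RATE = 0.001  # hypothetical constant failure rate, per hour


def pdf(x):
    """Exponential density f(x) = RATE * exp(-RATE * x) for x >= 0."""
    return RATE * math.exp(-RATE * x)


def integrate(f, a, b, steps=10_000):
    """Trapezoidal approximation of the area under f between a and b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


def cdf(x):
    """F(x) = P(X <= x): the area under the PDF from 0 to x."""
    return integrate(pdf, 0.0, x)


p_interval = integrate(pdf, 500.0, 1500.0)  # P(500 <= X <= 1500)

print(round(p_interval, 4))
print(round(cdf(1500.0) - cdf(500.0), 4))   # same area, obtained via the CDF
print(round(cdf(20_000.0), 4))              # close to 1 for large x
```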

Note that the integration limits depend on the region over which the density function, f(x), is defined. For example, for the life distributions considered in this reference, with the exception of the normal distribution, this range would be [0, +∞).

The mathematical relationship between the PDF and CDF is given by:

$$F(x) = \int_0^x f(s)\,ds$$

where s is a dummy integration variable.

Conversely:

$$f(x) = \frac{d\,F(x)}{dx}$$

Therefore, the CDF is the area under the probability density function up to a value of x. The total area under the PDF is always equal to 1, or mathematically:

$$\int_{-\infty}^{+\infty} f(x)\,dx = 1$$

Reliability Function

The reliability function can be derived using the previous definition of the cumulative distribution function:

$$F(x) = \int_0^x f(s)\,ds$$

From our definition of the CDF, the probability of an event occurring by time t is given by:

$$F(t) = \int_0^t f(s)\,ds$$

Or, one could equate this event to the probability of a unit failing by time t.

Since this function defines the probability of failure by a certain time, we could consider this the unreliability function. Subtracting this probability from 1 will give us the reliability function, one of the most important functions in life data analysis:

$$R(t) = 1 - F(t) = 1 - \int_0^t f(s)\,ds$$

The reliability function gives the probability of success of a unit undertaking a mission of a given time duration.
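As a small worked example of subtracting the unreliability from 1, the sketch below assumes an exponential life distribution, for which F(t) = 1 - exp(-rate * t). The rate and the mission durations are hypothetical.

```python
import math


def unreliability(t, rate):
    """F(t) for an exponential life distribution: probability of failure by time t."""
    return 1.0 - math.exp(-rate * t)


def reliability(t, rate):
    """R(t) = 1 - F(t): probability of completing a mission of duration t."""
    return 1.0 - unreliability(t, rate)


RATE = 0.0005  # hypothetical failure rate, per hour

for mission_hours in (100.0, 1000.0, 2000.0):
    print(f"{mission_hours:6.0f} h mission: R = {reliability(mission_hours, RATE):.3f}")
```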

Conditional Reliability Function

Conditional reliability is the probability of successfully completing another mission following the successful completion of a previous mission. The time of the previous mission and the time for the mission to be undertaken must be taken into account for conditional reliability calculations. The conditional reliability function is given by:

$$R(T, t) = \frac{R(T + t)}{R(T)}$$

where T is the duration of the previous (completed) mission and t is the duration of the mission to be undertaken.

Failure Rate Function

The failure rate function enables the determination of the number of failures occurring per unit time. Omitting the derivation, the failure rate is mathematically given as:

$$\lambda(t) = \frac{f(t)}{R(t)}$$

The failure rate function gives the instantaneous failure rate, also known as the hazard function. It is useful in characterizing the failure behavior of a component, determining maintenance crew allocation, planning for spares provisioning, etc. Failure rate is expressed as failures per unit time.
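Both functions can be sketched for a Weibull life distribution, chosen here only as an example of a distribution with an increasing failure rate. The shape and scale values and the mission times are hypothetical; the argument names follow the convention that T is the time already completed and t is the new mission.

```python
import math

BETA = 2.0    # hypothetical Weibull shape parameter (beta > 1: wear-out behaviour)
ETA = 1000.0  # hypothetical Weibull scale parameter, hours


def reliability(t):
    """Weibull reliability R(t) = exp(-(t / ETA) ** BETA)."""
    return math.exp(-((t / ETA) ** BETA))


def pdf(t):
    """Weibull density f(t) = (BETA / ETA) * (t / ETA) ** (BETA - 1) * R(t)."""
    return (BETA / ETA) * (t / ETA) ** (BETA - 1.0) * reliability(t)


def conditional_reliability(T, t):
    """Probability of surviving a further t hours, given survival to T hours."""
    return reliability(T + t) / reliability(T)


def failure_rate(t):
    """Instantaneous failure rate (hazard): f(t) / R(t)."""
    return pdf(t) / reliability(t)


print(f"new unit, 200 h mission      : {reliability(200.0):.3f}")
print(f"after 500 h, further 200 h   : {conditional_reliability(500.0, 200.0):.3f}")
print(f"failure rate at 500 h (per h): {failure_rate(500.0):.4f}")
```

With an increasing failure rate, the same 200 hour mission is less likely to succeed on a unit that has already run 500 hours than on a new one.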

Mean Life (MTTF)

The mean life function, which provides a measure of the average time of operation to failure, is given by:

$$\bar{T} = m = \int_0^{\infty} t\,f(t)\,dt$$

This is the expected or average time-to-failure and is denoted as the MTTF (Mean Time To Failure). The MTTF, even though an index of reliability performance, does not give any information on the failure distribution of the component in question when dealing with most lifetime distributions. Because vastly different distributions can have identical means, it is unwise to use the MTTF as the sole measure of the reliability of a component.

Median Life

Median life is the value of the random variable that has exactly one-half of the area under the PDF to its left and one-half to its right. It is the 50th percentile of the distribution. The median, denoted here by \tilde{T}, is obtained by solving the following equation:

$$\int_0^{\tilde{T}} f(t)\,dt = 0.5$$

(For individual data, the median is the midpoint value.)
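The mean and median life can be evaluated numerically as a check on the two integrals. The sketch assumes an exponential distribution with a hypothetical rate of 0.001 per hour, for which the exact values are 1/rate = 1000 h for the mean and ln(2)/rate (about 693 h) for the median. The finite upper limit stands in for infinity.

```python
import math

RATE = 0.001  # hypothetical constant failure rate, per hour


def pdf(t):
    """Exponential density f(t) = RATE * exp(-RATE * t)."""
    return RATE * math.exp(-RATE * t)


def cdf(t):
    """Closed-form exponential CDF, F(t) = 1 - exp(-RATE * t)."""
    return 1.0 - math.exp(-RATE * t)


def mean_life(upper=50_000.0, steps=200_000):
    """Trapezoidal approximation of the integral of t * f(t) from 0 to upper."""
    h = upper / steps
    total = 0.5 * upper * pdf(upper)  # the t = 0 end point contributes zero
    for i in range(1, steps):
        t = i * h
        total += t * pdf(t)
    return total * h


def median_life(lo=0.0, hi=50_000.0, tol=1e-6):
    """Bisection search for the time at which F(t) = 0.5."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(f"mean life  : {mean_life():.1f} h")
print(f"median life: {median_life():.1f} h")
```

The median is noticeably shorter than the mean for this skewed distribution, so half of the units fail well before the MTTF.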

Lifetime Distributions

A statistical distribution is fully described by its PDF.

The most commonly used functions in reliability engineering and life data analysis can be derived from it.

The reliability function, failure rate function, mean time function, and median life function can be determined directly from the PDF definition, or f(t).

Different distributions exist, such as the normal (Gaussian), exponential, Weibull, etc., and each has a predefined form of f(t) that can be found in many references. In fact, there are certain references that are devoted exclusively to different types of statistical distributions.

These distributions were formulated by statisticians, mathematicians and engineers to mathematically model or represent certain behavior. For example, the Weibull distribution was formulated by Waloddi Weibull and thus it bears his name. Some distributions tend to better represent life data and are most commonly called "lifetime distributions".

Monitoring:

Arrangements

Measured parameters

‘online’ and ‘offline’ monitoring

fixed and portable monitoring equipment

continuous and semi-continuous data recording

stress analysis

‘Online’ and ‘offline’ monitoring

Asset Management Condition Monitoring

Using time, energy and resources without wastage and making assets work harder have become core components of our daily lives. Whatever industry we operate in, the direct correlation between uptime and revenue is indisputable.

The challenge is to employ the most effective tools to optimize efficiency and ultimately revenue streams. So what has changed over the last 10-15 years? Quite simply, and probably unsurprisingly, technology has got better.

Off-Line (“Manual”): slow to warn, but most parameters

These are the typical laboratory services that everyone is familiar with. Turnaround has become quicker with time, sample prices have been static for a long while now, and additional services such as ferrography have become more widespread.

On-Site: timely warning, good options

These originally started as simple kits giving an indication of water and viscosity. They required manpower and gave basic results. Many tests are now available that approach or meet laboratory accuracy using semi-automated equipment. Equipment has also become more intuitive to use.

On-Site provides additional testing on the Component’s premises, allowing you to vet a sensor reading quickly and effectively. If more testing is needed you can forward a sample to an off-site lab. However, you may find that sufficient information is available from testing ‘locally’, so that you can take immediate action to avert trauma while still sending the sample out for further clarification.

On-Line: quickest warning, more limited parameters

Ten years ago this did not really exist for anything other than flow, temperature and pressure. On-Line is the future technological space to watch.

On-Line is the first-alert, primary defensive position, in terms of earliest possible warning.

Fixed and portable monitoring equipment

“Handheld data collectors and analyzers are now commonplace on non-critical or balance of plant machines on which permanent on-line vibration instrumentation cannot be economically justified.

The technician can collect data samples from a number of machines, then download the data into a computer where the analyst (and sometimes artificial intelligence) can examine the data for changes indicative of malfunctions and impending failures.

For larger, more critical machines where safety implications, production interruptions (so-called "downtime"), replacement parts, and other costs of failure can be appreciable (determined by the criticality index), a permanent monitoring system is typically employed rather than relying on periodic handheld data collection.

However, the diagnostic methods and tools available from either approach are generally the same.”

Sensodec 6S - for predictive maintenance in tissue machines

“Avoid unplanned downtime

Early failure detection helps avoid unplanned machine downtime, and effectively solves runnability problems. Rolls, bearings, gears and other drive train components produce low-level signals at an early stage when a fault is developing, but is not yet apparent to operators or maintenance personnel.

The Sensodec 6S system can immediately detect even these early signs of defects with sensitive high-quality vibration sensors designed for monitoring in a paper machinery environment. Fast measurement cycles, speed-adaptive alarm handling and advanced analysis tools make these signs fully apparent to the personnel. As a result, maintenance actions can be scheduled on time and for the right reasons.”

Case study: Norilsk Nickel

The overview shows a warning (yellow) at two points on one of the briquetting machines. The 24-month trends show that the vibration level in the 1000-3000 Hz range has slowly risen towards the red alarm limit line.

Metso specialist Aarno Keränen’s analysis of this spectrum showed why the vibration level has increased: there are clear bearing fault harmonics around 1300-1900 Hz.
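The warning and alarm logic in a trend display of this kind can be sketched as a comparison of the latest band level against two limits, with a straight-line fit giving a crude estimate of when the alarm limit would be reached. The limits, units and readings below are hypothetical and are not taken from the Norilsk Nickel case or from the Sensodec 6S system.

```python
WARNING_LIMIT = 4.0  # hypothetical yellow limit, mm/s RMS in the monitored band
ALARM_LIMIT = 7.0    # hypothetical red limit, mm/s RMS in the monitored band


def classify(level):
    """Map a band vibration level to a status colour band."""
    if level >= ALARM_LIMIT:
        return "alarm"
    if level >= WARNING_LIMIT:
        return "warning"
    return "ok"


def slope_per_sample(levels):
    """Least-squares slope of the trend, in level units per sample."""
    n = len(levels)
    mean_x = (n - 1) / 2.0
    mean_y = sum(levels) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(levels))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical 24 monthly readings that rise slowly and steadily.
trend = [2.0 + 0.1 * month for month in range(24)]

status = classify(trend[-1])
slope = slope_per_sample(trend)
months_to_alarm = (ALARM_LIMIT - trend[-1]) / slope  # crude linear extrapolation

print(status, f"{slope:.2f} per month", f"{months_to_alarm:.0f} months to alarm limit")
```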

Fixed and portable monitoring equipment

1. Fixed gas monitoring system

2. Environmental monitoring systems for the tunnelling industry

3. Fixed biogas and landfill gas analyser

4. Modular fixed biogas analyser for continuous monitoring

Continuous and semi-continuous data recording

Stress analysis

Stress-strength analysis has been used in mechanical component design. The probability of failure is based on the probability of stress exceeding strength. The following equation is used to calculate the expected probability of failure, F:

$$F = P[\text{Stress} \ge \text{Strength}] = \int_0^{\infty} f_{\text{Strength}}(x)\,R_{\text{Stress}}(x)\,dx$$

The expected probability of success or the expected reliability, R, is calculated as:

$$R = P[\text{Stress} \le \text{Strength}] = \int_0^{\infty} f_{\text{Stress}}(x)\,R_{\text{Strength}}(x)\,dx$$

The equations above assume that both stress and strength are in the positive domain. For general cases, the expected reliability can be calculated using the following equation:

$$R = P[X_1 \le X_2] = \frac{1}{F_1(U) - F_1(L)} \int_L^U f_1(x)\,R_2(x)\,dx$$

where:

$$L \le X_1 \le U; \quad X_1 = \text{Stress}; \quad X_2 = \text{Strength}$$

When U = infinity and L = 0, this equation becomes equal to the previous equation (i.e., the equation for the expected reliability R).
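The two positive-domain integrals can be evaluated numerically as a check. The sketch below assumes exponentially distributed stress and strength purely because the result then has a simple closed form, R = a / (a + b) for a stress rate a and a strength rate b; real components would normally call for other distributions. The mean values and the finite upper limit standing in for infinity are hypothetical.

```python
import math

STRESS_RATE = 1.0 / 100.0    # hypothetical: mean stress of 100 MPa
STRENGTH_RATE = 1.0 / 400.0  # hypothetical: mean strength of 400 MPa
UPPER = 20_000.0             # stands in for infinity; the integrands are negligible beyond it


def exp_pdf(x, rate):
    """Exponential density f(x) = rate * exp(-rate * x)."""
    return rate * math.exp(-rate * x)


def exp_survival(x, rate):
    """R(x) = P(X > x) for an exponential variable."""
    return math.exp(-rate * x)


def integrate(f, a, b, steps=100_000):
    """Trapezoidal approximation of the area under f between a and b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


# R = P[Stress <= Strength]: stress density times strength survival.
expected_reliability = integrate(
    lambda x: exp_pdf(x, STRESS_RATE) * exp_survival(x, STRENGTH_RATE), 0.0, UPPER)

# F = P[Stress >= Strength]: strength density times stress survival.
expected_failure = integrate(
    lambda x: exp_pdf(x, STRENGTH_RATE) * exp_survival(x, STRESS_RATE), 0.0, UPPER)

print(f"R = {expected_reliability:.4f}")
print(f"F = {expected_failure:.4f}")
print(f"R + F = {expected_reliability + expected_failure:.4f}")
```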