By: Carlos Mario Perez Jaramillo. STRATEGIES FOR LIFE CYCLE. RELIABILITY: CONCEPTS & TRENDS. ARTICLE

Reliability concepts and trends

DESCRIPTION

Companies seek to secure and improve their competitiveness through efforts, actions and decisions aimed at systems and assets that operate efficiently and effectively, satisfied customers and users, reduced risk and minimal environmental incidents. In today's world these elements, together with cost, are grouped under the word reliability. The article explains the concept of reliability and its trends.

    RELIABILITY: CONCEPTS & TRENDS

    Author: Carlos Mario Perez Jaramillo

    1. INTRODUCTION

    Maintenance management has evolved dynamically and continuously. Maintaining involves keeping pace with new technological developments and with new challenges in the industrial, service and agricultural sectors. These challenges relate to the need to optimize efficiency and efficacy in the production of goods and the provision of services, to improve quality, and to assure the integrity of people and their environment.

    These requirements have direct repercussions on maintenance management. They have driven an evolution in the definition of maintenance techniques and strategies, centered not only on interventions on equipment but also on integral management that, from a business and systemic perspective, ties together the strategic, administrative, technical and operative work of the maintenance area.

    Like every evolving process, maintenance has passed through a series of chronological stages, each characterized by specific methodologies. It is worth noting that reaching a more advanced stage does not necessarily mean totally abandoning former methodologies, even if they lose relevance to the newer ones.

    First generation

    The first generation covers the period up to World War II. In those days, companies were not highly mechanized, and downtime did not matter much. Assets were simple and mostly designed for a specific purpose, which made them reliable and easy to maintain. Complicated maintenance systems were not needed, and the need for qualified staff was lower than it is now. Some of its characteristics were:

    Repair in case of failure.

    Simple equipment.

    Second generation

    Things changed drastically during World War II. Wartime increased the demand for all types of products while the labor supply fell sharply, which led to the need for greater process automation. In the 1950s, assets of all types and of increasing complexity were built, and companies started depending on them.

    As this dependence grew, asset downtime became more evident and more important. This led to the idea that failures could and should be totally prevented, which gave birth to the concept of preventive maintenance. That is how, in the 1960s, maintenance came to be based fundamentally on complete overhauls of assets at fixed intervals.

    Maintenance costs started rising considerably relative to other operating costs; as a result, maintenance planning and scheduling systems were introduced to keep them under control. The main characteristics of this period were, and in some cases still are, the following:

    Periodic interventions.

    Cost reduction.

    Shutdown reduction.

    Systems for intervention planning and scheduling.

    Computerization.

    Emphasis on statistics.

    Maintenance carried out by specialties.

    Orientation towards implementation.

    Third generation

    Since the mid-1980s, the pace of change in companies has reached dizzying speeds due to the increasing demands of society, clients, employees and shareholders.

    The continuous growth of automation in every field, together with these high demands, showed that failures have an ever greater effect on business performance. This is clearly evidenced in the trend towards responsive, flexible and timely systems, in which optimum inventory levels are expected to mitigate the impact of any failure on operations by reducing downtime and the effects on quality and service.

    Mechanization and the growing complexity of business processes, along with greater risks in the handling, control and disposal of materials, make the consequences of failures for safety and the environment more harmful, especially in a society that is increasingly less tolerant.

    The evolution of processes and the dynamism of business changed the paradigms and basic beliefs about maintenance. It is clear that what matters nowadays is not doing a great deal, but doing it well. It is now recognized that there is little connection between the operating time of an asset and its likelihood of failure. Reliability is increasingly recognized as a matter of user satisfaction rather than a statistical problem and, likewise, results rather than control are emphasized as the overriding objective.

    Today there is intense and dynamic change in the concepts, strategies, methods and techniques applied to maintenance. Some characteristics of maintenance in this century are:

    Condition-based monitoring.

    Pursuit of reliability.

    Design for reliability and maintainability.

    Risk analysis.

    Cause/effect analysis.

    Modern decision-making systems.

    Integration of computing and automation systems.

    Integration with operations.

    Integrated human resources who implement, manage, direct and define strategies.

    Application of management models.

    Understanding of different failure modes.

    2. RELIABILITY CONCEPT

    The word reliability is used frequently now and, unfortunately, sometimes without regard to its context and real implications; there are several techniques for improving assets, and the word has been used to mount a constant advertising barrage around them. The best-known definition of reliability is: the probability that an asset or system operates without failing during a given period of time, under previously established operating conditions.

    Sometimes this concept is misused because of the particular meaning given to the word failure. For many, failure only means shutdown, so they build complex mathematical constructs to calculate shutdown probability, without taking into account that there is also a failure when the asset is inefficient, unsafe or costly, when it has a high reject rate, and when it contributes to a bad image.

    Another point to consider is that shutdowns may occur for many reasons, and mixing apples and oranges should be avoided: for example, shutdowns due to bearing lubrication with shutdowns due to errors in bearing mounting.

    Some have coined the term operational reliability for the capacity of an installation or system (made up of processes, technology and people) to fulfill its function within its design limits and under a specific operational context. The term operational does not set a clear boundary with the concept of reliability and, in some companies, it is limited to measuring indexes in order to control "reliability."

    For others, reliability is the set of theories and mathematical methods, operating practices and organizational procedures that, applied to the study of the laws of occurrence of failures, are directed at solving problems of prevention, estimation, optimization and improvement of survival probability, average life and the percentage of time the system operates properly. They state it in three ways:

    Percentage of desired operating time: sometimes stated as "the asset has 95% reliability over 720 hours of planned time," which creates confusion with the well-known and widely used concept of availability, or with the efficiency of the desired use of the system, equipment or asset.

    Mean Time Between Failures (MTBF): sometimes stated as "the mean time between failures for the equipment is 3,000 hours." The figure is an average (a measure of central tendency), and its value tries to describe the behavior of a data set or sample (times and failures). Some overvalue this term, spreading the idea that reliability improves if failure frequency is reduced over a time interval. (Note that failure here means stoppage.)

    Failure rate: sometimes stated as the percentage of failures in the total number of elements, or as the number of failures during a given time t. For example: batteries have a 1% failure rate during a one-year guarantee period.
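The three ways of stating reliability listed above can be written as three small functions. This is only an illustrative sketch: the function names and the input values (36 hours of downtime, 9,000 operating hours with 3 failures, 10 failed batteries out of 1,000) are assumptions chosen to reproduce the 95%, 3,000-hour and 1% figures quoted in the text, not data from the article.

```python
def operating_time_percent(planned_hours, downtime_hours):
    """Share of the planned time in which the asset actually operated."""
    return 100.0 * (planned_hours - downtime_hours) / planned_hours


def mean_time_between_failures(operating_hours, failure_count):
    """Average operating time per failure (an average, not a guarantee)."""
    if failure_count <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return operating_hours / failure_count


def failure_rate_percent(failed_units, total_units):
    """Percentage of units that failed within the observation period."""
    return 100.0 * failed_units / total_units


# Hypothetical inputs chosen to match the figures quoted in the text.
uptime_pct = operating_time_percent(planned_hours=720, downtime_hours=36)
mtbf_hours = mean_time_between_failures(operating_hours=9000, failure_count=3)
battery_rate = failure_rate_percent(failed_units=10, total_units=1000)

print(f"Operating time: {uptime_pct:.1f}% of planned hours")
print(f"MTBF: {mtbf_hours:.0f} hours")
print(f"Failure rate: {battery_rate:.1f}% in the guarantee period")
```

Note that the three figures answer different questions, and none of them says what counted as a failure; that depends entirely on how the inputs were recorded.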

    2.1 Is the Answer Statistics?

    A very common discussion is whether or not reliability is a statistical matter. Managing data is undeniably useful in the management and direction of a company, but it is necessary to distinguish whether statistics is used to manage real data and observe its behavior, or to support forecasts and estimates that sometimes border on daring and irresponsible speculation. In maintenance, data of every type, quantity and quality are used, and the discussion about large volumes of information should focus on their responsible use, not on their existence.

    A real case of prudent use of information came from the US aviation industry in the 1960s, which carried out a study showing that different elements fail in different ways, and that even a single element may fail in several ways. Put more simply: replacing an item because "it is going to fail," replacing it "because it failed," and replacing it because an interval was reached "before it failed" are not the same thing. Specifically, an item that failed due to wear is not the same as one that failed due to improper installation or one damaged by an accident.

    Some authors hold up mathematical postulates as absolute truth about failures and deny that the failure counts being analyzed mix effects with causes; they also deny that having failure data to analyze means accepting that failures occur: the more data, the more failures.

    The most common conception of reliability is the average time between failure occurrences. This statement has several implications to consider. The first is to remember that the figure is an average, and that the failure concept is associated more with shutdowns than with nonconformities such as spills, non-conforming product or increased risk, which are failures too.

    The datum as such is an average figure; there is a great difference between probability and reality, and this generates much confusion. A probable failure is a possible failure and a failure that has occurred is a real failure; no calculation or algorithm guarantees that it will occur at a given point. For example, suppose a calculation gives a 75% mathematical probability of failure for a component that has lasted 1,200 days on average in a defined operational context; this does not mean that it is not going to fail, or that failure is imminent. Moreover, if another component has a 95% probability, it may still fail later than the first, and this does not mean that the maintenance strategy is necessarily different, especially when causes have been mixed (failure due to lubrication or to a mounting error).

    Therefore, figures that are calculated, desired, estimated, arbitrarily fixed, imagined, recommended by manuals or even invented may carry errors, inaccuracies and deficiencies that require responsible handling. For example, a boiler has the following failure causes:

    Figure 1. Boiler example

    Analyzing the failures gives the following results:

    No. | Failure cause | Effect | Generates shutdown?
    --- | --- | --- | ---
    1 | Dirty casing. | Increases fuel consumption. | No
    2 | Release valve is stuck closed. | If pressure increases, steam would not be released, increasing risk. | No
    3 | Combustion gas relief is partially plugged. | Increases fuel consumption. Non-compliance with environmental legislation. | No
    4 | Fuel system is badly adjusted. | Increases fuel consumption. Increases gas emissions; non-compliance with environmental legislation. | No
    5 | Bearing of the burner fan is worn out. | Combustion air is not supplied and the boiler turns off. | Yes
    6 | Steam release piping is corroded. | Piping ruptures, there is a leak and someone may get burned. There is associated damage. | No
    7 | Pump motor power cable has been struck. | Pump stops, water is not supplied and the boiler turns off. | Yes
    8 | Water pump motor thermistor fails closed. | On an overload, the motor would burn out. | No
    9 | Temperature sensor signal is forced (bridged). | If temperature increases, the boiler would not turn off, increasing risk. | No
    10 | Dirty boiler. | Company standards are not met. | No

    It is clear that not all failures affect availability, so they would not be counted in the MTBF as it is commonly calculated.
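The boiler table can be held as a small data structure to show how many failure modes a shutdown-only view leaves out. This is a minimal sketch: the class and function names are hypothetical, the cause descriptions are paraphrased from the table, and the effect column is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    number: int
    cause: str
    causes_shutdown: bool


# Rows paraphrased from the boiler table; only modes 5 and 7 stop the boiler.
BOILER_FAILURE_MODES = [
    FailureMode(1, "Dirty casing", False),
    FailureMode(2, "Release valve stuck closed", False),
    FailureMode(3, "Combustion gas relief partially plugged", False),
    FailureMode(4, "Fuel system badly adjusted", False),
    FailureMode(5, "Burner fan bearing worn out", True),
    FailureMode(6, "Steam release piping corroded", False),
    FailureMode(7, "Pump motor power cable struck", True),
    FailureMode(8, "Water pump motor thermistor fails closed", False),
    FailureMode(9, "Temperature sensor signal bridged", False),
    FailureMode(10, "Dirty boiler", False),
]


def shutdown_modes(modes):
    """Return only the failure modes that stop the asset."""
    return [mode for mode in modes if mode.causes_shutdown]


stopping = shutdown_modes(BOILER_FAILURE_MODES)
not_counted = len(BOILER_FAILURE_MODES) - len(stopping)

print(f"Failure modes recorded: {len(BOILER_FAILURE_MODES)}")
print(f"Modes that cause a shutdown: {[m.number for m in stopping]}")
print(f"Modes invisible to a shutdown-based MTBF: {not_counted}")
```

Recording failures at this level is what makes it possible to choose, later, which definition of failure a calculation should use.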

    Returning to the boiler failures:

    Assume that the 10 failure modes occur within 720 hours (1 month).

    Only 2 of these failure modes produce a shutdown, generating a total of 20 shutdown hours.

    According to the traditional failure concept, the MTBF for the boiler would be: MTBF = (720 hours - 20 hours) / 2 failures = 350 hours.

    If the company's MTBF goal is 300 hours, the goal would be met.

    The probability that the boiler does not fail before the MTBF goal would be calculated this way: e^(-300/350) = 42.4%

    Thus, analyzing the numbers alone may give some people peace of mind; however, an asset may fail in other ways, such as:

    Non-compliance with cleaning standards;

    Inoperative protections;

    Situations harmful to safety and the environment;

    Greater fuel consumption, that is, greater cost.

    So, if the asset does not perform all its required functions as desired, it is considered to have failed.

    Moreover, if the real failure concept is applied, the calculations are different:

    MTBF = (720 hours - 20 hours) / 10 failures = 70 hours.

    Since the company's MTBF goal is 300 hours, the goal would not be met.

    The probability that the boiler does not fail (with the current failure concept) before the MTBF goal would be calculated this way: Probability = e^(-300/70) = 1.4%
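The two boiler calculations can be reproduced side by side. The sketch below assumes, as the article's formula implies, a constant failure rate, so that the probability of running t hours without failure is exp(-t / MTBF); the function names are illustrative. The second probability is evaluated at the 300-hour goal with the 70-hour MTBF, that is exp(-300/70), which is the expression that yields the figure of roughly 1.4% quoted in the text. Small differences from the quoted percentages come down to rounding.

```python
import math


def mtbf(period_hours, downtime_hours, failure_count):
    """Operating hours in the period divided by the number of failures."""
    return (period_hours - downtime_hours) / failure_count


def survival_probability(target_hours, mtbf_hours):
    """P(no failure before target_hours), assuming a constant failure rate."""
    return math.exp(-target_hours / mtbf_hours)


PERIOD_HOURS = 720     # one month, from the example
DOWNTIME_HOURS = 20    # total shutdown hours, from the example
GOAL_HOURS = 300       # company MTBF goal, from the example

# Traditional view: only the 2 shutdown-causing failure modes count.
mtbf_shutdowns = mtbf(PERIOD_HOURS, DOWNTIME_HOURS, 2)
p_shutdowns = survival_probability(GOAL_HOURS, mtbf_shutdowns)

# Full view: all 10 failure modes count as failures.
mtbf_all = mtbf(PERIOD_HOURS, DOWNTIME_HOURS, 10)
p_all = survival_probability(GOAL_HOURS, mtbf_all)

print(f"Shutdowns only: MTBF = {mtbf_shutdowns:.0f} h, P = {p_shutdowns:.1%}")
print(f"All failures:   MTBF = {mtbf_all:.0f} h, P = {p_all:.1%}")
```

The same month of operation meets the 300-hour goal under one definition of failure and misses it badly under the other, which is the point of the example: the result depends more on what is counted than on the arithmetic.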

    Very few companies have data on MTBF; what they really have is the mean time between shutdowns.

    Very few companies record failure occurrences at the failure-mode level; some others do, but their information systems make MTBF calculation difficult.

    Conclusion: The time spent on the mathematical calculation of MTBF or failure probability would be better used to define failure consequences and an action plan to mitigate them.

    Data paradigm

    Business processes usually have few assets of a single type, and these tend to be put into operation in successive groups rather than all at the same time.

    Sample sizes tend to be too small for statistical procedures to be really convincing.

    Assets are in a continuous state of evolution and modification, in response to new operational requirements and in an attempt to eliminate failures that have serious consequences or are too expensive to prevent. This means that the time an asset spends in any one configuration is relatively short, so the database is very small and constantly changing.

    Because of asset complexity and diversity, most companies do not find it easy to develop a complete analytical description of reliability characteristics, since many functional failures are caused not by 2 or 3 but by 2 or 3 dozen failure modes.

    It is easy to plot the incidence of functional failures, but statistically it is difficult to separate and describe the failure pattern that applies to each failure mode.

    Data-gathering policies differ from one organization to another. An item may be removed in one place because it is failing, while in another it is removed because it has failed; similar differences are caused by different performance expectations.
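The point about small sample sizes can be illustrated with a short simulation. Everything in this sketch is an assumption made for illustration: failure times are drawn from an exponential distribution with a "true" MTBF of 350 hours, and the sample sizes (3 versus 100 observed failures) and the number of trials are arbitrary. It is not a method proposed in the article.

```python
import random
import statistics


def mtbf_estimate_spread(true_mtbf, failures_per_sample, trials, seed=0):
    """Standard deviation of MTBF estimates across many simulated samples.

    Each trial draws `failures_per_sample` times between failures from an
    exponential distribution and estimates the MTBF as their mean.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        times = [rng.expovariate(1.0 / true_mtbf)
                 for _ in range(failures_per_sample)]
        estimates.append(statistics.mean(times))
    return statistics.pstdev(estimates)


TRUE_MTBF_HOURS = 350.0  # hypothetical value, used only for the simulation

spread_small = mtbf_estimate_spread(TRUE_MTBF_HOURS, failures_per_sample=3,
                                    trials=2000)
spread_large = mtbf_estimate_spread(TRUE_MTBF_HOURS, failures_per_sample=100,
                                    trials=2000)

print(f"Spread of MTBF estimates with 3 failures:   {spread_small:.0f} h")
print(f"Spread of MTBF estimates with 100 failures: {spread_large:.0f} h")
```

With only a handful of failures per asset type, the estimated MTBF can land far from the underlying value purely by chance, even before the problems of changing configurations and mixed failure modes are considered.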

    Resnikoff's Riddle

    Gathering the information considered most necessary by those who design the maintenance policy, namely information on critical failures, is unacceptable in principle and is evidence of the failure of the maintenance plan. This is because critical failures entail potential loss of life, and no rate of loss of life is acceptable to any organization as the price of the failure information needed to design a maintenance policy (H. L. Resnikoff).

    2.2 How to Improve Reliability?

    Currently, the issue facing maintenance staff is not only learning what the new techniques are, but also being able to decide which are useful for their own companies.

    If properly chosen and used in an integrated manner, they will likely improve maintenance practices and results while optimizing cost. If improperly chosen, they will create new problems, which in turn will worsen existing ones.

    Companies want to assure their future by defining strategies and by planning and carrying out activities that lead to objectives related to availability, quality, safety, environmental integrity and cost-effectiveness that are satisfactory for owners, the community, employees and clients. To meet these objectives, companies have to overcome, control or set challenges such as:

    Improving reliability: reducing failures in a time interval, understanding as failure any event affecting asset performance.

    Reducing risk: applying measures to minimize the circumstances that affect the possibility of loss.

    Improving profitability: the capacity to generate profit or benefit; in other words, the relationship between profits and the investment or resources used to achieve them.

    Using best practices: implementing methods, tools, methodologies, procedures and processes that companies have used continuously and consistently and that have contributed efficiently to achieving the best results in the performance of their assets.

    Meeting legislation: assuring compliance with the set of laws and rules established by a State regarding a specific subject or issue.

    Supporting growth: the growth of profits or of the value of the goods and services produced by a company; it is also related to certain indicators that together show an organization's progress.

    Assuring safety: seeing to the implementation of measures and actions that provide protection against specific risks.

    Assuring sustainability: considering long-term consequences to make sure that the decisions made take future requirements and obligations into account.

    Leadership: the influence exerted upon people that encourages them to work for a common purpose and make the right decisions.

    Improving productivity: increasing the ratio between the quantity of goods and services produced and the quantity of resources required.

    Reducing vulnerability: reducing the susceptibility of any system to the impact of a hazard.

    Complying with environmental rules: adhering to the law governing the environment that shapes life, including the natural, social and cultural elements existing in a specific place and/or time.

    The challenges mentioned above place greater demands on maintenance activities and actions. New requirements and technological expectations have broadened the tasks, responsibilities and requirements regarding strategies, plans, programs, response times, competences, and accuracy in the execution and organization of maintenance tasks. In response to these challenges, the following strategies and processes exist to manage assets:

    Prospection

    Marketing Plans

    Purchase

    Planning

    Reliability

    Maintenance

    Risk Analysis

    Good Governance and Social Responsibility

    Some companies have gone beyond statistics, reviewed their internal practices and benchmarked themselves against outstanding companies. These organizations concluded that it is impossible to speak of reliability as a single figure; it is therefore necessary to use several measurements as fundamental indicators of the inputs and outputs of the processes.

    The need for reliability in installations is as old as humanity, but the growing relevance of environmental and safety issues has undeniably led some markets and niches to change direction, due to:

    More complex products.

    High pressure to reduce costs and be able to compete.

    A greater number of operational functions carried out by equipment and machines.

    Requirements to reduce product weight and volume while maintaining or improving performance and safety standards.

    Requirements to increase or reduce the operating life of products in order to increase or reduce demand.

    Greater difficulty in carrying out maintenance interventions due to increased asset utilization.

    A trend towards software, electronic, pneumatic or hydraulic components, whose wear behavior differs from that of components that fail as a function of age.

    Current legislation that is increasingly demanding and less tolerant.

    Greater impact of shutdowns and operational losses on sales and products.

    Growing demands on the quality parameters of services and products.

    New conceptions of company image and commitment. Commitment to reduce the risk of loss of human life.

    Requirements to reduce the risk of spills or of equipment effects on the environment.

    Successful companies have made a concerted effort to incorporate their maintenance improvement strategies into other corporate initiatives, avoiding the syndrome of the "campaign of the moment," the "crest of the wave" or the "promotion of the month." The best indication that this effort pays off is that it turns into a durable and stable policy.

    These new demands drive the use of strategies that have been applied successfully in many companies, strengthening overall performance, optimizing costs, reducing risks, improving corporate image, lowering environmental impact and consolidating business results. Among the most successful and consistent tools in use are:

    Orientation towards reliability as a global concept, instead of cost reduction or downtime reduction.

    Carrying out diagnoses, audits and evaluations of maintenance practices.

    Defining and using a strategic development plan that describes and establishes a corporate vision of reliability and good asset performance.

    Extensive use of performance measurements with appropriate goals. Using benchmarking to identify opportunities for and barriers to improvement.

    Sharing knowledge and achieving consensus among areas that are typically separate, using cross-functional, multi-specialty teams that work together for a specific period, analyzing problems and opportunities with a common result in mind.

    The challenges posed by the new generations of maintenance may be met by building a step-by-step scheme for passing through defined and verifiable stages. The decision to take this path comes from recognizing the strengths and weaknesses of maintenance management, in order to form a view of the desired state, expressed as objectives. These goals are justifiable if they relate to the company's results and not only to the operating results of maintenance.

    Maintenance is not the only area responsible for reliability. It requires responsible designs, consistent and trained operators, professional purchasers and stable policies. In other words, several responsible actors take part during the life cycle.

    Maintenance is considered a joint responsibility, an action more than a function: maintenance starts with equipment selection and continues with installation; it rests on correct operation and good maintenance, with support from purchasing and inventories.

    For this reason, those responsible for whether or not assets are reliable are:

    Design.

    Selection.

    Manufacturing.

    Suppliers.

    Installation.

    Environment.

    Operation.

    Maintenance.

    Stores.

    Purchasing.

    Conclusion: Improving MTBF is not enough.

    3. PRESSURE FOR RESULTS

    The commitment that sometimes becomes the most heroic is reducing maintenance costs, generally motivated by pressure to increase productivity and reduce costs. Many companies specialize in not spending and have become heavy users of a broad range of tools and methodologies that have emerged in recent years, generally focused on not increasing maintenance costs.

    This situation has led to some poor decisions that bring benefits in the short term but are rarely sustainable in the long term and may be harmful. We have to be wary of the concept of direct cost reduction, because it aims at saving and is closer to cutting; it is therefore necessary to know the real impact on business activity of using the appropriate resources at the right time. For that reason, any change should aim at improving the company, not just maintenance.

    The results, then, are not isolated; they are integral for the entire organization, achieving:

    Optimum maintenance, operation and energy consumption costs.

    Better asset utilization times: quantity and results.

    Reduction of accidents and incidents.

    Less environmental impact.

    Improvement of the working environment.

    Growth in the percentage of satisfied customers.

    In conclusion, a reliable asset is effective, efficient, profitable and safe; it does not harm the environment and produces few nonconformities. These results achieve direct cost reduction at the business level, instead of focusing on economizing. Better practices may be applied regardless of the organizational structure and the type of company.

    Reliability is a powerful tool for providing competitive advantages that may increase profitability, safety, and customer and user satisfaction and respect. Although the strategies and activities for achieving improvement may be very clear, and action points are easily listed and prioritized, the final transformation towards a corporate culture of reliability takes time. The larger the company, the longer cultural change takes. Substantial changes may be made within 5 years, and results start to be noticed within the first two or three years.

    People naturally show some resistance to change, and the way to manage it is different in each company. When employees assimilate and accept the new schemes rather than resisting and doubting their usefulness, change is being achieved.

    Optimization initiatives usually lose momentum. One reason is that people get used to the change in relationships and look for new cues on how to act. If a communication plan has not been implemented as part of the change, those carrying out the work have time to settle into the new function and find no reason to start something new.

    Some maintenance decisions have encouraged the continuity of traditional processes even when operation and/or production and customer requirements change; instead of encouraging change, they resist it. It is more than common for maintenance to respond to changes in operating requirements reactively rather than proactively.

    The responsibility of a real maintenance strategist is to accelerate evolution, involving employees in constant progress. Organizational structures are changing and shrinking in size, but not in relevance. Those responsible for making decisions on systems, equipment and assets need to understand fully their responsibility and the implications of the decisions they make, so that established provisions can be properly defended. In other words, maintenance strategy should be fully auditable at the level of indicators, methods, tools and processes.

    AUTHOR

    Carlos Mario Perez Jaramillo

    Mechanical Engineer. Information Systems Specialist Engineer. Asset and Projects Management Specialist. MBA in Project Management and Physical Asset Administration.

    RCM2 Professional of the Aladon Network. Certified as Endorsed Assessor and Endorsed Trainer of the Institute of Asset Management.

    Maintenance Management and Direction Adviser and Consultant. He has developed and supported the application of asset management models in food, mining, oil, petrochemical, textile, utilities, training and power companies.

    He has taught RCM, failure analysis, maintenance planning and scheduling, costs, maintenance management indicators, life cycle cost analysis and the PAS 55 standard for optimum asset management.

    He has worked in RCM dissemination, training and application, maintenance management and asset management in companies in Ecuador, Peru, Spain, Chile, Argentina, Cuba, México, Panama, Costa Rica, El Salvador, Guatemala and Colombia.