HAZARD IDENTIFICATION, RISK ASSESSMENT AND CONTROL MEASURES FOR MAJOR HAZARD FACILITIES BOOKLET 4

TABLE OF CONTENTS

1 Who is this booklet for?
2 What does the booklet aim to do?
3 Hazard identification, risk assessment and control measures introduction
4 Hazard identification
4.1 The importance of getting the hazard identification right
4.2 Features of HAZID
4.3 Hazard identification processes and techniques
4.4 Review, revision and typical problems
5 Risk assessment
5.1 Risk assessment aims
5.2 Examples of risk assessment methods
6 Control measures
6.1 Introduction
6.2 What is a control measure?
6.3 Understanding control measures
6.4 Selecting and rejecting control measures
6.5 Additional or alternative control measures
6.6 Defining performance indicators for control measures
6.7 Critical operating parameters
6.8 Involving employees in control measures
6.9 Control measures within the safety report and SMS
6.10 Reviewing and revising control measures
6.11 SMS - A suggested combination of key elements

1 Who is this booklet for?

This booklet has been produced for employers in control of a facility that has been classified by Comcare as a major hazard facility under Part 9 of the Occupational Health and Safety (Safety Standards) Regulations 1994.

2 What does the booklet aim to do?

This booklet provides guidance on key principles and issues to be taken into account when conducting an effective hazard identification and risk assessment at a major hazard facility (MHF) that is subject to Commonwealth legislation. It also describes the type and nature of control measures that the employer in control of an MHF should consider. These processes should be consistent with the safety management system.

This booklet is a guide to the intent of the Regulations, but employers in control of an MHF should refer to the Regulations for specific requirements. In addition, this booklet refers to hazard identification and risk assessment techniques that may be appropriate for MHF purposes (and described in the facility SMS; refer to Booklet 3). These may not be the only techniques in use at a facility; other techniques include job safety analysis and task analysis.

3 Hazard identification, risk assessment and control measures introduction

Hazard identification (HAZID) and risk assessment involve a critical sequence of information gathering and the application of a decision-making process. These assist in discovering what could possibly cause a major accident (hazard identification), how likely it is that a major accident would occur and the potential consequences (risk assessment), and what options there are for preventing and mitigating a major accident (control measures). These activities should also assist in improving operations and productivity and in reducing the occurrence of incidents and near misses.

There are many different techniques for carrying out hazard identification and risk assessment at an MHF. The techniques vary in complexity and should match the circumstances of the MHF. Collaboration between management and staff is fundamental to achieving effective and efficient hazard identification and risk assessment processes.

4 Hazard identification

The Regulations require the employer, in consultation with employees, to identify:

a) all reasonably foreseeable hazards at the MHF that may cause a major accident; and

b) the kinds of major accidents that may occur at the MHF, the likelihood of a major accident occurring and the likely consequences of a major accident.

4.1 The importance of getting the hazard identification right

Major accidents by their nature are rare events, which may be beyond the experience of many employers. These accidents tend to be low frequency, high consequence events as illustrated in Figure 1 below. However, the circumstances or conditions that could lead to a major accident may already be present, and the risks of such incidents should be proactively identified and managed.

Figure 1: HAZID focus on rare events

HAZID must address potentially rare events and situations to ensure that the full range of major accidents and their causes is identified. To achieve this, employers should:

a) identify and challenge assumptions and existing norms of design and operation to test whether they may contain weaknesses;

b) think beyond the immediate experience at the specific MHF;

c) recognise that existing controls and procedures cannot always be guaranteed to work as expected; and

d) learn lessons from similar organisations and businesses.

Some significant challenges in carrying out an effective HAZID are:

a) substantial time is needed to identify all hazards and potential major accidents and to understand the complex circumstances that typify major accidents;

b) the need for a combination of expertise in HAZID techniques, knowledge of the facility and systematic tools;

c) the possibility that a combination of different HAZID techniques may be needed, depending on the nature of the facility to ensure that the full range of factors (e.g. human and engineering) is properly considered;

d) obtaining information on HAZID from a range of sources and opinions; and

e) ensuring objectivity during the HAZID process.

Comcare must be satisfied that hazard identification has been comprehensive and the risks are eliminated or controlled before granting a licence or certificate of compliance to operate an MHF.

4.2 Features of HAZID

Comcare's expectations and some important features of HAZID

Comcare will expect:

a) a clear method statement or description of the HAZID process, defining when it was conducted, how it was planned and prepared, who was involved and what tools and resources were employed;

b) that the HAZID process was based on a comprehensive and accurate description of the facility, including all necessary diagrams, process information, existing conditions and modifications; and

c) that the overall HAZID process did not rely solely on data that was historical or reactive and that employers ensured that predictive methods were also used.

The HAZID process must identify hazards that could cause a potential major accident for the full range of operational modes, including normal operations, start-up and shutdown, as well as potential upset, emergency or abnormal conditions. Employers should also reassess their HAZID whenever a significant change in operations has occurred or a new substance has been introduced. They should also consider incidents that have occurred elsewhere at similar facilities, including within the same industry and in other industries. Refer to the Safety Report and Safety Report Outline guidance material (booklet 4) for the definition of significant change.

Involving the right people

An effective HAZID process is dependent upon having the right people participating in the process. The employer should:

a) involve Health and Safety Representatives (HSRs) in selecting staff to participate in HAZID;

b) involve HSRs in determining if the HAZID techniques are suitable for the staff selected;

c) ensure participants understand the relevant HAZID methods so that they can fully participate in the process; and

d) be alert for hazards that can be revealed by the combination of knowledge from specialists in different work groups.

Features of a HAZID process

The following aims to demonstrate the main features of a HAZID. Although the HAZID process chosen by employers must suit the circumstances of the MHF, the features noted below are generally applicable to all processes.

Preparation: Prior to commencement of the HAZID, the following steps should be completed:

a) Agreement on the purpose and scope of the HAZID;

b) Appropriate personnel and HAZID tools identified;

c) Sufficient resources and time allocated;

d) Clearly defined reporting processes and study boundaries according to the purpose and scope;

e) Appropriate background information and studies collated, such as historical incident data;

f) An agreed interpretation of major accident that is consistent with the Regulations and relevant to the facility.

System description: At the commencement of the HAZID, the complete system of assets, materials, human activities and process operations within the boundaries of the study should be clearly defined and understood, taking account of the original design, subsequent changes and current conditions. Typically, the system should be divided into distinct separate components or sections to enable manageable quantities of information to be handled at each stage.

Systematic evaluation and recording: The HAZID should move progressively through the system, applying the HAZID tools to each component or section in turn. All identified hazards and incidents should be recorded in some way. (See Figure 16 in this booklet for some examples of how hazard registers may be configured.) A checklist of guidewords, questions or issues should be considered at each stage.

Some key questions and issues could be:

a) What is the design intent, what are the broad ranges of activities to be conducted, what is the condition of equipment, and what limitations apply to activities and operations?

b) What are the critical operating parameters? What process operations occur, and how could they deviate from the design intent or critical operating parameters? This should consider routine and abnormal operations, start-up, shutdown and process upsets.

c) What materials are present? Are they a potential source of major accidents in their own right? Could they cause an accident involving another material? Could two or more materials interact with each other to create additional hazards?

d) What operations, construction or maintenance activities occur that could cause or contribute towards hazards or accidents? How could these activities go wrong? Could other hazardous activities be introduced into this section by error or by work in neighbouring sections of the facility?

e) Could other materials, not normally or not intended to be present, be introduced into the process?

f) What equipment within the section could fail or be impacted by internal or external hazardous events? What are the possible events?

g) What could happen in this section to create additional hazards, e.g. temporary storage or road tankers?

h) Could a particular section of the facility interact with other sections (e.g. adjacent equipment, an upstream or downstream process, or something sharing a service) in such a way as to cause an accident?
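The systematic evaluation and recording step described above can be illustrated with a minimal sketch of a hazard register. The field names, section names and example hazards are hypothetical and are not taken from the Regulations or from this booklet; a real register would follow the format agreed during preparation. The sketch records every identified hazard without screening and reports which sections of the facility do not yet have any entries.

```python
from dataclasses import dataclass, field


@dataclass
class HazardEntry:
    """One row of a hypothetical hazard register."""
    section: str            # part of the facility being studied
    hazard: str             # what could go wrong
    operating_modes: list   # e.g. normal, start-up, shutdown, upset
    assumptions: list = field(default_factory=list)


@dataclass
class HazardRegister:
    entries: list = field(default_factory=list)

    def record(self, entry):
        # Every identified hazard is kept; nothing is screened out here.
        self.entries.append(entry)

    def sections_without_entries(self, all_sections):
        # Sections with no recorded hazards may not have been studied yet.
        covered = {e.section for e in self.entries}
        return [s for s in all_sections if s not in covered]


# Hypothetical division of the facility into sections.
sections = ["tank farm", "loading bay", "reactor unit"]

register = HazardRegister()
register.record(HazardEntry(
    section="loading bay",
    hazard="road tanker overfill during transfer",
    operating_modes=["normal"],
    assumptions=["high-level alarm may not work as expected"],
))
register.record(HazardEntry(
    section="reactor unit",
    hazard="deviation from a critical operating parameter",
    operating_modes=["start-up"],
))

unreviewed = register.sections_without_entries(sections)
print(unreviewed)
```

Keeping assumptions alongside each entry is one possible way of preserving information discussed during the study so that it can be revisited later.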

Past, present and future hazards

To identify all hazards, the HAZID will need to consider past, present and future conditions, hazards and potential incidents. Past incidents, at the MHF or similar facilities, provide an indication of what has gone wrong in the past and what could go wrong in the future.

A wide range of hazards and potential incidents will be present in the facility. New hazards and incidents could be created in the future as a result of planned or unplanned changes. The management of change process described in the SMS should identify new conditions during the planning of modifications or new activities. This should then trigger further HAZID studies and risk assessments, with the identification of control measures as appropriate. Figure 2 below illustrates the range of tools that can be used to identify past, present and future hazards.

Figure 2: Past, present and future hazards

4.3 Hazard identification processes and techniques

HAZID techniques

The flowchart below summarises all the steps needed in a HAZID process and how those steps relate to one another.

Figure 3: HAZID process

Examples of HAZID Techniques

HAZOP

Hazard and Operability Study (HAZOP) is a highly structured and detailed technique, developed primarily for application to chemical process systems. A HAZOP can generate a comprehensive understanding of the possible deviations from design intent that may occur. However, HAZOP is less suitable for identification of hazards not related to process operations, such as mechanical integrity failures, procedural errors, or external events. HAZOP also tends to identify hazards specific to the section being assessed, while hazards related to the interactions between different sections may not be identified. Therefore, HAZOP may need to be combined with other hazard identification methods, or a modified form of HAZOP used, to overcome these limitations.
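As an illustration of how a HAZOP explores deviations from design intent, the sketch below pairs guidewords with process parameters to generate prompts for one section of a facility. The guidewords, parameters, exclusion list and section name are assumptions chosen for illustration; a study team would agree its own lists during preparation.

```python
from itertools import product

# Assumed, illustrative guidewords and parameters.
GUIDEWORDS = ["no", "more", "less", "reverse"]
PARAMETERS = ["flow", "pressure", "temperature"]

# Combinations treated as not meaningful in this sketch (assumed list).
NOT_MEANINGFUL = {
    ("no", "temperature"),
    ("reverse", "pressure"),
    ("reverse", "temperature"),
}


def deviations(section):
    """Return the deviation prompts to discuss for one section."""
    prompts = []
    for guideword, parameter in product(GUIDEWORDS, PARAMETERS):
        if (guideword, parameter) in NOT_MEANINGFUL:
            continue
        prompts.append(f"{section}: {guideword} {parameter}")
    return prompts


prompts = deviations("feed line to reactor")
for prompt in prompts:
    print(prompt)
```

Because the prompts are generated per section, the sketch also shows the limitation noted above: interactions between different sections are not covered unless they are considered separately.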

Equipment failure case definition

This method is a systematic approach to defining loss of containment events for all equipment within the study boundary. Process flow and equipment diagrams are studied systematically, and all equipment is assigned appropriate loss of containment scenarios, such as pinhole leaks, according to design, construction and operation. This form of hazard identification may be necessary for many major hazard facilities, to avoid missing potential scenarios, but is not sufficient on its own because it does not consider specific causes or circumstances. Therefore, this technique should only be used in combination with other techniques for MHF purposes.

Checklists

There are many established hazard checklists which can be used to guide the identification of hazards. Checklists offer straightforward and effective ways of ensuring that basic types of events are considered. Checklists may not be sufficient on their own, as they may not cover all types of hazards, particularly facility-specific hazards, and could also suppress lateral thinking. Again, this technique should only be used in combination with other techniques for MHF purposes.

What-If Techniques

This is typically a combination of the above techniques, often using a prepared set of what-if questions on potential deviations and upsets in the facility. This approach is broader but less detailed than HAZOP.

Brainstorming

Brainstorming is typically an unstructured or partially structured group process, which can be effective at identifying obscure hazards that may be overlooked by the more systematic methods.

Task Analysis

This is a technique developed to address human factors, procedural errors and man-machine interface issues. This type of hazard identification is useful for identifying potential problems relating to procedural failures, human resources, human errors, fault recognition, alarm response, etc.

Task Analysis can be applied to specific jobs such as lifting operations, moving equipment off-line or to specific working environments such as control rooms. Task Analysis is particularly useful for looking at areas of a facility where there is a low fault-tolerance, or where human error can easily take a plant out of its safe operating envelope.

Failure Modes Effects Analysis (FMEA)

FMEA is a process for hazard identification where all conceivable failure modes of components or features of a system are considered in turn and undesired outcomes are analysed. This technique is quite specialised and may require expert assistance.

Failure Modes Effects and Criticality Analysis (FMECA)

FMECA is a highly structured technique that is usually applied to a complex item of mechanical or electrical equipment. The overall system is described as a set of sub-systems and each of these as a set of smaller sub-systems down to component level. Individual system, sub-system and component failures are systematically analysed to identify their causes (which are failures at the next lower-level system), and to determine their possible outcomes, which are potential causes of failure in the next higher-level system. This technique is quite specialised and usually requires expert assistance.
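The hierarchical breakdown used in FMECA can be sketched as a simple parent-child structure. The equipment names below are hypothetical. The sketch traces how a component failure could propagate upward to the next higher-level system, and lists the lower-level items whose failure could cause a given item to fail. It does not attempt the criticality ranking itself.

```python
# Hypothetical equipment breakdown: each item maps to the next
# higher-level system it belongs to.
PARENT = {
    "bearing": "pump",
    "seal": "pump",
    "pump": "cooling water system",
    "cooling water system": "reactor unit",
}


def propagation_path(failed_item):
    """Follow a failure upward through each higher-level system."""
    path = [failed_item]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def possible_causes(item):
    """Lower-level items whose failure could cause this item to fail."""
    return sorted(child for child, parent in PARENT.items() if parent == item)


seal_path = propagation_path("seal")
pump_causes = possible_causes("pump")
print(" -> ".join(seal_path))
print(pump_causes)
```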

Fault Tree and Event Tree Analysis

Fault Trees describe loss of containment events in terms of the combinations of underlying failures that can cause them, such as a control system upset combined with failure of alarm or shutdown and relief systems. Event trees describe the possible outcomes of a hazardous event, in terms of the failure or success of control measures such as isolation and fire-fighting systems. Fault tree and event tree analysis is time-consuming, and it may not be practicable to use these methods for more than a small number of incidents.
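The arithmetic behind fault trees and event trees can be shown with a short worked sketch. All probabilities and frequencies below are hypothetical, the events are assumed to be independent, and the tree structures are simplified for illustration only. In the fault tree, a control system upset occurs if either of two underlying faults occurs (an OR gate), and loss of containment occurs only if the upset coincides with failure of both the alarm and the relief system (an AND gate). In the event tree, a release is followed through the success or failure of isolation and fire-fighting.

```python
def or_gate(*probabilities):
    """Probability that at least one independent input event occurs."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= (1.0 - p)
    return 1.0 - none_occur


def and_gate(*probabilities):
    """Probability that all independent input events occur together."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


# Fault tree with hypothetical probabilities.
p_upset = or_gate(0.1, 0.2)            # sensor fault OR valve fault
p_top = and_gate(p_upset, 0.1, 0.05)   # upset AND alarm fails AND relief fails


def event_tree(initiating_frequency, p_isolation_works, p_firefighting_works):
    """Split an initiating event frequency across the possible outcomes."""
    f_not_isolated = initiating_frequency * (1.0 - p_isolation_works)
    return {
        "isolated": initiating_frequency * p_isolation_works,
        "not isolated, fire controlled": f_not_isolated * p_firefighting_works,
        "not isolated, fire uncontrolled": f_not_isolated * (1.0 - p_firefighting_works),
    }


# Event tree with a hypothetical release frequency of 0.01 per year.
outcomes = event_tree(0.01, p_isolation_works=0.9, p_firefighting_works=0.8)
for outcome, frequency in outcomes.items():
    print(outcome, frequency)
```

The outcome frequencies of the event tree always sum to the initiating frequency, which provides a simple check that no branch has been omitted.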

Historical records of incidents

Databases of incidents and near misses that have occurred are a useful reference because they give a very clear indication of how incidents can occur. Employers should consider site history, company history, industry history and possibly even wider sources of historical information for this purpose.

Examples of major accidents and the role of multiple factors in those accidents

Texas City, USA, 2005. An explosion at a large refinery killed 15 workers and injured over 170 others. Equipment upgrades and SMS elements including process safety information, communications and training were targeted for improvements following the incident. The total cost of the plant upgrade was reported to be 1 billion dollars over 5 years.

Longford, Victoria, 1998. Two workers were killed and eight others injured in an explosion at a gas processing plant. As a result, many elements of the SMS were targeted for improvements including process safety information and communication of critical safety information.

Pasadena, USA, 1989. A fire and a series of explosions at a refinery complex resulted in 23 fatalities. Inadequate and unofficial isolation procedures, together with human error induced by poor ergonomics, played a role in causing the accident. The loss of life and scale of damage were increased due to poor plant layout and subsequent damage to fire-fighting systems.

Piper Alpha, UK, 1988. The accident was triggered by a small leak in a condensate pump system, which by itself would most likely have had only minor consequences. However, in combination with failures in management systems, design and equipment, the event resulted in the loss of 167 lives and destruction of the entire platform.

Bhopal, India, 1984. Half a million people were exposed and over 20,000 have died to date as a result of a release of methyl-isocyanate via a vent stack. A range of systems and equipment had been malfunctioning or taken out of service over a period leading up to the disaster. These included a safety system for scrubbing tank vent releases, which had been disabled because the plant was shut down and was not considered to be a risk. After the plant's construction, a large shanty town had grown up around it, but this had not led to any recognition of changes in risk.

Flixborough, UK, 1974. A modification was made to a bypass for one of a series of reactor vessels. Due to the urgency of the work, and the fact that there had been significant organisational change, the modification was designed and constructed inadequately. The bypass failed, releasing a large cloud of cyclohexane, which exploded and killed 28 persons. Not only was the new hazard not considered during the modification, it was not recognised during subsequent operations even though the bypass was seen to move as process pressure rose and fell.

Human factors and the nature of hazards - an overview

Human factors are defined as the interactions between people and the organisation, systems and equipment they interface with. Consideration of the effect of human factors on risk is sometimes defined as "fitting the work to the employees" or "the science and practice of designing systems to fit people". The subject of human factors is concerned with understanding the capacities and limitations of people in their jobs, and using this understanding to eliminate or control the effects of human weaknesses and exploit human strengths. In this context, human weaknesses may include limitations on information processing capabilities, while human strengths may include adaptability.

Some facilities may find that human factors are a major contributor to the nature of hazards at the facility. It is expected that in this case there will be thorough evaluation of the causes of this. Analysis of human errors may require procedural reviews or human factors analysis.

From the major hazard control perspective, the role of people is critical to the safe operation of major hazard facilities and should be addressed in the safety report. Accordingly, employers should incorporate human factors into relevant aspects of the operation of major hazard facilities, including the SMS, hazard identification, risk assessment, control measures, the safety role of employees and contractors, emergency planning and training.

Human factor HAZID techniques

When identifying human factor hazards, the employer should examine the foreseeable major accidents and consider how human factors may contribute to or cause those accidents. It is important that employees with experience in the specific area being studied participate, to help identify the hazards that may be present.

This should involve identifying the ways in which "just being human" can influence the performance of individual tasks and roles. For example, a person may operate the wrong valve by mistake, or may inadvertently use the incorrect procedure. Examples of "being human" include memory limitations, visual acuity limitations, information processing problems, distraction, fatigue, decision-making biased by experience and knowledge, rigid problem solving and susceptibility to following group behaviour. These can all adversely influence human actions and decisions, leading to the possible creation of hazards.

It is important to acknowledge that human factors can introduce hazards at all levels, from performing individual operations and maintenance tasks, through to designing the facilities, writing the procedures and even setting standards and policy for the organisation. Human factor hazards can also be latent hazards, in that they will not be revealed until particular circumstances combine to make them obvious. For example, the transfer of a key operational supervisor into a non-operational project role may lead to a deficiency of knowledge within the operations team; this deficiency may not become apparent until an emergency occurs. Accordingly, while it is necessary to involve first-line operations and maintenance personnel in the hazard identification, it is also necessary to consider wider issues than the day-to-day roles and activities of these persons.

When considering the type and level of human factors input that is needed in hazard identification, employers should consider their specific circumstances, and in particular, the amount of reliance they place on human actions and decisions in the prevention and control of major accidents. Cases where detailed consideration of human factors might be appropriate include a process plant that requires employee action to prevent or control emergency situations or a dangerous goods warehouse that relies heavily on procedural controls to ensure correct segregation of goods.

In addition to calling upon the necessary range of operations personnel to take part in the hazard identification, it may also be appropriate to use persons having specialist human factors knowledge. This specialist knowledge may be essential if human factors hazards can influence critical safety controls.

Human factor HAZID techniques are evolving and are based on methods developed from engineering HAZID methods. They follow the same principles and can be conducted in conjunction with an engineering HAZID.

Task analysis

An important set of human factors techniques, which can be used in all areas of human factors consideration, is a set of methods collectively called task analysis. Task analysis is not only used in HAZID but is also a tool for risk assessment and development of control measures to accommodate human factors.

Task analysis is used to study what a person, or team, is required to do, in terms of actions and/or mental processes to achieve a system goal. The information used in and derived from a task analysis will depend on the technique used and the objective of the analysis.

Examples of task analysis techniques include:

a) Questionnaires and structured interviews;

b) Link analysis;

c) Hierarchical Task Analysis (HTA);

d) Timeline analysis;

e) Task simulation;

f) Work safety analysis;

g) Event/fault tree analysis; and

h) Human Factors HAZOP.

HTA is one of the most commonly used task analysis techniques. It is used to systematically analyse a task or series of tasks. The outcomes of the HTA will depend on the reasons for its use. For example, if a new control room is being designed for a process facility, the design layout and equipment available in the control room should be tested to ensure that it is appropriate for handling all foreseeable operations (start-up, normal, abnormal). If HTA is used to assess workload, the information, processing and time requirements of the task, or tasks, should be tested.
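The hierarchical structure that HTA produces can be sketched as a tree of tasks and subtasks. The example goal and steps below are hypothetical and are not drawn from any particular facility. The sketch flattens the hierarchy into numbered steps, so that each step can then be examined in turn, for example for workload or for possible errors.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A task in the hierarchy; a task with no subtasks is a single step."""
    name: str
    subtasks: list = field(default_factory=list)


def numbered_steps(task, prefix="0"):
    """Flatten the hierarchy into (number, name) pairs, goal first."""
    steps = [(prefix, task.name)]
    for index, sub in enumerate(task.subtasks, start=1):
        child_prefix = str(index) if prefix == "0" else f"{prefix}.{index}"
        steps.extend(numbered_steps(sub, child_prefix))
    return steps


# Hypothetical task breakdown.
goal = Task("Bring standby pump on line", [
    Task("Prepare pump", [
        Task("Check suction valve open"),
        Task("Check discharge valve closed"),
    ]),
    Task("Start pump"),
    Task("Confirm discharge pressure"),
])

steps = numbered_steps(goal)
for number, name in steps:
    print(number, name)
```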

4.4 Review, revision and typical problems

Review and revision of hazard identification

The Regulations require that the safety report must be reviewed in particular circumstances and reviewing the HAZID is part of this requirement. Consequently, the overall HAZID process should include regular and pro-active reviews to identify any new hazards and to refresh knowledge of existing hazards. The HAZID process should include a range of triggers for individual HAZID studies. These triggers may be a scheduled program of reviews or could arise from information gathered during regular safety meetings.

Use of the HAZID results

The value of a high-quality HAZID, and the major commitment of resources required, demand effective use of the HAZID. However, the cost of a major accident will be far more significant than the cost of the process that could prevent it.

The hazard register should facilitate the process of revisiting and updating the knowledge of hazards and incidents within the facility. The register should communicate clear linkages between the employer's processes for hazard identification, risk assessment and the selection or rejection of control measures.

Lateral thinking and realism in HAZID

History clearly shows that major accidents often arise from a set of complex conditions or coincidental events, which may include multiple failures in both equipment and procedures. The initial conditions that lead to these major accidents might have been relatively minor problems, but they developed into major accidents because other problems arose concurrently.

Companies and individuals can exhibit corporate blindness when identifying or reporting hazards by assuming that the systems and procedures in place only ever function as intended. It is possible that other safeguards or pure luck will prevent a major accident from occurring, but the employer should consider the possibility that several events may at some time combine to cause a serious incident. These situations arise in reality, and should be allowed for in HAZID.

Use of worst-case scenario in HAZID

Employers should include all foreseeable hazards in the HAZID and transparently analyse each event within the risk assessment. The employer should consider all possibilities, on a case-by-case basis, and document any assumptions regarding the definition of worst case. By then applying risk assessment techniques, the employer should be in a position to define the worst-case scenario for the facility.

Definitions of what constitutes the worst-case scenario can be difficult to assess. The worst-case scenario is sometimes incorrectly deemed to be the largest event within the capacity of the on-site protection systems, simply on the basis that any event worse than this cannot be planned for.

However, such events are merely the worst that has been allowed for during design and are not necessarily the worst that can occur. Most examples of major accidents given above clearly exceeded the design basis, which is why they resulted in such serious outcomes. Both the design events and the true worst case events are required to be considered.

It should also be recognised that the worst case in terms of the distance of impact might not be the worst case in terms of potential consequences. It may be necessary to consider both these consequences. The worst-case scenario for one area of a facility may not be the same as that for another area of the same facility. This will depend on a large number of factors such as materials normally or not normally present, extreme process conditions, isolation systems that may fail, the proximity and the layout of vessels and the presence of personnel. Employers should consider all available information, including historical incident records, in deriving the worst-case scenario.

Common mistakes in HAZID

A common mistake in hazard identification is to screen out or discard some incidents because they are perceived to be extremely unlikely or of low consequence. Incidents may be unlikely or of low consequence only as a result of the control measures in place. However, a key purpose of the HAZID and risk assessment process is to identify critical control measures and to determine their effectiveness. Therefore, it would be self-defeating to disregard incidents because control measures are in place. All potential major accidents should be recorded during the HAZID. Any screening, analysis and assessment of the hazards, their consequences or the effectiveness of the controls should occur during the risk assessment. This separation ensures transparency and the ability to audit the entire process, which is necessary for the justification of the adequacy of control measures. In practice, the HAZID and risk assessment processes are often combined into one workshop (or similar), which makes this distinction difficult to maintain.

Other potential pitfalls that should be avoided include:

a) being too generic in identification of hazards and potential major accidents;

b) limiting the hazard identification to the immediate cause of potential major accidents, without determining the fundamental underlying cause;

c) attempting to conduct the risk assessment and assessment of control measures during the hazard identification. Except for very simple facilities, it is almost certainly better to separate hazard identification from the subsequent stages; this helps ensure a systematic HAZID process. However, in practice, the HAZID and risk assessment are often combined so that the risk assessors are aware of the hazards identified and any other information discussed during the HAZID;

d) widening the scope to include too great a range of incident types, such as all occupational health and safety issues. If these issues must be considered they should only be the issues relevant to controlling the risk of a major accident;

e) carrying out the HAZID with incomplete or inaccurate facility descriptive information;

f) proceeding with the study without first having developed, agreed and planned the approach and the method of recording. A pilot study on a selected area of the facility may be beneficial in deciding on an effective HAZID approach;

g) failing to be comprehensive and systematic with respect to the activities, operations and possible different states of each part of the facility;

h) failing to record important information discussed during the HAZID, e.g. assumptions, uncertainties or debated issues and gaps in knowledge;

i) allowing the hazard identification workshops to be dominated by individual persons or groups within the organisation; and

j) where HAZIDs are conducted across several sessions - failing to review previous session findings or remind participants of the scope and objectives.

5 Risk assessment

5.1 Risk assessment aims

The aims of risk assessment are to:

a) provide a basis for identifying, evaluating, defining and justifying the selection of control measures for eliminating or reducing risk, and to therefore lay the foundations for demonstrating the adequacy of the standards of safety proposed for the facility;

b) provide the employer and employees with sufficient objective knowledge, awareness and understanding of the risks of major accidents at the facility;

c) capture knowledge of risk of a major accident at the facility so it can be managed, disseminated and maintained. The management of knowledge generated in the risk assessment will also greatly assist the efficient development of a safety report for the facility, for example by handling assumptions and actions arising; and

d) provide practical effect to the employer's safety report philosophy. For example, if the employer intends to base the safety report largely on the facility's compliance with specific codes or standards, the risk assessment should address corresponding issues such as the basis of the codes and standards and their applicability to the facility.

Creating and transferring knowledge using risk assessment

Understanding the risks of major accidents may be accompanied by uncertainty, but the risk assessment will be successful if it reduces this uncertainty to an acceptable or tolerable level. The results of risk assessment must be captured and disseminated to those who require the knowledge, to enable the uncertainty of the entire organisation to be reduced to an acceptable level.

An effective risk assessment should involve the processes of debating, analysing, sharing views and generating information and knowledge on the risk of major accidents and their means of control. It should include the active participation of employees or contractors who influence safe operations. This is the definitive criterion for participation of employees and contractors. Formal roles or assumptions about someone's role are irrelevant to producing a high quality risk assessment.

There are no limits to the activity or sources that can be used to understand the facility and its risks. For example, it could incorporate information from incident investigations, discussions during safety meetings about hazards and ways of controlling them, condition monitoring programs, analysis of process behaviour, evaluation of process trends or deviations from critical operating parameters, procedure reviews or flood or weather records.

Risk assessment results are a useful input to training needs analyses. For example, if a procedure or task carried out by employees is an important control measure that can fail if there is inadequate employee knowledge, then the risk assessment should identify that risk and the need for that knowledge. It can then be a tool to assist in imparting that knowledge to employees, either by direct involvement in their defined roles or as a source of information to develop instruction or training sessions.

The above example illustrates the importance of knowledge management in the process of complying with the Regulations: the HAZID and risk assessment generate knowledge, and this knowledge should be captured and implemented via the safety management system and the processes of consulting, informing, instructing and training. It must be recognised, however, that assessing safety is not the same as managing safety, and risk assessment is only worthwhile if it informs and improves the decision-making and implementation processes. Reducing uncertainty should balance the improvement in the effectiveness of decisions against the cost of additional assessment.

The SMS should instigate risk assessments to maintain a comprehensive and up-to-date understanding of risk as the facility changes. Risk assessment may be triggered via the management of change process at the facility. The results of risk assessments would be expected to feed into the development of the SMS.

Identifying and evaluating control measures using risk assessment

The risk assessment should consider a range of control measures and provide a basis for the selection of control measures. Risk assessment can be a useful tool, which can save or optimise the use of resources, by determining the effectiveness and costs of different control options, improving the decision-making process and providing a basis for allocating resources in the most effective manner. The risk assessment process should provide the following in relation to control measures:

a) identification or clarification of existing and potential control measure options;

b) evaluation of effects of control measures on risk levels;

c) basis for selection or rejection of control measures and the associated justification of adequacy; and

d) basis for defining performance indicators for selected control measures.

The range of control measures that should be considered in the risk assessment is addressed later in this guidance material. The risk assessment should evaluate the range of control measures in terms of viability and effectiveness to provide a basis for selection or rejection of each control measure:

a) Viability relates to the practicability of implementing the control measure within the facility; and

b) Effectiveness relates to the effect of the control measure on the level of risk. For example, the reliability and availability of control measures influence the likelihood of an incident occurring, while the functionality and survivability of the control measures during the incident influence the consequences.

Specific studies may be carried out as part of the risk assessment to evaluate these issues for individual or groups of control measures.

By evaluating options for control measures within the risk assessment the employer should be able to determine what additional benefit is gained from introducing additional or alternative control measures. If these do not result in any reduction in risk, the basis for rejection is apparent. The employer should look for gaps in the existing control regime, where the introduction of further control measures may be necessary.

Using the risk assessment to set performance indicators

The risk assessment should generate information useful to the setting of performance indicators for the adopted control measures. For example:

a) matching performance indicators with the control measures: control measures with more rigorous performance standards are more likely to be associated with the high consequence hazards than with the lower consequence hazards;

b) control measure functionality, including reliability, reflecting the scale of incidents being controlled;

c) reliability, or number of control measures, reflecting the likelihood of the corresponding incidents.

Overall framework and principles for risk assessment

There are fundamental questions most forms of risk assessment attempt to address to ensure the risk assessment is comprehensive and systematic (see Figure 4).

Figure 4: Basic questions within risk assessment

The risk assessment should use assessment methods (quantitative or qualitative or both) that suit the hazards being considered. This means that the tools employed must be selected according to the nature of the risk. A tool that does not address any variability or uncertainty in the nature of the hazards and incidents identified can fail to generate the necessary understanding and provide no basis for differentiating between control measures.

There is no single tool able to meet all the requirements for risk assessment, and all tools have limitations and weaknesses. For example:

If the dominant contributor to a major accident relates to aging of equipment and associated mechanical integrity problems, then an analysis of mechanical integrity, corrosion rates, breakdown data, reliability and inspection/testing/maintenance issues may be necessary to develop the required understanding. In such a case, a quantitative risk assessment (QRA), which is usually based on generic data, may not provide the necessary information or lead to effective solutions.

Similarly, if a facility employer has identified human error as a key risk driver, then a Task Analysis, Human Reliability Analysis, or detailed analysis of the operating procedures may be appropriate. Analysis of equipment condition and reliability in this case would probably not be effective.

For many facilities, there may be several types of assessment required. In the interests of efficiency, it is desirable to clearly identify the types of detailed study required, before following any particular route. Two basic tools can assist this process: preliminary/qualitative risk assessments and hazard or risk ranking. There are plenty of examples of both types of tool, but they all have a common purpose - to determine the nature of the risk in terms of the basic causes, likelihood, consequences and controls.

Where it is clear that the employer has insufficient knowledge of causes or likelihood, detailed studies may be needed. A preliminary evaluation should point towards the types of detailed study required. An appropriate ranking methodology allows the key areas to be identified and prioritised. It enables the employer to determine if the gaps in knowledge correspond to what may be major risk contributors.

Priority should be given to those areas where it is obvious there is likely to be a high risk and there are also gaps in knowledge about the things giving rise to the risk. Some iteration may be required where the ranking of key areas is revisited following detailed assessment, to see if any hazards have increased in rank and now require more detailed study. Figure 5 aims to illustrate the relationship of preliminary evaluation, ranking and detailed studies.

Figure 5: Relationship between preliminary evaluations, ranking and detailed studies

The above discussion introduces the concept of a "tiered approach" that is frequently used in risk assessment. If a simple technique generates the information required by the Regulations and also generates sufficient understanding of the risk and the options for its control, further risk assessment may not be necessary. However, if substantial uncertainty remains, or the employer wishes to look at a range of options in greater detail, then further effort is justified and more detailed tools may be desirable. In general, greater assessment effort should result in a more quantitative, accurate and robust understanding, thereby allowing a more transparent and rational basis for decision-making.

The key to the tiered approach is that, at each stage, the employer should compare the potential cost of increasing the detail of the assessment against the benefit that further assessment may give. In this context, the benefit may be a higher level of knowledge of the hazards and the risk, or may be a better understanding of the optimum means of controlling the risk (described in Figure 6).

Figure 6: Tiered Approach to Risk Assessment

Some facilities may use a semi-quantitative risk assessment where qualitative brainstorming sessions of staff are combined with quantitative studies and information. If data and knowledge have been collected previously about the MHF and remain relevant to risk assessments under the MHF Regulations, it is acceptable to make use of that data and knowledge.

Cumulative assessment of hazards in the risk assessment

The risk assessment must consider hazards cumulatively and individually. The effects of several hazards occurring in combination must be considered. Many major accidents have been caused by the realisation of a number of hazards concurrently. For any accident there may be several independent hazards or combinations of hazards, each of which could lead to that accident, and several control measures which may be particularly critical because they may influence one or more of those hazards. The risk assessment should give an understanding of the total likelihood of each accident and the relative importance of each separate hazard and control measure.

The potential for escalation of major accidents, and the consequences of this, which may be greater than those of an event in isolation, need to be considered along with the consequences and their effects (e.g. number of injuries, extent of property damage).

A facility may have a range of major hazards that could lead to potential major accidents. Both the highest risk incidents and the overall profile of risks from all incidents must be determined, so that the risk can be shown to be adequately controlled. In cases where a large number of different hazards and potential accidents exist, the cumulative risk may be significant even if the risk arising from each event is low.

The "bow tie" diagram (Figure 7) is similar to a combined fault and event tree that shows how a range of causes, controls and consequences can be linked together and associated with each major accident scenario. Cumulative consideration of the hazards can be seen as the overall evaluation of interactions between different parts of a single bow tie or consideration of a range of bow ties together.

Cumulative consideration of hazards enables the employer to assess the overall risk picture for the facility and to understand how different causes and events can combine to lead to an accident. It also enables the key causes and controls for the risks to be identified and evaluated in more detail if required.

Figure 7: Examples of bow tie diagram

Uncertainty and assumptions in risk assessment

Handling uncertainties and assumptions in risk assessment is difficult but necessary. A complete and accurate understanding of the risk of major accidents is unlikely. Uncertainty cannot be eliminated and it will be necessary to make assumptions in some areas. The key is to record and test assumptions wherever possible and to explicitly recognise where the main gaps or uncertainties exist.

Where important assumptions are made (e.g. assuming a control measure has a high level of effectiveness), the employer should ensure these assumptions are consistent throughout the safety report, and are implemented, tested and confirmed in practice.

Employers should consider using sensitivity analysis to test the robustness of the risk assessment results against variations within the key areas of uncertainty. This may involve changing key assumptions and determining if the changes in results would affect any decisions that have been made based on those results. Where sensitivity analysis or consideration of the gaps in knowledge indicates a significant level of uncertainty or poor confidence in the resulting decisions, further detailed assessment may be required.
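One simple form of sensitivity analysis changes one assumption at a time and checks whether the resulting decision would differ. The following Python sketch is illustrative only: the model (an initiating frequency multiplied by the probability that a control measure fails), the numerical values and the decision threshold are all hypothetical and are not taken from the Regulations or this booklet.

```python
def annual_frequency(initiating_frequency, control_failure_probability):
    """Hypothetical model: frequency of the incident per year."""
    return initiating_frequency * control_failure_probability


def decide(frequency, threshold=2e-3):
    """Hypothetical decision rule based on an assumed threshold."""
    return "further study" if frequency > threshold else "no further study"


def sensitivity(model, base_inputs, variations, decide):
    """Vary one input at a time and list the cases where the decision changes."""
    base_decision = decide(model(**base_inputs))
    changed = []
    for name, values in variations.items():
        for value in values:
            inputs = {**base_inputs, name: value}
            if decide(model(**inputs)) != base_decision:
                changed.append((name, value))
    return base_decision, changed


# Hypothetical base case and the ranges over which each assumption is tested.
base_inputs = {"initiating_frequency": 0.1, "control_failure_probability": 0.01}
variations = {
    "initiating_frequency": [0.05, 0.15],
    "control_failure_probability": [0.005, 0.05],
}

base_decision, changed = sensitivity(annual_frequency, base_inputs, variations, decide)
print("Base decision:", base_decision)
print("Assumptions that change the decision:", changed)
```

In this sketch only the assumed effectiveness of the control measure changes the decision, which suggests that assumption is the one that would most need to be tested and confirmed in practice.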

Is quantitative risk assessment required?

QRA is only one tool within the risk assessment toolkit. It involves calculation of the frequency and consequence of a range of hazardous events and numeric combination of these to estimate the risk in a numerical format to allow direct comparison of results, including a measurement of risk to nearby neighbours and society as a whole.

However, it is not a requirement of the Regulations that a QRA is performed. The methods used must be appropriate to the hazards, the nature of the options available, the facility safety philosophy and the decisions that are required from the risk assessment. A decision not to perform QRA does not preclude the employer from carrying out specific quantitative calculations regarding frequencies, consequences or other aspects of the risk.

If QRA assists understanding of the risks and the appropriateness of control measure options, then it should be considered as a tool to support the risk assessment. However, QRA may not provide all the answers and is typically best suited to differentiating design, layout, location and engineering options. QRA is generally considered to be most useful for quantifying off-site risk; however, it can be useful in assessing on-site risk if sufficient detail and an understanding of the reality of people's response to accidents are included.

5.2 Examples of risk assessment methods

To recap, risk assessment must involve an investigation and analysis of the identified hazards and major accidents, so as to provide an understanding (and documentation) of the:

a) nature of each hazard and major accident

b) likelihood of each hazard causing a major accident

c) magnitude of each major accident

d) severity of consequences of each major accident to persons on-site and off-site

e) range of control measures available to control each major accident

f) effectiveness and viability of control measures for each major accident

g) individual and cumulative effects of hazards

Each of these aspects is discussed below, with examples to illustrate the concepts, together with discussion of simplified, overall, preliminary qualitative methods of risk assessment that may be used to focus the detail of the assessment onto the high-risk cases.

This section is not intended to be a detailed or comprehensive description of risk assessment methods. The methods and figures shown below are purely selective examples to illustrate the approaches and are not a recommendation for any specific application.

Preliminary or qualitative risk assessment

The most common form of preliminary or qualitative risk assessment is a risk matrix, which assesses individual incidents in terms of categories, e.g. low, medium and high, according to their expected consequence and likelihood. An example of the risk matrix approach is provided in Australian Standard AS4360 (Risk Management). Risk matrices may need to be tailored to the requirements of the MHF Regulations as they are not typically designed for very low frequency events. Risk nomograms provide an alternative approach, although they are little used in practice (see Figures 8 & 9). These methods can provide a relatively rapid understanding of the risk profile of the facility and can be based on judgment or be refined using more detailed information. However, the understanding gained will be relatively coarse, and the methods have limitations. For example, it is not easy to incorporate the effects of risk reduction measures within the risk matrix, and neither method is easy to use to assess cumulative hazards, in particular at facilities where a large number of hazards exist. To assess such issues, methods that are more detailed are likely to be required.

When using risk matrices or nomograms it is important to define individual incidents or scenarios on a consistent basis so that comparable events are assessed. For example, a risk matrix could be used to evaluate specific outcomes of incidents or individual incidents or a specific cause of an incident. The likelihoods and consequences would be defined very differently depending on which definition of incident is used. Hence, the employer should decide how incidents will be defined and use the same approach for all incidents. A balance must be struck between defining events in sufficient detail and defining too many events to manage in the assessment.
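As an illustration only, a risk matrix can be represented as a simple lookup table. In the following Python sketch the category names, the 4 x 4 layout and the rating in each cell are hypothetical; they are not taken from Figure 8 or from AS4360, and an employer would substitute the matrix adopted for the facility.

```python
# Hypothetical category names, ordered from lowest to highest.
LIKELIHOOD = ["rare", "unlikely", "possible", "likely"]
CONSEQUENCE = ["minor", "serious", "major", "catastrophic"]

# Hypothetical matrix: one row per likelihood, one column per consequence.
MATRIX = [
    ["low", "low", "medium", "medium"],      # rare
    ["low", "medium", "medium", "high"],     # unlikely
    ["medium", "medium", "high", "high"],    # possible
    ["medium", "high", "high", "high"],      # likely
]


def risk_category(likelihood, consequence):
    """Look up the risk category for one consistently defined incident."""
    row = LIKELIHOOD.index(likelihood)
    column = CONSEQUENCE.index(consequence)
    return MATRIX[row][column]


# Hypothetical incidents, all defined on the same basis.
incidents = {
    "tank overfill": ("possible", "serious"),
    "vessel rupture": ("rare", "catastrophic"),
}
for name, (likelihood, consequence) in incidents.items():
    print(name, "->", risk_category(likelihood, consequence))
```

Because the lookup accepts only the defined categories, an incident described on a different basis has to be restated in the same terms before it can be assessed, which supports the consistency discussed above.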

Figure 8: Example of a risk matrix

Figure 9: Example of a risk nomogram

Ranking methods

Most forms of preliminary risk assessment can be used as a basis for ranking different incidents to establish their approximate order of importance. In the risk matrix example, a simple scoring system can be introduced to represent the combined effect of likelihood and consequence. For example, the highest-ranking incident is m.a.7 (i.e. major accident number 7) with a score or risk index of 16, closely followed by m.a.12 with a risk index of 15. The sum of the risk indices for all incidents is 76; therefore, the contribution of incident m.a.7 is 16/76 or about 21% of the cumulative risk. Note that the risk index on the matrix is a multiplication of the numbers assigned to the rows and columns NOT an addition.
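The scoring arithmetic described above can be sketched in a few lines of Python. The likelihood and consequence ratings below are hypothetical and were chosen only so that the indices reproduce the figures quoted in the text (16 for m.a.7, 15 for m.a.12 and a total of 76); they are not read from Figure 8.

```python
# Hypothetical (likelihood rating, consequence rating) pairs for each incident.
ratings = {
    "m.a.1": (3, 3),
    "m.a.2": (2, 4),
    "m.a.3": (4, 3),
    "m.a.4": (2, 2),
    "m.a.5": (3, 4),
    "m.a.7": (4, 4),
    "m.a.12": (3, 5),
}


def risk_indices(ratings):
    """Risk index = likelihood rating x consequence rating (a product, not a sum)."""
    return {name: likelihood * consequence
            for name, (likelihood, consequence) in ratings.items()}


def ranked_contributions(indices):
    """Return (incident, index, share of total), highest index first."""
    total = sum(indices.values())
    ranked = sorted(indices.items(), key=lambda item: item[1], reverse=True)
    return [(name, index, index / total) for name, index in ranked]


indices = risk_indices(ratings)
ranking = ranked_contributions(indices)
for name, index, share in ranking:
    print(f"{name}: risk index {index}, {share:.0%} of the summed indices")
```

The share is only a relative indicator: it depends on the rating scales chosen and on how many incidents are defined.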

An extension of the above scoring approach is to define a range of specific factors that affect the likelihood or consequences of each incident. For each factor, each incident may be given a score such as from 1 to 5 or a simple rating such as low, medium or high based on specific, established criteria. The scores for each incident are then added to give an overall likelihood, consequence or risk score for each incident.
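A minimal Python sketch of this factor-based scoring follows. The factor names and the conversion of low, medium and high to the scores 1, 3 and 5 are assumptions made for illustration; the booklet leaves the factors and criteria for the employer to establish.

```python
# Hypothetical conversion of qualitative ratings to scores.
RATING_SCORES = {"low": 1, "medium": 3, "high": 5}


def overall_score(factor_ratings):
    """Add the scores for every factor; numeric scores (1 to 5) are used as given."""
    total = 0
    for factor, rating in factor_ratings.items():
        score = RATING_SCORES[rating] if isinstance(rating, str) else rating
        if not 1 <= score <= 5:
            raise ValueError(f"score for {factor!r} must be between 1 and 5")
        total += score
    return total


# Hypothetical likelihood factors for one incident.
likelihood_factors = {
    "equipment condition": "medium",
    "frequency of the activity": 4,
    "complexity of the task": "low",
}
print("Overall likelihood score:", overall_score(likelihood_factors))
```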

Investigating and analysing the nature of hazards and major accidents

A range of different techniques is available for investigating and analysing the nature of hazards and major accidents. The MHF Regulations do not prescribe a specific method; therefore, it is the employer's responsibility to determine the most appropriate method for the circumstances of the particular MHF.

To assess the nature of hazards and potential major accidents requires knowledge of what may go wrong within the facility if measures to eliminate or prevent accidents are not present. Depending on the different types of hazards and potential outcomes, the employer may need to employ a combination of techniques to develop a complete understanding.

Techniques that may have been used for identifying hazards and accidents can sometimes also be used in the risk assessment to assist in understanding the nature, consequences and likelihood of the hazards and their control. For example, while HAZOP is primarily a tool for hazard identification, the HAZOP process can also include assessment of the causes of accidents, their likelihood and the consequences that may arise, so as to decide if the risk is acceptable, unacceptable or requires further study. However, within the scope of a combined HAZID and risk assessment workshop, this assessment would necessarily be coarse, qualitative and subjective and would in many cases need to be supplemented by more detailed assessment outside the workshop.

A HAZOP would not necessarily be the appropriate technique for detailed analysis of the causes of some other types of accidents (e.g. failures within complex electrical or mechanical equipment). In such cases, a failure mode effects and criticality analysis (FMECA) may be more useful and supplemented by whatever mechanical integrity information already exists for the systems within the facility's maintenance and breakdown records.

Alternatively, a Fault Tree Analysis may provide the necessary understanding of the nature and causes of different types of hazard. In many cases, an FMEA may be used to identify what can go wrong, and how low-level failures may affect higher-level systems. A Fault Tree may then be used to show how low-level failures, together with external aspects such as loss of power supply or human error, may combine to cause overall system failure. The Fault Tree can also be used, in principle, to estimate the likelihood or frequency of the failure occurring.
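The way a Fault Tree combines low-level failures can be sketched with the two basic gate calculations. The Python example below assumes that all basic events are independent, and the tree structure and probabilities are hypothetical; it is not the tree shown in Figure 10.

```python
from math import prod


def and_gate(*probabilities):
    """All inputs must occur: multiply the probabilities (independence assumed)."""
    return prod(probabilities)


def or_gate(*probabilities):
    """Any one input is enough: 1 minus the probability that none occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical annual probabilities for the basic events.
duty_pump_fails = 0.1
standby_pump_fails = 0.2
loss_of_power = 0.01
operator_error = 0.05

# Hypothetical tree: the system fails if both pumps fail,
# or power is lost, or the operator makes an error.
both_pumps_fail = and_gate(duty_pump_fails, standby_pump_fails)
system_failure = or_gate(both_pumps_fail, loss_of_power, operator_error)

print(f"Both pumps fail: {both_pumps_fail:.3f}")
print(f"Overall system failure: {system_failure:.4f}")
```

Where failures share a common cause, the independence assumption does not hold and the simple multiplication would understate the likelihood.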

Figure 10: Example of a Fault Tree

Investigating and analysing the likelihood of hazards causing major accidents

Likelihood analysis is a complex and potentially difficult process. The range of values in this process varies from "may occur within a plant lifetime" through to "extremely rare" or "never known within industry". Likelihood is highly dependent on a range of site-specific factors such as the number of equipment items, their condition, activity frequencies, the quality of the management system and human error levels.

Likelihood analysis should consist of a mixture of qualitative and quantitative information that, overall, gives an indication of likelihood of each incident. This may be based on calculations of basic task frequencies, analysis of how often errors are made, the reliability of safety devices, previous incident history or near miss data for the facility or industry sector.

If historical data is used, it should be critically assessed and the sources of such information documented. If simple judgment or other techniques are used, the employer must record the assumptions and logic used to determine likelihoods.

Likelihood analysis flows directly from the preceding process of assessing the nature of hazards and accidents. Further evaluation of information generated in these processes may be carried out to derive an understanding of likelihood. For example, Fault Trees can be used to produce a qualitative or quantitative understanding of the likelihood of different hazards and how they may combine to lead to a specific accident.

Event Trees may be used to determine what alternative outcomes may arise from an initial event, and the relative likelihood of each outcome. Again, it is possible to develop qualitative or quantitative event trees. Event trees also assist in defining the significant consequence scenarios, which need to be evaluated in detail.

Both Event Trees and Fault Trees can be used to evaluate quantitatively or qualitatively what effect existing or potential control measures have on risk levels. The effects of control measures assumed in these assessments should be reflected in the performance indicators defined for the control measures.
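A quantitative Event Tree multiplies the frequency of the initial event by the probability of each branch along a path. The Python sketch below enumerates every path through a simple binary tree; the initiating frequency, the branch questions and their probabilities are hypothetical, and the branches are assumed to be independent of one another. It is not the tree shown in Figure 11.

```python
from itertools import product


def event_tree(initiating_frequency, branches):
    """Return the frequency of every outcome of a binary event tree.

    branches is a list of (question, probability_of_yes) pairs. Each key of
    the result is a tuple of True/False answers, one per branch.
    """
    outcomes = {}
    for answers in product((True, False), repeat=len(branches)):
        frequency = initiating_frequency
        for (_, p_yes), yes in zip(branches, answers):
            frequency *= p_yes if yes else 1.0 - p_yes
        outcomes[answers] = frequency
    return outcomes


# Hypothetical release frequency (per year) and branch probabilities.
release_frequency = 1e-3
branches = [
    ("release isolated by shutdown valve", 0.9),  # a control measure
    ("ignition occurs", 0.1),
]

outcomes = event_tree(release_frequency, branches)
for answers, frequency in outcomes.items():
    labels = [f"{question}: {'yes' if yes else 'no'}"
              for (question, _), yes in zip(branches, answers)]
    print("; ".join(labels), f"-> {frequency:.1e} per year")
```

Changing the probability assigned to the control measure and re-running the calculation shows its effect on the frequency of each outcome, and the value assumed is the kind of figure that should be reflected in the performance indicators for that control measure.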

Figure 11: Example of an Event Tree

Human factors and likelihood

Human factors can have a major effect on the likelihood of breakdowns, hazards or overall incidents. It is very difficult to fully quantify the effects of human factors on likelihood; however, there is some robust general information on human error rates that may be utilised in risk assessment.

Investigating and analysing the magnitude and severity of accidents

The magnitude and severity of accidents can be determined using consequence and impact analysis. Consequence analysis involves calculation of the size and duration of the physical and/or chemical effects of accidents, while impact analysis involves determination of the harm done to people and property.

There are complex computer packages available for consequence analysis, but also simple equations and nomogram techniques for some cases such as the radiation from pool fires or the toxic gas cloud formation from releases of chlorine. Typical consequences which need to be considered within a risk assessment are toxic exposure from gas clouds or smoke inside or outside buildings. The selective use of worst-case consequence modelling can improve the efficiency of a process when it is necessary to identify which areas of the facility can cause offsite effects.

It is necessary to also consider less than worst-case conditions to develop a comprehensive understanding of the risk. The Regulations apply equally to onsite and offsite populations, and the worst-case scenarios for onsite and offsite populations may be very different.

The worst-case approach involves defining the credible combination of conditions giving rise to the maximum consequence zone for the identified accident, in relation to the target population. This can include defining release quantity, duration, pressure, composition, location, wind speed and other atmospheric conditions, time of ignition and functioning of control measures.

It is common to assume the worst-case release quantity is the maximum vessel contents, released over a defined period of time. However, this cannot be assumed to be correct for all types of plant or storage area. Where there is a clear mechanism for releasing more than the maximum vessel inventory, this should be considered in the consequence analysis.

Active control systems such as isolation valves and blowdown systems need to be assessed for the worst-case scenario. Passive control measures that are assured of functioning in the event of the worst-case accident may be included in the assessment.

The impact distance in all directions from the release point should be determined allowing for the fact that the wind can blow from any direction. The impact distance is usually determined to a predefined consequence criterion which can be material and/or effect specific.

For example, LPG flash fires will occur to a defined lower flammable limit while LPG can also produce fireballs or jet fires that can cause injury or damage from thermal radiation effects. Thermal radiation criteria are the same for all flammable materials. Employers should carefully define all relevant consequence criteria based on their definition of a major accident.
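As an illustration of determining an impact distance to a predefined criterion, the Python sketch below uses the point source thermal radiation model, a common simplification in which the fire is treated as a point radiating equally in all directions. The model choice, the radiated power and the criterion value are assumptions made for this example only; the booklet does not prescribe a model or a criterion, and detailed studies normally use more refined methods.

```python
import math


def point_source_flux(radiated_power_kw, distance_m, transmissivity=1.0):
    """Heat flux (kW/m^2) at a distance from a point source radiating uniformly."""
    return transmissivity * radiated_power_kw / (4.0 * math.pi * distance_m ** 2)


def impact_distance(radiated_power_kw, criterion_kw_m2, transmissivity=1.0):
    """Distance (m) at which the heat flux falls to the chosen criterion."""
    return math.sqrt(
        transmissivity * radiated_power_kw / (4.0 * math.pi * criterion_kw_m2)
    )


# Hypothetical inputs: power radiated by the fire and the consequence criterion.
radiated_power_kw = 50_000.0
criterion_kw_m2 = 5.0

distance_m = impact_distance(radiated_power_kw, criterion_kw_m2)
print(f"Distance to {criterion_kw_m2} kW/m^2: {distance_m:.1f} m")
```

Because this simple model has no directional term, the same distance applies in every direction from the source; wind tilt and atmospheric absorption are ignored unless represented through the assumed transmissivity.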

Below is an example of one method for illustrating the consequences and effects of a major accident. The example is a major accident involving a pool fire. This method may prove helpful during the risk assessment process and, if used, should be included in risk assessment documentation.

Figure 12: Pool fire consequences & effects

The employer should consider consequences under a range of meteorological conditions. Usually the worst-case meteorological condition for toxic or non-dense gas releases is high atmospheric stability and a low wind speed, typically experienced at night-time or in the very early morning. For dense gases, the worst-case condition is typically a high wind speed, which tends to occur at neutral atmospheric stability and during the day. Definitions of stability and other environmental conditions can be found in safety or meteorology literature.

Ambient temperature and humidity may also affect the consequences of releases. In particular, high temperature can increase the flammable effect range of low volatility materials. Surface type and topography can also affect the consequence, such as a spill into water or onto sloping areas.

For flammable materials, the consequences should be analysed both when ignition occurs immediately following the release and when ignition occurs after sufficient delay for a flammable cloud to fully develop. Further factors to consider include day versus night conditions and extreme weather conditions such as flooding and storms, including cyclones for facilities located in cyclone-prone areas of Australia.

To evaluate the impacts of major accidents on people, it is necessary to consider the number and distribution of potentially exposed people, and their characteristics. Variations in these factors should also be considered such as temporary populations, maintenance crews and on-site populations for specific operational modes. A further factor that should be considered is that people such as emergency services or investigators may be present specifically because there is a developing incident.

Presentation of risk assessment results

Risk assessment results can be presented in many different ways (see the example below). Again, it is important to present the results in a way that shows explicitly which control measures are reflected in each result.

Figure 13: Example risk assessment results

Key achievements for a quality risk assessment

The risk assessment is conducted for all hazards and potential major accidents at a facility, ensuring that:

a) it is comprehensive, systematic, rigorous and transparent;

b) it generates all information required by the Regulations, and provides employers with sufficient knowledge to operate safely;

c) the knowledge is kept up to date, through review and revision;

d) the information is provided to persons who require it to work safely;

e) an appropriate group of employees is actively involved;

f) uncertainties are explicitly identified and reduced to an acceptable level;

g) all methods, results, assumptions and data reflect the nature of the hazards considered and are documented;

h) a range of control measures are considered and their effects on risk are explicitly addressed;

i) it supports the development of the safety management system;

j) it is used as a basis for adoption of control measures, including emergency planning; and

k) it is used as a basis for the demonstrations in the safety report.

6 Control measures

6.1 Introduction

The previous sections discussed key elements for the range of control measures that should be in place at an MHF. This section provides more detailed guidance on how to select and judge the effectiveness of specific control measures. Choosing the best control measures and being able to demonstrate their effectiveness is a critical feature of compliance with the Regulations.

6.2 What is a control measure?

A control measure is the part of a facility, including any system, procedure, process or device that is intended to eliminate hazards, prevent hazardous incidents from occurring or reduce the severity of consequences of any incident that does occur.

It is the principal tool that delivers safe operation. Control measures are not only physical equipment; they may include high-level procedures or detailed operating instructions and information systems. Control measures may be proactive, in that they eliminate, prevent or reduce the likelihood of incidents, or they may be reactive, in that they reduce the consequences of incidents. They must be implemented under and fully supported by the managerial elements of the SMS.

The employer should identify control measures for the MHF carefully, so that effort is not wasted, or confusion created, by assessing measures that are not relevant to major accidents. Understanding which parts of the facility are control measures, and how each one actually controls or affects hazards and risks, is critical to safe operation. This understanding is also essential to the safety report and the associated justification of adequacy of the adopted control measures.

Control measures can be regarded as the barriers between the hazards of an MHF, the occurrence of a major accident as a result of these hazards, and ultimately the harm that may be caused to people, property and the environment in the event of a major accident. This concept is illustrated in Figure 14.

Figure 14: Control measures as barriers to major accidents

Control measures can be identified while identifying hazards and during the risk assessment. Employers should be able to identify a range of control measures immediately, both the existing measures and possible alternatives. Checklists of "typical" control measures may be able to assist in the process, but these should not be used in isolation. The specific nature of each hazard and the associated part of the facility should be considered when identifying control measures. The table below is an example of the consequences and key control measures that might apply for a warehouse.

An example: Identification of scenarios and control measures, dangerous goods warehouse

Scenario: Flash or pool fires from puncturing drums containing flammable liquids.
Key controls: Drum inspection and handling procedures; ignition source control; fire fighting equipment.

Scenario: Fires in packaged goods areas, in pallet storage stacks, or amongst general rubbish.
Key controls: Housekeeping; ignition source control; smoke detection and automatic vents.

Scenario: Fire escalation.
Key controls: Separation and segregation rules; stacking restrictions; fire fighting equipment and emergency response.

6.3 Understanding control measures

Each identified control measure should be clearly linked to the causes, hazards, major accidents or outcomes it is designed to control. Employers should understand the nature, scale and range of hazards and outcomes each control measure must deal with, and the effects each control measure has on these factors. This understanding is required to cover the whole range of conditions that might exist at the facility.

This knowledge provides a clear basis for defining which control measures are critical to safe operation. It also provides a basis for defining performance indicators and standards for control measures.

Using a risk control hierarchy to determine control measures

In an occupational health and safety context, risk control is often categorised according to an effectiveness hierarchy, often simply called the risk control hierarchy. The hierarchy lists the types of control measure in priority order, based on the extent to which each type reduces risk.

In the context of MHFs, a useful effectiveness hierarchy of control measures is as follows:

a) eliminate hazards;

b) prevent incidents;

c) reduce consequences; and

d) mitigate the harm.

The different categories are defined below.

The control "hierarchy" in an MHF context:

Control measures that eliminate a hazard are clearly the most effective. If practicable they should be selected in preference to any other type, as their existence removes the need for other controls.

Control measures for prevention are those intended to remove certain causes of incidents or reduce their likelihood. The corresponding hazard remains, but the frequency of incidents involving the hazard is lowered.

Control measures for reduction are those intended to reduce the severity (consequences) of incidents. They include reduction of inventory.

Control measures for mitigation are those that take effect in response to an incident to limit the consequences. They may include fire-fighting systems and, in particular, the emergency response plan. While they may be the "last line of defence", they remain necessary if risk cannot be reduced to a negligible level by other means.
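The priority order described above can be expressed as a simple ordering over categories. The following is a minimal sketch only: the class, field and function names are hypothetical, the candidate controls are borrowed from examples in this booklet, and the ordering says nothing about whether a given control is practicable at a particular facility.

```python
from dataclasses import dataclass
from enum import IntEnum


class ControlType(IntEnum):
    """Hierarchy categories; a lower value means a higher priority."""
    ELIMINATION = 1
    PREVENTION = 2
    REDUCTION = 3
    MITIGATION = 4


@dataclass(frozen=True)
class Control:
    name: str
    control_type: ControlType


def rank_by_hierarchy(controls):
    """Return the controls ordered from most to least preferred category."""
    return sorted(controls, key=lambda c: c.control_type)


def missing_categories(controls):
    """Return the hierarchy categories that have no control, in priority order."""
    present = {c.control_type for c in controls}
    return [t for t in ControlType if t not in present]


# Illustrative candidates for a single hazard (assumed, not a recommendation).
candidates = [
    Control("Emergency response plan", ControlType.MITIGATION),
    Control("Reduction of inventory", ControlType.REDUCTION),
    Control("Ignition source control", ControlType.PREVENTION),
]

ranked = rank_by_hierarchy(candidates)
gaps = missing_categories(candidates)
```

Here `ranked` places the prevention control first, and `gaps` shows that no elimination option has yet been considered for the hazard.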

Control measures can also be categorised as hardware (i.e. engineered systems) or software (e.g. management systems or operating/maintenance procedures). Controls may also be grouped into categories that define the nature and spread of the control such as engineering, organisational, procedural and administrative controls. Whatever method of categorisation is employed, safe operation will depend on an appropriate balance of different types of control measures.

These categorisations can help in determining the most effective control measures for a facility and in ensuring a range of measures is chosen so that one failure does not remove many controls.

A single category of control measure will rarely be enough for a risk to be controlled so far as is reasonably practicable, unless the hazard has been eliminated. Most commonly, layers of protection will be required to reduce a risk so far as is reasonably practicable.

For most facilities or items of equipment, there are numerous layers acting as barriers to eliminate, prevent, reduce or mitigate incidents. This is illustrated in Figure 15. Equipment integrity, operating and maintenance procedures are the "inner layers", and are the barriers normally relied on to ensure incidents do not occur. Systems that reduce or mitigate incidents are the "outer layers", which are relied on in abnormal or emergency conditions. A robust risk control regime will feature a range of risk control layers, the number and integrity of which should reflect the inherent level of hazard and risk within the protected part of the facility.

Figure 15: Layers of Protection

Examples of control measures are shown below, using the above categorisation. The table is illustrative only, and is not intended to be a complete list of possible controls for any facility. The categorisation shown is not intended to be rigid, and many controls may apply in more than one category.

Example control measures

Type: Elimination
Engineering controls: Mounding of LPG storage tanks. Substitution with non-hazardous materials. Inherent design features, layout.
Administrative controls: Inherently safe process concept. Feedstock quality specifications. Plant design procedures.

Type: Prevention
Engineering controls: Impact and dropped object barriers. Isolation valves to enable safe maintenance work. Mechanical ventilation systems. Process control systems. Corrosion and erosion probes. Materials specifications, corrosion allowance.
Administrative controls: Operating procedures and instructions. Maintenance and isolation procedures. Management of change.

Type: Reduction
Engineering controls: Physical barriers between incompatible materials. Secondary containment of hazardous substances. Process emergency controls and alarms. Shutdown, isolation and de-pressurisation systems. Bursting disks. Safety and relief valves. Bunds, other containment and drainage systems.
Administrative controls: Spill containment and clean-up procedures. Ignition suppression equipment. Procedures.

Type: Mitigation
Engineering controls: Fire detection systems. Fire suppression and cooling systems. Passive fire protection systems.
Administrative controls: Emergency alarms. Emergency planning and procedures. Employer-owned buffer zones.

Control measures may vary for different stages of the facility's life cycle. For example, design and construction standards are important for new facilities, but as the facility ages more emphasis may be required on asset integrity management. Similarly, control measures may themselves have life cycles that may need to be considered.

The balance and type of control measures are expected to be consistent with the employer's overall safety philosophy. If the safety philosophy is based primarily on engineering controls, there is less need for other controls such as administrative ones. On the other hand, if the safety philosophy is based on personnel knowledge and skills, then procedural and competency controls might be dominant, although some additional hardware controls would still be needed.

The assessment required to understand control measures, their function and their effects on hazards and associated risks, is driven by three factors:

a) a highly complex reaction process, new technology, or complex process equipment may require detailed assessment to understand the control measures, whereas a simple system can be understood more rapidly and without using sophisticated methods of assessment;

b) where there are numerous options available to control the associated risk, more effort is likely to be required to reach an understanding of the available controls, to differentiate the options in terms of their effects on risk and to provide a basis for selecting or rejecting options appropriately; and

c) a high level of uncertainty regarding the nature of the hazard or risk or the behaviour of the control measures is likely to require greater effort to reach an overall understanding; e.g. Class 6.1 liquids are more straightforward to analyse than Class 2.3 toxic gases.

The above concepts illustrate the issues that need to be considered in defining and understanding control measures. There may be many other issues that need to be considered in developing an understanding of control measures for a facility. For many facilities this may result in a significant amount of information. Therefore a simple method of linking and communicating the information should be considered, for example "bow tie" diagrams or registers of hazards and controls. Figure 16 provides examples of how to use bow tie diagrams or registers to link and communicate control measure information. Alternatively, simple hazard management tables or diagrams can be developed.

Figure 16: Example presentation formats for hazard and control information

a) "Bow Tie" diagram

b) Register of hazards and controls
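A register of hazards and controls can be held as a simple mapping from each hazard to the controls linked to it. The sketch below is illustrative only: the structure and function names are hypothetical, the entries reuse the dangerous goods warehouse example from section 6.2, and one hazard has deliberately been left without controls to show how a gap would surface.

```python
# Hypothetical register: each hazard maps to the controls linked to it.
register = {
    "Drum puncture (flammable liquid)": [
        "Drum inspection and handling procedures",
        "Ignition source control",
        "Fire fighting equipment",
    ],
    "Fire in packaged goods area": [
        "Housekeeping",
        "Ignition source control",
        "Smoke detection and automatic vents",
    ],
    # Deliberately left empty for illustration; not a statement about any facility.
    "Fire escalation": [],
}


def hazards_without_controls(reg):
    """Return the hazards that have no linked control measure."""
    return [hazard for hazard, controls in reg.items() if not controls]


def hazards_by_control(reg):
    """Invert the register: map each control to the hazards it is linked to."""
    index = {}
    for hazard, controls in reg.items():
        for control in controls:
            index.setdefault(control, []).append(hazard)
    return index


uncontrolled = hazards_without_controls(register)
control_index = hazards_by_control(register)
```

The inverted index shows which hazards rely on a given control, which is the information needed when that control is degraded or taken out of service.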

Core questions to ask when selecting or rejecting control measures

Are there controls clearly linked to each hazard, or are there some hazards having no (or insufficient) control measures? Does the number of controls reflect the level of severity of the hazards? The extent of demonstration should be proportional to the level of risk.

What is the functionality of a control measure against the relevant hazards? Is it sufficient to control the hazard in the intended manner, i.e. is it fit for purpose, will it suppress the hazard completely, prevent escalation or simply mitigate effects?

What is the survivability of the control measure in an accident? Is the control measure able to function as intended during the types of accidents it is intended to reduce or mitigate?

Is the reliability of individual control measures, and of all control measures in combination, appropriate to the level of risk presented by the associated hazards? Is function testing sufficiently frequent to detect failures, and will failures once detected be rectified sufficiently promptly?

Has the hierarchy of control measures been considered, with measures to eliminate the hazard adopted first if practicable, followed by measures to prevent, reduce and mitigate?

Is there a balance of different types of control measure for each hazard, i.e. is there a diversity of control measures? Are the control measures associated with individual hazards independent of each other, or can they all be disabled by the same mechanism?

Are the control measures maintainable? For example, are they accessible, and can they be taken out of service for maintenance? (A counter-example is a safety valve with no means of removal for maintenance because it is the only one and must remain in service.)

Are new control measures compatible with the facility, and any other control measures already in use?

Can the control measures be implemented at the facility considering their availability and cost?
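The question of independence above can be tested by listing what each control depends on and looking for shared dependencies. The following is a minimal sketch under stated assumptions: the control names come from the examples in this booklet, but the dependency data (site power, instrument air, firewater pumps) is invented purely for illustration.

```python
# Hypothetical data: each control lists the shared services it depends on.
dependencies = {
    "Fire suppression and cooling systems": {"site power", "firewater pumps"},
    "Process emergency controls and alarms": {"site power", "instrument air"},
    "Shutdown, isolation and de-pressurisation systems": {"site power", "instrument air"},
}


def common_mode_dependencies(deps):
    """Return the dependencies whose loss would disable every listed control."""
    if not deps:
        return set()
    return set.intersection(*deps.values())


def controls_lost(deps, failed_dependency):
    """Return the controls disabled if one dependency fails, sorted by name."""
    return sorted(name for name, needs in deps.items() if failed_dependency in needs)


shared = common_mode_dependencies(dependencies)
lost_on_air_failure = controls_lost(dependencies, "instrument air")
```

In this invented data set every control depends on site power, so the three controls are not independent of each other even though they are of different types.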

6.5 Additional or alternative control measures

Employers should objectively challenge how safety is achieved and consider ways to improve safety. This means that alternative control measures not currently in place must be considered alongside existing control measures, and either adopted or rejected according to the results of the risk assessment. In particular, additional controls should be considered if the risk is not reduced as far as practicable or hazards have been identified with no control measures in place. Alternative controls should be identified where the risk is not reduced as far as practicable.

The importance of being prepared to challenge the "norms" of facility operation has been highlighted in past disaster inquiries such as the Longford Royal Commission and the Cullen Inquiry into Piper Alpha.

Therefore the employer should typically consider the following circumstances:

a) existing control measures which are believed to be fully functional and appropriate;

b) existing control measures which may have become disabled, degraded or deficient;

c) existing control measures which function as intended but could be improved;

d) control measures which were considered or used in the past and rejected for some reason;

e) existing control measures which are to be replaced due to obsolescence or old age;

f) new control measures which could replace or add to the existing range of control measures; and

g) new control measures for modifications to the facility.

For many existing facilities, there may be control measures that were adopted or rejected in the past without records to support those decisions. Employers should identify past decisions and control measures that need to be recorded and reviewed, to understand what was done in the past and why it was done, and to maintain the integrity of existing control measures in the future.

This relates to the need for a knowledge base of the control measures on the facility and is an important part of justifying the adequacy of an existing facility in the safety report. Given the potentially large number of decisions and control measures for a typical MHF, which may have decades of operating experience, the employer will need to identify the critical areas that require review, and determine which areas need to be reviewed in brief or in detail.

Circumstances where control measures would require review include:

a) new operating conditions have arisen;

b) knowledge of the basis for safe operation has been lost;

c) there may have been a degradation in effectiveness of existing controls;

d) the knowledge or technology employed is now outdated; and

e) an incident occurred.

The employer should identify both proven technology and newly developed options, as appropriate, and should not dismiss any option on the grounds that it is "unproven". The process of risk assessment should include the evaluation of new technologies and practices to determine if they are appropriate to the facility.

A reasonable number of existing and alternative control measures should therefore be considered, depending on:

a) the scale and complexity of the facility;

b) the nature of the risk profile; and

c) the rate of development of new technologies and practices.

6.6 Defining performance indicators for control measures

Performance indicators for control measures will generally relate to some standards or target levels of performance (performance standards) to ensure safe operation. Performance indicators and the corresponding standards play a vital role in the justification of the adequacy of control measures.

A performance indicator is defined as information that is used to measure the effectiveness of a control, e.g. a test of effectiveness, an indicator of failure, an action taken to report a failure or a corrective action taken in the event of a failure. An indicator is an objective measure, which shows current and/or past performance.

A performance standard is defined as the target set for a performance indicator. The standard represents the performance required for the control measure or SMS element to be considered effective in managing the risk to as low as reasonably practicable (ALARP).

Once the employer has decided which control measures are to be adopted, performance indicators must be defined for the control measures, which enable the employer to:

a) measure, monitor or test the effectiveness or failure of each control measure; and

b) determine the best reporting and corrective actions to be taken in the event of failure.

Performance indicators should measure not only how well the control measures can perform, but also how well the management system is monitoring and maintaining them. This shows how performance indicators for control measures overlap with the performance standards required for the SMS.

Some performance indicators and their corresponding performance standards for engineered control measures may be adopted from manufacturer's recommendations; however, the employer should determine if these are appropriate to the specific conditions of the facility.

Performance indicators take many forms, and can be quantitatively or qualitatively expressed. One type of target is a desirable long-term goal or a limit, the breaching of which can be tolerated to a certain extent or under certain conditions in the short-term. Another type of target is one that needs to be achieved within a prescribed timeframe, or where there is zero tolerance for any breaches.

An example of performance indicators for control measures:

A hardware control measure has performance standards relating to its capacity and reliability, plus management system standards for inspection, testing and maintenance, which aim to assure that the capacity and reliability of the control measure are maintained. Performance indicators need to be set that measure performance against these standards.

For example, for a pressure relief valve the performance indicators and standards may relate to:

Min number on-line: x

Min relief rate: y kg/s

Max probability of failure: y%

Max interval between tests : z yrs
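The relationship between an indicator (what is measured) and a standard (the target it is measured against) can be sketched as follows for the relief valve case. This is an illustration only: the class and function names are hypothetical, and every number is a placeholder chosen for the example, because the booklet deliberately leaves the values unspecified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceStandard:
    indicator: str
    limit: float
    kind: str  # "min": measured must be >= limit; "max": measured must be <= limit

    def is_met(self, measured: float) -> bool:
        """Compare a measured indicator value with the standard."""
        if self.kind == "min":
            return measured >= self.limit
        if self.kind == "max":
            return measured <= self.limit
        raise ValueError(f"unknown kind: {self.kind!r}")


# Placeholder limits only; real values come from the facility's own assessment.
STANDARDS = [
    PerformanceStandard("valves on-line", 2, "min"),
    PerformanceStandard("relief rate (kg/s)", 5.0, "min"),
    PerformanceStandard("probability of failure (%)", 1.0, "max"),
    PerformanceStandard("interval between tests (years)", 2.0, "max"),
]


def failed_indicators(standards, measured):
    """Return the names of indicators that do not meet their standard."""
    return [s.indicator for s in standards if not s.is_met(measured[s.indicator])]


# Assumed measurements for the example.
measured_values = {
    "valves on-line": 2,
    "relief rate (kg/s)": 5.6,
    "probability of failure (%)": 0.4,
    "interval between tests (years)": 3.0,
}

not_met = failed_indicators(STANDARDS, measured_values)
```

With these assumed values only the test interval fails its standard, which would trigger the reporting and corrective actions defined for that control measure.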

One example of a performance indicator that provides a range of acceptable performance is a pre-alarm limit that can be exceeded for a period of abnormal operations provided this is monitored. A performance indicator that does not allow a range of acceptable performance is set at the level of the critical operating parameter (see the section on critical operating parameters).

Performance indicators for control measures should include the following considerations:

a) failure of any control measures - what are the performance requirements for functionality, availability, reliability and survivability of control measures that indicate how or how often the control measures may fail to perform, and what performance standards are required for any activities necessary to achieve these standards?

b) reporting of control measure failures - what activities are necessary to confirm or assure performance, what degree of reporting of failures is required, how quickly will the reporting system identify a failure, and what level of independent verification is needed in addition to routine assurance?

c) corrective action in the event of such failures - what steps are to be taken and how quickly following detection, and what performance standards are required of the corrective process?

Performance indicators can be defined at various levels, e.g. there may be high-level performance indicators as well as lower level and detailed performance indicators. High-level indicators tend to address overall performance issues, for example:

a) employee perceptions, incident rates, improvement programs and availability of control measures, which may be taken as indicators of overall safety performance;

b) maintenance of operating conditions within a critical operating envelope, which may indicate overall integrity of the process control regime; and

c) total number of resources dedicated to testing, inspection and maintenance of critical control measures.

Detailed performance indicators tend to relate to individual measures that, when combined, contribute to achieving overall high-level performance. At a detailed level, there are many different types of performance indicators that can be defined for each control measure.

When specifying performance indicators or standards, it may be necessary to provide detail on who, what, where, and when for implementation of procedures and activities relating to these indicators and standards. The responsibility for implementation of performance indicators can be defined at a very specific level for each performance standard. For example, responsibility for operational parameters may lie with operations management teams.

Where performance standards relate to control measures, they should be assessed as part of the justification of adequacy. It is also necessary to show that the control measures achieve the standard that has been set. In the simplest cases, performance standards may be industry standards, codes or norms. However, these need to be shown to be appropriate to the specific facility and this can be by a combination of techniques such as:

a) risk assessment results;

b) qualitative argument or reference to the basis for the standard; and

c) cost-benefit or cost-effectiveness analysis of options.

In more complex cases, where there may be no appropriate existing standards, the employer may need to demonstrate the suitability of the performance standard based solely on the risk assessment.

Some examples of performance indicators for control measures:

Management system compliance levels as shown by audit.

Test frequency/interval for safety-critical equipment.

Average skill level of the operations shift personnel.

Compliance level with operating procedures as shown by monitoring.

Number of failures in specific safety devices.

Number of times staffing levels fall below target minimum numbers.

Number of times pressure, temperature etc. exceed particular levels.

Measured mechanical integrity (e.g. extent of corrosion).

Detection and response times for unintended material releases.

Sensitivity levels and response times for process alarms.

Compliance levels with manufacturer's or design standards.

Vibration levels in rotating equipment (e.g. compressors).
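One of the example indicators above, the number of times a process variable exceeds a particular level, can be computed directly from logged readings. The sketch below assumes an upper limit only, treats a run of consecutive readings above the limit as a single excursion, and uses invented pressure readings and limits; the function names are hypothetical.

```python
def classify_reading(value, pre_alarm_limit, critical_limit):
    """Classify one reading against a pre-alarm limit and a critical upper limit."""
    if value > critical_limit:
        return "critical limit exceeded"
    if value > pre_alarm_limit:
        return "pre-alarm"
    return "normal"


def count_excursions(readings, limit):
    """Count the times the readings rise from at or below the limit to above it."""
    count = 0
    above = False
    for value in readings:
        if value > limit and not above:
            count += 1
        above = value > limit
    return count


# Invented readings and limits, in arbitrary pressure units.
pressure_readings = [9.0, 10.5, 10.8, 9.5, 11.2, 12.4, 9.8]
PRE_ALARM = 10.0
CRITICAL = 12.0

pre_alarm_excursions = count_excursions(pressure_readings, PRE_ALARM)
critical_excursions = count_excursions(pressure_readings, CRITICAL)
states = [classify_reading(v, PRE_ALARM, CRITICAL) for v in pressure_readings]
```

A pre-alarm excursion count may have a tolerated range under monitored abnormal operations, whereas the standard for excursions beyond the critical limit would normally be zero.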

6.7 Critical operating parameters

For the purposes of this booklet, a critical operating parameter (COP) is the upper or lower performance limit of any equipment, process or procedure, compliance with which is necessary to avoid a major accident.

The sum total of COPs may define an overall safe operating envelope for the facility. Control measures are required to prevent COPs being exceeded; in general each COP will have at least one associated control measure, and each control measure will relate to at least one COP. A