Safety 4900 CMSU Living with High Risk Technologies Charles Perrow, “Normal Accidents”


Technology

• First Picture of Water on Mars!


What is a Normal Accident?


Outline

• Definitions

• Complexity and catastrophe

• Looking at Systems

• Risks versus benefits

• Conclusions.


Normal Accident

• synonym for "inevitable accidents."


Normal Accidents

• Normal accidents in a particular system may be common or rare ("It is normal for us to die, but we only do it once."), but the system's characteristics make it inherently vulnerable to such accidents, hence their description as "normal."


Failures

• Discrete Failures – A single, specific, isolated failure is referred to as a "discrete" failure.


Redundant Systems

• Redundant sub-systems provide a backup, an alternate way to control a process or accomplish a task, that will work in the event that the primary method fails. This avoids the "single-point" failure modes.
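
As a rough illustration of why redundancy helps and where it stops helping, the sketch below compares the chance that a primary and a backup both fail when their failures are independent with the chance when a shared cause can disable both at once. The function names and all probabilities are illustrative assumptions, not figures from the course material.

```python
def both_fail_independent(p_primary: float, p_backup: float) -> float:
    """Probability that primary and backup both fail, assuming independence."""
    return p_primary * p_backup


def both_fail_with_common_cause(p_primary: float, p_backup: float,
                                p_common: float) -> float:
    """Probability that both fail when a shared cause (hypothetical, with
    probability p_common) disables both; otherwise they fail independently."""
    return p_common + (1.0 - p_common) * p_primary * p_backup


if __name__ == "__main__":
    # Illustrative numbers only.
    print(both_fail_independent(0.01, 0.01))
    print(both_fail_with_common_cause(0.01, 0.01, 0.001))
```

Under these assumed numbers, even a small shared cause dominates the combined failure probability, which is one way to read the later point that unexpected interactions can defeat supposedly redundant sub-systems.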


Interactive Complexity

• A system in which two or more discrete failures can interact in unexpected ways is described as "interactively complex." In many cases, these unexpected interactions can affect supposedly redundant sub-systems. A sufficiently complex system can be expected to have many such unanticipated failure mode interactions, making it vulnerable to normal accidents.
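
One way to see why unanticipated interactions multiply as systems grow: the number of possible groupings among components rises much faster than the number of components. The sketch below only counts combinations; treating every pair or triple as a potential interaction is an assumption made for illustration, not a claim from Perrow.

```python
from math import comb


def potential_interactions(n_components: int, k: int = 2) -> int:
    """Number of distinct k-component groupings among n components."""
    return comb(n_components, k)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Pairs and triples that a designer would, in principle, have to consider.
        print(n, potential_interactions(n), potential_interactions(n, 3))
```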


Tight Coupling

• The sub-components of a tightly coupled system have prompt and major impacts on each other. If what happens in one part has little impact on another part, or if everything happens slowly (in particular, slowly on the scale of human thinking times), the system is not described as "tightly coupled." Tight coupling also raises the odds that operator intervention will make things worse, since the true nature of the problem may well not be understood correctly.


Normal Accident

• Interactive complexity and tight coupling:
– Unexpected interactions will occur
– An accident will result

• Premise: based on the characteristics of the system, not on frequency.


NASA View

• NASA nominally works with the theory that accidents can be prevented through good organizational design and management.
• Normal accident theory suggests that in complex, tightly coupled systems, accidents are inevitable.
• There are many activities underway to strengthen our safety posture.
• NASA’s new thrust in the analysis of close-calls provides insight into the unplanned and unimaginable.

To defend against normal accidents, we must understand the complex interactions of our programs, analyze close-calls and


Definitions

• Complexity – levels of system and organization.

• Coupling - how closely the systems interact.

• Redundant pathway – Backup system that would prevent accidents.

• High Risk – Event with catastrophic potential.


Definitions

• Discrete Failures – failures of isolated single systems

• Interactive Complexity


Definitions

• Systems–Individual Components

–Interactions

–Feedback systems


Questionnaire…

1. Human error results in most accidents

2. Mechanical failure is the highest cause of accidents.

3. The environment impacts the accident.

4. Design of the system is the most important prevention.

5. Procedures are most important.


Answers

• Eighty percent of Accidents are caused by human error.


High Risk Systems

• Creating Systems

• Organizations

• Sub-Organizations

– Understanding how they interact?
– Understand the risk?


Systems

• Human Interface – complexity/saturation


Three Mile Island

• Four Distinct Failures
– Cooling system
– Valves closed
– Pilot Operated Relief Valve sticks open
– False indicators

– These four occurred in 13 seconds


Three Mile Island

• The Hydrogen Bubble:
– Hydrogen produced from zirconium

– Occurred 33 hours into the accident

– Overpressure was ½ the design strength


Errors

• Familiar with System

• System Design flaws


Risk & Benefits

• Benefit of Understanding
– Reduce dangers – could TMI happen again?
– Remove the dangers
  • Better operator training (Three E’s)
  • More quality control
  • Effective regulation


High Risk Systems

• Operating Experience – Not sufficient

• Construction – pressure to build

• Safer Designs = less vulnerability?

• Defense in Depth (nuclear term)


Characteristics

• High-Risk Technologies Characteristics (beyond the toxic, explosive dangers)

–Complexity

–Coupling


Definitions

• Complexity: A system in which two or more discrete failures can interact in unexpected ways is described as "interactively complex." In many cases, these unexpected interactions can affect supposedly redundant sub-systems. A sufficiently complex system can be expected to have many such unanticipated failure mode interactions, making it vulnerable to normal accidents.


Coupling

• Coupling: The sub-components of a tightly coupled system have prompt and major impacts on each other. If what happens in one part has little impact on another part, or if everything happens slowly (in particular, slowly on the scale of human thinking times), the system is not described as "tightly coupled." Tight coupling also raises the odds that operator intervention will make things worse, since the true nature of the problem may well not be understood correctly.


High Complexity

• X Fails, Y was out of order
• Interaction – Unexpected outcome

Piper Alpha


High Complexity

• X Fails, Y Fails, Z was out of order
• Interaction – Unexpected outcome

Bhopal


Learning from Mistakes?

• Numerous examples given.

• High Risk systems still in use

• Still at risk?

• How do we evaluate this?


Complexity

• Low Complexity – (Linear systems, near linear)

– Result: Accident will not spread or be as serious.


High Complexity Systems

• Not all Interactions known

• Some failure points not identified


Normal Accidents

• Why haven’t we had more?


Low Complexity Characteristics

• Low Complexity (Organization)
– Additional resources available
– Time to spare
– Other ways to accomplish task


High Complexity - Organizations

• Large organization
– Slow to act

– Complex Systems

– Interconnection

– Contradictions


CMM

• Definition – Capability Maturity Model

• Reference

• Handout


CMM Scoring

• One Method


High Complexity - CMM

Level: Focus; Key Process

5 Optimizing: Continuous Improvement; Innovation, FB&I

4 Quantitative: Quantitative Management; Focus on data

3 Defined: Process Standardized; Training, projects, etc.

2 Repeatable: Basic projects

1 Initial: Competent people and heroes


Coupling

Definition:

Example:


Coupling

• Coupling (High)
– Processes happen fast
– Can’t be turned off
– Failed parts can’t be isolated
– No other way to keep production going safely


High Coupling - decisions

• Reluctant to shut down

• $ is driver??

• Politics?

• Production?

•Unable to shut down process

•Cost to shut down

•Pressure to shut down

•Damage to shut down


Cost of Shut Down

$300 Million to shut down a Nuclear Power Plant
License good for 40 years only


Coupling

• Coupling Results:
– Recovery is not possible
– Disturbance spreads quickly
– Irretrievable results
– Operator action may make it worse


How it Happens?

• Normal Accident:

Interactive Complexity and Tight Coupling


High Complexity and Coupling

• Examples:
• Nuclear Power Plants
• Laboratories
• Industrial Processes


Complex and Linear Interactions

[Diagram: an event disrupts both sub-systems of System 1 simultaneously; one effect is visible, the other invisible.]


Example

• Chernobyl
– Hot spot was not visible

– Graphite rod effects


Complex and Linear Interactions

[Diagram: System 1 with two sub-systems; some interactions are known, others are hidden or unknown, with no direct measurement.]


Design Issues

• Hidden elements

• Control Systems


Linear Systems

Three Characteristics:

1. Tend to be spread out
2. People tend to have less specialized jobs
3. Feedback systems are local (not global)


Break!


Interactions

[Chart: interactions (linear to complex) versus coupling (loose to tight), divided into quadrants 1 to 4. Systems plotted: power grids, nuclear plant, assembly line production, military, R&D, rail transport, dam, airways, DNA, trade schools, marine transport, single-goal agencies, mining.]

Normal Accidents, page 97


Interactions

[Chart: the interactions versus coupling chart from Normal Accidents, page 97, annotated with "Increasing Risk" and "Time Dependent" arrows.]


Methods

Not described in book
Derived from information

Used McCabe’s Model


McCabe’s Complexity

Cyclomatic Complexity: Risk Evaluation

1-10: a simple program, without much risk

11-20: more complex, moderate risk

21-50: complex, high risk program

greater than 50: untestable program (very high risk)
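
The risk bands in the table can be expressed as a small lookup. This is a minimal sketch; the function name `risk_band` is hypothetical, and the thresholds are taken directly from the table above.

```python
def risk_band(cyclomatic_complexity: int) -> str:
    """Map a cyclomatic complexity value to the risk evaluation in the table."""
    if cyclomatic_complexity < 1:
        raise ValueError("cyclomatic complexity is at least 1")
    if cyclomatic_complexity <= 10:
        return "a simple program, without much risk"
    if cyclomatic_complexity <= 20:
        return "more complex, moderate risk"
    if cyclomatic_complexity <= 50:
        return "complex, high risk program"
    return "untestable program (very high risk)"


if __name__ == "__main__":
    for m in (3, 15, 42, 77):
        print(m, "->", risk_band(m))
```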


Other Factors

Complexity Measurement: Primary Measure of

Halstead Complexity Measures: Algorithmic complexity, measured by counting operators and operands

Henry and Kafura metrics: Coupling between modules (parameters, global variables, calls)

Bowles metrics: Module and system complexity; coupling via parameters and global variables

Troy and Zweben metrics: Modularity or coupling; complexity of structure (maximum depth of structure chart); calls-to and called-by

Ligier metrics: Modularity of the structure chart

Table 5: Other Facets of Complexity


Index Score

Name of technology: Cyclomatic Complexity

Application category: Test (AP.1.4.3); Reapply Software Lifecycle (AP.1.9.3); Reverse Engineering (AP.1.9.4); Reengineering (AP.1.9.5)

Quality measures category: Maintainability (QM.3.1); Testability (QM.1.4.1); Complexity (QM.3.2.1); Structuredness (QM.3.2.3)

Computing reviews category: Software Engineering Metrics (D.2.8); Complexity Classes (F.1.3); Tradeoffs Among Complexity Measures (F.


McCabe’s Equation

M = E − N + 2P

where

M = cyclomatic complexity
E = the number of edges of the graph
N = the number of nodes of the graph
P = the number of connected components
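
A minimal sketch of the equation applied to a control flow graph follows. The graph representation (a list of node names and a list of directed edges) and the helper names are assumptions for illustration; connected components are counted with a simple union-find that ignores edge direction.

```python
def count_components(nodes, edges):
    """Count connected components, treating edges as undirected."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def cyclomatic_complexity(nodes, edges):
    """M = E - N + 2P for the given graph."""
    p = count_components(nodes, edges)
    return len(edges) - len(nodes) + 2 * p


if __name__ == "__main__":
    # Hypothetical control flow graph of a single if/else statement.
    nodes = ["entry", "then", "else", "exit"]
    edges = [("entry", "then"), ("entry", "else"),
             ("then", "exit"), ("else", "exit")]
    print(cyclomatic_complexity(nodes, edges))
```

For the single if/else graph in the example, E = 4, N = 4 and P = 1, so M = 2.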

"M" is alternatively defined to be one larger than the number of decision points (if/case-statements, while-statements, etc) in a module (function, procedure, chart node, etc.), or more generally a system.Separate subroutines are treated as being independent, disconnected components of the program's control flow graph.


Determining Coupling Score - Population

Population: Score

1-1000: 1

1000-2000: 2

2000-4000: 3-4

5000+: 5

Cp = Population/1000 + Nodes
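
The coupling formula is simple enough to check by hand, but a short sketch makes the units explicit. The function name `coupling_score` is hypothetical; the formula is the one stated on the slide.

```python
def coupling_score(population: float, nodes: int) -> float:
    """Cp = Population/1000 + Nodes, as stated on the slide."""
    return population / 1000 + nodes


if __name__ == "__main__":
    # A population of 5000 with 4 nodes, the figures used in the deck's example.
    print(coupling_score(5000, 4))
```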


Coupling & Complexity Score

Condition (Nodes): Score

1-2: 1

2-5: 2

5-10: 3

10-20: 4

McCabe’s Model for complexity

Complex = Cyclomatic + CMM + System


Capability Maturity Model

Used For Software Systems (and others)


CMM Table

Level 1: Chaotic
Level 2: Repeatable
Level 3: Defined
Level 4: Managed
Level 5: Optimized


Complexity Score - Capability Maturity Model

CMM: Score

Initial: 5

Repeatable (Basic Project Management): 4

Defined (standard): 3

Quantitative: 2

Optimizing (Continuous Process Improvement): 1

McCabe’s Model for complexity

Complex = Cyclomatic + CMM + System


Complexity Score – System Levels

System Level: Score

Disruption to single element (valve): 1

Disruption to single unit (steam generator) (subsystems): 2

Inclusion of secondary systems: 3-4

Entire System: 5

McCabe’s Model for complexity

Complex = Cyclomatic + CMM + System


Complexity Score – Linearity

Linearity Level: Score

Linear (production line, easy to detect failure, easy to see result): 1

Near linear (production with branching): 2

Proximity to sources/systems and to indirect systems; branching with some feedback systems: 3-4

Non-linear, indirect systems: 5

McCabe’s Model for complexity

Complex = Cyclomatic + CMM + System


Complexity Score – System Levels

McCabe’s Model for complexity

Complex = Cyclomatic + CMM + System

Example:

Coupling = (5000/1000) + 4 (nodes) = 9

Complexity = Nodes + CMM + System + Linearity = 4 + 3 + 2 + 4 = 13
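
The complexity total in the example can be reproduced with a short sketch. The names `CMM_SCORE` and `complexity_score` are hypothetical; the CMM values come from the scoring table earlier in the deck, and reading the example's CMM term of 3 as the "Defined" level is an assumption.

```python
# CMM level -> score, as listed in the deck's CMM scoring table.
CMM_SCORE = {
    "Initial": 5,
    "Repeatable": 4,
    "Defined": 3,
    "Quantitative": 2,
    "Optimizing": 1,
}


def complexity_score(nodes: int, cmm_level: str, system: int, linearity: int) -> int:
    """Sum the four component scores used in the worked example."""
    return nodes + CMM_SCORE[cmm_level] + system + linearity


if __name__ == "__main__":
    # Nodes 4, CMM "Defined" (3), system level 2, linearity 4.
    print(complexity_score(4, "Defined", 2, 4))
```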


Total Scoring

Average the scores together?
Average over time?

C = [summation over systems, 0 to n]


Three Actions

Abandon
Restrict
Tolerate & Improve


Recommendations

[Chart: coupling (1 to 10) versus complexity (2 to 20), divided into Abandon, Restrict, and Tolerate & Improve regions. System plotted: nuclear energy.]

Normal Accidents, page 349


Recommendations

[Chart: coupling (1 to 10) versus complexity (2 to 20), divided into Abandon, Restrict, and Tolerate & Improve regions. Systems plotted: nuclear energy, marine transport, DNA research, dams, mining, space, flying.]

Normal Accidents, page 349


Recommendations

[Chart: coupling (1 to 10) versus complexity (2 to 20), divided into Abandon, Restrict, and Tolerate & Improve regions. Systems plotted: nuclear energy, marine transport, DNA research, dams, mining, space, flying.]

What can be changed? Complexity or Coupling?


Recommendations

[Chart: cost of alteration (1 to 10) versus net catastrophic potential (2 to 20), divided into Abandon, Restrict, and Tolerate & Improve regions. Systems plotted: nuclear energy, marine transport, DNA research, dams, mining, space, flying.]


Usage

• Different Approach to Risk

• Method for Management Systems

• Analysis of what systems need attention


Culture Group?

• Hierarchy

• Individualist

• Egalitarian

• Isolationist

• Hermits


Questions?

• Application to Today?

• Information was useful?


Organizational Disasters

• Why do they happen?


Warning Signs

• Are there warning signs prior to each accident?
– NASA
– Chernobyl
– Three Mile Island


Warning Signs

• System Dependent?

• High or low Complexity?

• Low or High technology?


Incubation Period: Warning Signs

Events go unnoticed or misunderstood because:

• Erroneous assumptions
• Difficulties in handling information in complex situations
• Ambiguous regulation
• Reluctance to fear the worst outcome


Disaster Model

Epistemic blind spots

Risk Denial

Structural Impediments

Warning signals

Signals not processed or not acted on

Problem build up.

Large-scale failure


Failures - Epistemic

• Why?

• Humans process information selectively.


Failures - Epistemic

• Tend to favor Information that confirms their beliefs

• Rather than consider how they should change


Failures - Epistemic

• Result:
– Ignore information
– Question reliability
– Re-interpret its significance


Failure - Organizational

• Follow a justification approach

• Incontestable beliefs

• Ignore information that contradicts

• Rarely abandon beliefs in favor of others


Failure – Risk Denial

• Values

• Norms

• Priorities


Failure - Denial

• Result: No corrective action


Case Studies

• VIOXX
– Introduced in 1999
– Excessive risk of heart attack in 2000
– Caution flag on cardiovascular in 2001
– Withdrawn in 2004.


Structural Impediments

• Warning Recognized

• Structural Rules, roles, differentiation

• Result: Information incomplete; response ineffective


Structural Impairments

• Differentiation
– Horizontal (stove piping): no talk across organizations or divisions
  • Diffuse – no clear owner, focus on their part
– Vertical: defer to others up or down
  • Strong hierarchical


Prevention

• Accident Development Model

• Three levels:


Accident Causation Model

Organization

Workplace

Individual

Culture
Decision processes

Time pressure
Insufficient resources
Inadequate training
Fatigue
Information overload

National Limitations

Factors combine and increase pressure


Individual Factors

• Information that matches beliefs

• Process information selectively

• Justify desired conclusions

• Rely on stereotypes

• Be aware of biases and blindspots


Solutions

• Be aware of blindspots

• Apply different frames of reference

• Imagine improbable outcomes

• Imagine unpopular outcomes

• Listen to different view points

• Use theories and models


Solutions

• Question set

– Do I have trouble defining what would be a failure for this decision?

– Would failure in this project radically change the way I think of myself as a manager?

– If I took over this job for the first time, would I get rid of this project?


Work Group Factors

• Group Think – strive for concurrence

• Group Polarization – group takes on more risky decision than each member

– Solutions:
– Encourage openness
– Reduce social pressures
– Frank discussion


Organizational

• Bureaucratic culture and information dispersion

• Responsibility is defined

• Actions are taken

• Compartmentalization

• Information Sharing

• Solutions?


Organization

• Solutions:
– Actively seek out new information
– Seek out discrepant information
– Share information
– Share responsibility
– Reward information sharing
– Welcome new ideas or alternative interpretations


Organizational

• Information Dispersion

• Consequence of organizational structure

• Detection more difficult

Group 1

Group 2

Group 3

Group 4

Partial info


Organizational

Hierarchy, Isolationist

Egalitarian, Individualist


Organization

• Solutions:
– Redundancy in information

• Factors: $


Solutions

• Cultivate a Safety-oriented Information Culture


Solutions:

• High Reliability Organizations (HRO)

• Five information priorities
– Possibility of failure
– Avoiding failure
– Encourage error reporting
– Analyze experiences of near misses
– Resist complacency


HRO

• Also:
– Recognize complexity rather than simplify
– Seek more complete picture
– Attentive to operations at front line
– Notice anomalies early
– Track anomalies
– Develop capabilities to detect, contain, bounce back from errors


Safety and Health

We believe that all injuries and occupational illnesses, as well as safety and environmental incidents, are preventable, and our goal for all of them is zero. The site has earned the "Sentinels of Safety" award from the Mine Safety and Health Administration 17 times!


• Sentinels of Safety

• Based on accident rates

Mine Safety and Health Administration


HRO

• Thoughts?

• Who is in a HRO?