simon-townsend
Safety 4900 CMSU
Outline
• Definitions
• Complexity and catastrophe
• Looking at Systems
• Risks versus benefits
• Conclusions.
Normal Accidents
• Normal accidents in a particular system may be common or rare ("It is normal for us to die, but we only do it once."), but the system's characteristics make it inherently vulnerable to such accidents, hence their description as "normal."
Failures
• Discrete Failures
– A single, specific, isolated failure is referred to as a "discrete" failure.
Redundant Systems
• Redundant sub-systems provide a backup, an alternate way to control a process or accomplish a task, that will work in the event that the primary method fails. This avoids the "single-point" failure modes.
Interactive Complexity
• A system in which two or more discrete failures can interact in unexpected ways is described as "interactively complex." In many cases, these unexpected interactions can affect supposedly redundant sub-systems. A sufficiently complex system can be expected to have many such unanticipated failure mode interactions, making it vulnerable to normal accidents.
Tight Coupling
• The sub-components of a tightly coupled system have prompt and major impacts on each other. If what happens in one part has little impact on another part, or if everything happens slowly (in particular, slowly on the scale of human thinking times), the system is not described as "tightly coupled." Tight coupling also raises the odds that operator intervention will make things worse, since the true nature of the problem may well not be understood correctly.
Normal Accident
• Interactive complexity and tight coupling
– Unexpected interactions will occur
– An accident will result
• Premise: "normal" refers to the characteristics of the system, not to the frequency of accidents.
NASA View
• NASA nominally works with the theory that accidents can be prevented through good organizational design and management.
• Normal accident theory suggests that in complex, tightly coupled systems, accidents are inevitable.
• There are many activities underway to strengthen our safety posture.
• NASA’s new thrust in the analysis of close-calls provides insight into the unplanned and unimaginable.
To defend against normal accidents, we must understand the complex interactions of our programs, analyze close-calls and
Definitions
• Complexity – levels of system and organization.
• Coupling - how closely the systems interact.
• Redundant pathway – Backup system that would prevent accidents.
• High Risk – Event with catastrophic potential.
Definitions
• Discrete Failures – failures of isolated single systems
• Interactive Complexity – two or more discrete failures interacting in unexpected ways
Questionnaire…
1. Human error results in most accidents
2. Mechanical failure is the highest cause of accidents.
3. The environment impacts the accident.
4. Design of the system is the most important prevention.
5. Procedures are most important.
Answers
• Eighty percent of Accidents are caused by human error.
High Risk Systems
• Creating Systems
• Organizations
• Sub-Organizations
– Understanding how they interact?
– Understand the risk?
Three Mile Island
• Four Distinct Failures
– Cooling system
– Valves closed
– Pilot Operated Relief Valve sticks open
– False indicators
– These four occurred in 13 seconds
Three Mile Island
• The Hydrogen Bubble:
– Hydrogen produced from zirconium
– Occurred 33 hours into the accident
– Overpressure was ½ the design strength
Risk & Benefits
• Benefit of Understanding
– Reduce dangers: could TMI happen again?
– Remove the dangers
• Better operator training (Three E’s)
• More quality control
• Effective regulation
High Risk Systems
• Operating Experience – Not sufficient
• Construction – pressure to build
• Safer Designs = less vulnerability?
• Defense in Depth (nuclear term)
Characteristics
• High-Risk Technologies Characteristics (beyond the toxic, explosive dangers)
–Complexity
–Coupling
Definitions
• Complexity: A system in which two or more discrete failures can interact in unexpected ways is described as "interactively complex." In many cases, these unexpected interactions can affect supposedly redundant sub-systems. A sufficiently complex system can be expected to have many such unanticipated failure mode interactions, making it vulnerable to normal accidents.
Coupling
• Coupling: The sub-components of a tightly coupled system have prompt and major impacts on each other. If what happens in one part has little impact on another part, or if everything happens slowly (in particular, slowly on the scale of human thinking times), the system is not described as "tightly coupled." Tight coupling also raises the odds that operator intervention will make things worse, since the true nature of the problem may well not be understood correctly.
High Complexity
• X fails, Y was out of order
• Interaction: unexpected outcome
Piper Alpha
High Complexity
• X fails, Y fails, Z was out of order
• Interaction: unexpected outcome
Bhopal
Learning from Mistakes?
• Numerous examples given.
• High Risk systems still in use
• Still at risk?
• How do we evaluate this?
Complexity
• Low Complexity – (Linear systems, near linear)
– Result: Accident will not spread or be as serious.
High Complexity Systems
• Not all Interactions known
• Some failure points not identified
Low Complex Characteristics
• Low Complexity (Organization)
– Additional resources available
– Time to spare
– Other ways to accomplish task
High Complexity - Organizations
• Large organization
– Slow to act
– Complex Systems
– Interconnection
– Contradictions
CMM
• Definition – Capability Maturity Model
• Reference
• Handout
High Complexity - CMM
Level             Focus                          Key Process
5 Optimizing      Continuous improvement         Innovation, FB&I
4 Quantitative    Quantitative management        Focus on data
3 Defined         Process standardized           Training, projects, etc.
2 Repeatable      Basic projects
1 Initial         Competent people and heroes
Coupling
• Coupling (High)
– Processes happen fast
– Can’t be turned off
– Failed parts can’t be isolated
– No other way to keep production going safely
High Coupling - decisions
• Reluctant to shut down
• $ is driver??
• Politics?
• Production?
• Unable to shut down process
• Cost to shut down
• Pressure to shut down
• Damage to shut down
Cost of Shut Down
$300 million to shut down a nuclear power plant
License good for 40 years only
Coupling
• Coupling Results:
– Recovery is not possible
– Disturbance spreads quickly
– Irretrievable results
– Operator action may make it worse
High Complexity and Coupling
• Examples:
• Nuclear Power Plants
• Laboratories
• Industrial Processes
Complex and Linear Interactions
[Diagram: System 1 and its two sub-systems. An event disrupts both systems simultaneously; one effect is visible, the other invisible.]
Complex and Linear Interactions
[Diagram: System 1 and its two sub-systems, with known and unknown interactions. Hidden interactions have no direct measurement.]
Linear Systems
Three Characteristics:
1. Tend to be spread out
2. People tend to have less specialized jobs
3. Feedback systems are local (not global)
Interactions
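[Figure: interaction/coupling chart. Horizontal axis: interactions, from linear to complex. Vertical axis: coupling, from loose to tight. The four quadrants are numbered 1 to 4. Systems plotted: power grids, nuclear plant, assembly line production, military, R&D, rail transport, dam, airways, DNA, trade schools, marine transport, single-goal agencies, mining.]
Normal Accidents, page 97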
Interactions
[Figure: interaction/coupling chart (interactions: linear to complex; coupling: loose to tight; quadrants 1 to 4) plotting power grids, nuclear plant, assembly line production, military, R&D, rail transport, dam, airways, DNA, trade schools, marine transport, single-goal agencies, and mining, annotated with "Increasing Risk" and "Time Dependent".]
McCabe’s Complexity
Cyclomatic Complexity    Risk Evaluation
1-10                     a simple program, without much risk
11-20                    more complex, moderate risk
21-50                    complex, high risk program
greater than 50          untestable program (very high risk)
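The risk bands in this table amount to a simple threshold lookup. A minimal sketch in Python, assuming the band boundaries exactly as listed; the function name `risk_evaluation` is illustrative and does not come from any library:

```python
def risk_evaluation(cyclomatic_complexity: int) -> str:
    """Map a cyclomatic complexity value to the risk band from the table."""
    if cyclomatic_complexity < 1:
        # The table starts at 1; smaller values are treated as invalid here.
        raise ValueError("cyclomatic complexity must be at least 1")
    if cyclomatic_complexity <= 10:
        return "a simple program, without much risk"
    if cyclomatic_complexity <= 20:
        return "more complex, moderate risk"
    if cyclomatic_complexity <= 50:
        return "complex, high risk program"
    return "untestable program (very high risk)"


if __name__ == "__main__":
    for value in (7, 15, 42, 80):
        print(value, "->", risk_evaluation(value))
```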
Other Factors
Complexity Measurement          Primary Measure of
Halstead Complexity Measures    Algorithmic complexity, measured by counting operators and operands
Henry and Kafura metrics        Coupling between modules (parameters, global variables, calls)
Bowles metrics                  Module and system complexity; coupling via parameters and global variables
Troy and Zweben metrics         Modularity or coupling; complexity of structure (maximum depth of structure chart); calls-to and called-by
Ligier metrics                  Modularity of the structure chart
Table 5: Other Facets of Complexity
Index Score
Name of technology: Cyclomatic Complexity
Application category: Test (AP.1.4.3); Reapply Software Lifecycle (AP.1.9.3); Reverse Engineering (AP.1.9.4); Reengineering (AP.1.9.5)
Quality measures category: Maintainability (QM.3.1); Testability (QM.1.4.1); Complexity (QM.3.2.1); Structuredness (QM.3.2.3)
Computing reviews category: Software Engineering Metrics (D.2.8); Complexity Classes (F.1.3); Tradeoffs Among Complexity Measures (F.
McCabe’s Equation
M = E − N + 2P
where
M = cyclomatic complexity
E = the number of edges of the graph
N = the number of nodes of the graph
P = the number of connected components.
"M" is alternatively defined to be one larger than the number of decision points (if/case-statements, while-statements, etc.) in a module (function, procedure, chart node, etc.), or more generally a system.
Separate subroutines are treated as being independent, disconnected components of the program's control flow graph.
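McCabe's equation M = E − N + 2P can be checked on a small control-flow graph. The sketch below is an illustration only: the function name, the node labels, and the choice of a union-find pass to count connected components are assumptions, not part of the slides.

```python
def cyclomatic_complexity(nodes, edges):
    """Return M = E - N + 2P for a control-flow graph.

    nodes: iterable of hashable node labels
    edges: iterable of (source, destination) pairs; both ends must be in nodes
    """
    nodes = list(nodes)
    edges = list(edges)

    # Union-find over the nodes to count connected components (P),
    # ignoring edge direction.
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for src, dst in edges:
        parent[find(src)] = find(dst)

    components = len({find(n) for n in nodes})
    return len(edges) - len(nodes) + 2 * components


if __name__ == "__main__":
    # A single if/else: one decision point, so M should be 1 + 1 = 2.
    if_else_nodes = ["test", "then", "else", "exit"]
    if_else_edges = [
        ("test", "then"),
        ("test", "else"),
        ("then", "exit"),
        ("else", "exit"),
    ]
    print(cyclomatic_complexity(if_else_nodes, if_else_edges))  # prints 2
```

The if/else graph has 4 edges, 4 nodes, and 1 component, so M = 4 − 4 + 2 = 2, which agrees with the alternative definition of one more than the number of decision points.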
Determining Coupling Score -Population
Population    Score
1-1000        1
1000-2000     2
2000-4000     3-4
5000+         5
Cp = Population/1000 + Nodes
Coupling & Complexity Score
Condition (Nodes)    Score
1-2                  1
2-5                  2
5-10                 3
10-20                4
McCabe’s Model for complexity
Complex = Cyclomatic + CMM + System
CMM Table
Level 1: Chaotic
Level 2: Repeatable
Level 3: Defined
Level 4: Managed
Level 5: Optimized
Complexity Score - Capability Maturity Model
CMM                                            Score
Initial                                        5
Repeatable – Basic Project Management          4
Defined – standard                             3
Quantitative                                   2
Optimizing – Continuous Process Improvement    1
McCabe’s Model for complexity
Complex = Cyclomatic + CMM + System
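The CMM score runs opposite to the maturity level: the least mature organization (Initial) contributes the most to the complexity total. A small sketch of that lookup, assuming the five level names used on this slide; `CMM_SCORE` and `cmm_score` are hypothetical names chosen for illustration:

```python
# Score contributed to the complexity total by each CMM level,
# as listed in the table (level 1 "Initial" scores highest).
CMM_SCORE = {
    "initial": 5,
    "repeatable": 4,
    "defined": 3,
    "quantitative": 2,
    "optimizing": 1,
}


def cmm_score(level_name: str) -> int:
    """Look up the complexity score for a CMM level name (case-insensitive)."""
    key = level_name.strip().lower()
    if key not in CMM_SCORE:
        raise KeyError(f"unknown CMM level: {level_name!r}")
    return CMM_SCORE[key]


if __name__ == "__main__":
    for name in ("Initial", "Defined", "Optimizing"):
        print(name, "->", cmm_score(name))
```

With the levels numbered 1 (Initial) to 5 (Optimizing), the table is equivalent to score = 6 − level.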
Complexity Score – System Levels
System Level                                                Score
Disruption to single element (valve)                        1
Disruption to single unit (steam generator) (subsystems)    2
Inclusion of secondary systems                              3-4
Entire system                                               5
McCabe’s Model for complexity
Complex = Cyclomatic + CMM + System
Complexity Score – Linearity
Linearity Level                                                                      Score
Linear – production line, easy to detect failure, easy to see result                 1
Near linear – production with branching                                              2
Proximity of sources/systems to indirect systems; branching with some FB systems     3-4
Non-linear, indirect systems                                                         5
McCabe’s Model for complexity
Complex = Cyclomatic + CMM + System
Complexity Score – System Levels
McCabe’s Model for complexity
Complex = Cyclomatic + CMM + System
Example:
Coupling = (5000/1000) + 4 (nodes) = 9
Complexity = Nodes + CMM + System + Linearity = 4 + 3 + 2 + 4 = 13
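The worked example can be reproduced in a few lines. This is a sketch under stated assumptions: the slides write the model as Cyclomatic + CMM + System, while the example sums a node score, a CMM score, a system score, and a linearity score, so the code follows the example. The function and parameter names are illustrative.

```python
def coupling_score(population: int, nodes: int) -> float:
    """Cp = Population/1000 + Nodes, as given on the population slide."""
    return population / 1000 + nodes


def complexity_score(nodes: int, cmm: int, system: int, linearity: int) -> int:
    """Sum of the four component scores used in the worked example."""
    return nodes + cmm + system + linearity


if __name__ == "__main__":
    # Values taken from the example on the slide.
    print(coupling_score(population=5000, nodes=4))                   # prints 9.0
    print(complexity_score(nodes=4, cmm=3, system=2, linearity=4))    # prints 13
```

The resulting pair (complexity 13, coupling 9) is the kind of point that could be placed on a complexity versus coupling chart.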
Recommendations
[Figure: policy recommendations chart. Horizontal axis: complexity (2 to 20). Vertical axis: coupling (1 to 10). Regions: Tolerate & Improve, Restrict, Abandon. System plotted: nuclear energy.]
Page 349, Normal Accidents
Recommendations
[Figure: policy recommendations chart. Horizontal axis: complexity (2 to 20). Vertical axis: coupling (1 to 10). Regions: Tolerate & Improve, Restrict, Abandon. Systems plotted: nuclear energy, marine transport, DNA research, dams, mining, space, flying.]
Normal Accidents, page 349
Recommendations
[Figure: policy recommendations chart. Horizontal axis: complexity (2 to 20). Vertical axis: coupling (1 to 10). Regions: Tolerate & Improve, Restrict, Abandon. Systems plotted: nuclear energy, marine transport, DNA research, dams, mining, space, flying.]
What can be changed? Complexity or Coupling?
Recommendations
[Figure: policy recommendations chart. Horizontal axis: net catastrophic potential (2 to 20). Vertical axis: cost of alteration (1 to 10). Regions: Tolerate & Improve, Restrict, Abandon. Systems plotted: nuclear energy, marine transport, DNA research, dams, mining, space, flying.]
Usage
• Different Approach to Risk
• Method for Management Systems
• Analysis of what systems need attention
Warning Signs
• Are there warning signs prior to each accident?
– NASA
– Chernobyl
– Three Mile Island
Warning Signs
• System dependent?
• High or low Complexity?
• Low or High technology?
Incubation Period: Warning Signs
Events go unnoticed or misunderstood because:
• Erroneous assumptions
• Difficulties in handling information in complex situations
• Ambiguous regulation
• Reluctance to fear the worst outcome
Disaster Model
Epistemic blind spots
Risk Denial
Structural Impediments
Warning signals
Signals not processed or not acted on
Problem build-up
Large-scale failure
Failures - Epistemic
• Tend to favor Information that confirms their beliefs
• Rather than consider how they should change
Failures - Epistemic
• Result:
– Ignore information
– Question reliability
– Re-interpret its significance
Failure - Organizational
• Follow a justification approach
• Incontestable beliefs
• Ignore information that contradicts
• Rarely abandon beliefs in favor of others
Case Studies
• VIOXX
– Introduced in 1999
– Excessive risk of heart attack in 2000
– Caution flag on cardiovascular risk in 2001
– Withdrawn in 2004.
Structural Impediments
• Warning Recognized
• Structural Rules, roles, differentiation
• Result: Information incomplete
Response ineffective
Structural Impediments
• Differentiation
– Horizontal (stovepiping): no talk across organizations or divisions
• Diffuse – no clear owner, focus on their part
– Vertical: defer to others up or down
• Strong hierarchy
Accident Causation Model
Organization
Workplace
Individual
Culture
Decision processes
Time pressure
Insufficient resources
Inadequate training
Fatigue
Information overload
National Limitations
Factors combine and increase pressure
Individual Factors
• Information that matches beliefs
• Process information selectively
• Justify desired conclusions
• Rely on stereotypes
• Be aware of biases and blindspots
Solutions
• Be aware of blindspots
• Apply different frames of reference
• Imagine improbable outcomes
• Imagine unpopular outcomes
• Listen to different view points
• Use theories and models
Solutions
• Question set
– Do I have trouble defining what would be a failure for this decision?
– Would failure in this project radically change the way I think of myself as a manager?
– If I took over this job for the first time, would I get rid of it?
Work Group Factors
• Group Think – strive for concurrence
• Group Polarization – group takes on more risky decision than each member
• Solutions:
– Encourage openness
– Reduce social pressures
– Frank discussion
Organizational
• Bureaucratic culture and information dispersion
• Responsibility is defined
• Actions are taken
• Compartmentalization
• Information Sharing
• Solutions?
Organization
• Solutions:
– Actively seek out new information
– Seek out discrepant information
– Share information
– Share responsibility
– Reward information sharing
– Welcome new ideas or alternative interpretations
Organizational
• Information Dispersion
• Consequence of organizational structure
• Detection more difficult
Group 1
Group 2
Group 3
Group 4
Partial info
Solutions:
• High Reliability Organizations (HRO)
• Five information priorities
– Possibility of failure
– Avoiding failure
– Encourage error reporting
– Analyze experiences of near misses
– Resist complacency
HRO
• Also:
– Recognize complexity rather than simplify
– Seek a more complete picture
– Attentive to operations at the front line
– Notice anomalies early
– Track anomalies
– Develop capabilities to detect, contain, and bounce back from errors
Safety and Health
We believe that all injuries and occupational illnesses, as well as safety and environmental incidents, are preventable, and our goal for all of them is zero. The site has earned the "Sentinels of Safety" award from the Mine Safety and Health Administration 17 times!
• Sentinels of Safety
• Based on accident rates
Mine Safety and Health Administration