Software Safety CS3300 Fall 2015

Failures are costly
● Bhopal 1984 – 3000 dead and 200000 injured
● Therac-25 1987 – 6 dead
● Chernobyl / Three Mile Island
● Challenger Shuttle

Culture
● The general attitude and approach to safety reflected by those who participate in that industry.
● Accidents come from:
  ● Overconfidence and complacency
    – Three Mile Island – faith in equipment
    – Shuttle – routine operations
  ● Disregard or low priority for safety
    – Bhopal – training cuts for staff
  ● Flawed resolution of conflicting goals
    – Challenger – pressure to launch vs. safety

System Safety
● Build in safety, not add it on later
● Consider system as a whole, not subsystems
● Take larger view of hazards than just failures
● Analysis over experience and standards
● Qualitative over quantitative approach
● Tradeoffs and conflicts are important in design
● System Safety is more than systems engineering

Definitions
● Reliability: probability that a piece of equipment will perform its intended function satisfactorily for a prescribed time under stipulated conditions
● Failure: nonperformance or inability of the system to meet its intended function for a specified time under specified conditions
● Error: design flaw or deviation from intended state
● Accident: undesired and unplanned (but maybe not unexpected) event that results in loss

More definitions
● Hazard: set of conditions that together with the environment will lead to an accident
● Risk: hazard level combined with likelihood of accident (danger) and hazard duration (latency)
● Safety: freedom from accident or loss

Models of Accident Analysis

● Domino Theory
  ● Environment > Person > Act > Accident > Injury
● National Safety Council
  ● background factors > initiating factors > intermediate factors > immediate factors > measurable results
  ● unsafe condition > agent of accident > increase in potential > unsafe act > injury
● Chain of Events – single event sets off path to accident

Fault Tree Analysis
[Figure: fault tree with top event "Wrong treatment for patient"]
● Wrong treatment for patient (OR)
  ● Vital signs show incorrect state
  ● Correct state but untimely reaction (OR)
    ● Measurement frequency too low (OR)
      – Computer does not read within time limit
      – Human sets frequency too low
    ● Computer fails to raise alarm
    ● Vital signs not reported
    ● Nurse does not respond
● Lowest-level AND gate: Sensor failure AND Human error on input
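A fault tree like the one above can be evaluated mechanically once the gates are written down. The following is a minimal Python sketch, not a definitive tool: the `Gate` class and `occurs` function are hypothetical names, and attaching the AND gate (sensor failure, human error on input) to "vital signs not reported" is an assumption made for the sketch, since the figure does not survive clearly enough to confirm it.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Gate:
    """An intermediate event whose occurrence depends on its children."""
    name: str
    kind: str  # "AND" or "OR"
    children: List[Union["Gate", str]]  # sub-gates or basic-event names


def occurs(node: Union[Gate, str], basic_events: Dict[str, bool]) -> bool:
    """Return True if the event at `node` occurs for the given basic events."""
    if isinstance(node, str):
        # Basic events that are not listed are treated as not having occurred.
        return basic_events.get(node, False)
    results = [occurs(child, basic_events) for child in node.children]
    return all(results) if node.kind == "AND" else any(results)


# ASSUMPTION: the AND gate is placed under "vital signs not reported".
vitals_not_reported = Gate(
    "vital signs not reported", "AND",
    ["sensor failure", "human error on input"],
)
frequency_too_low = Gate(
    "measurement frequency too low", "OR",
    ["computer does not read within time limit", "human sets frequency too low"],
)
untimely_reaction = Gate(
    "correct state but untimely reaction", "OR",
    [frequency_too_low, "computer fails to raise alarm",
     vitals_not_reported, "nurse does not respond"],
)
wrong_treatment = Gate(
    "wrong treatment for patient", "OR",
    ["vital signs show incorrect state", untimely_reaction],
)
```

Under these assumptions, `occurs(wrong_treatment, {"sensor failure": True})` is `False` because the AND gate needs both of its inputs, while `occurs(wrong_treatment, {"nurse does not respond": True})` is `True` because a single input is enough to pass through the OR gates.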

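Event Tree
[Figure: event tree for initiating event "Pressure too high"]
● Pressure too high
  ● Relief valve 1 opens > Pressure decrease
  ● Relief valve 1 fails
    ● Relief valve 2 opens > Pressure decrease
    ● Relief valve 2 fails > Explosion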
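Where a fault tree works backward from an accident, an event tree works forward from an initiating event. The sketch below enumerates the three paths of the relief-valve tree and, as one possible use, combines per-valve failure probabilities into outcome probabilities. The function names are hypothetical, the probabilities passed in are illustrative inputs rather than real data, and the arithmetic assumes the two valves fail independently.

```python
from typing import Dict, List, Tuple


def event_tree_paths() -> List[Tuple[Tuple[str, ...], str]]:
    """Each path through the tree as (sequence of branch events, outcome)."""
    return [
        (("relief valve 1 opens",), "pressure decrease"),
        (("relief valve 1 fails", "relief valve 2 opens"), "pressure decrease"),
        (("relief valve 1 fails", "relief valve 2 fails"), "explosion"),
    ]


def outcome_probabilities(p_fail_1: float, p_fail_2: float) -> Dict[str, float]:
    """Probability of each outcome given the initiating event has occurred.

    ASSUMPTION: valve failures are independent. Common-cause failures
    (for example, both valves corroded) would make this optimistic.
    """
    return {
        "pressure decrease": (1 - p_fail_1) + p_fail_1 * (1 - p_fail_2),
        "explosion": p_fail_1 * p_fail_2,
    }
```

For example, `outcome_probabilities(0.1, 0.5)` gives an explosion probability of 0.1 × 0.5 = 0.05 under the independence assumption, and the two outcome probabilities sum to 1.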

Software Safety
● Requirements Complete/Consistent
● Behavior deterministic
● Robustness
  ● Every state must have a transition for every possible event
  ● The conditions on the transitions out of a state must together form a tautology (some transition is always enabled)
  ● Behavior specified for timeouts (no input for some period)
● HCI Criteria
● Safety critical outputs checked for reasonableness
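The robustness criterion that every state has a transition for every possible event, including a timeout, can be checked automatically against a transition table. The following is a minimal sketch; the state names, event names, and the `missing_transitions` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def missing_transitions(
    states: List[str],
    events: List[str],
    transitions: Dict[Tuple[str, str], str],
) -> List[Tuple[str, str]]:
    """Return every (state, event) pair with no specified transition."""
    return [(s, e) for s in states for e in events if (s, e) not in transitions]


# Hypothetical patient-monitor state machine. "timeout" is modeled as an
# ordinary event so that the absence of input also has specified behavior.
STATES = ["idle", "monitoring", "alarm"]
EVENTS = ["start", "reading_ok", "reading_out_of_range", "timeout"]
TRANSITIONS = {
    ("idle", "start"): "monitoring",
    ("idle", "reading_ok"): "idle",
    ("idle", "reading_out_of_range"): "idle",
    ("idle", "timeout"): "idle",
    ("monitoring", "start"): "monitoring",
    ("monitoring", "reading_ok"): "monitoring",
    ("monitoring", "reading_out_of_range"): "alarm",
    # ("monitoring", "timeout") is deliberately left out to show a gap.
    ("alarm", "start"): "alarm",
    ("alarm", "reading_ok"): "monitoring",
    ("alarm", "reading_out_of_range"): "alarm",
    ("alarm", "timeout"): "alarm",
}
```

Here `missing_transitions(STATES, EVENTS, TRANSITIONS)` reports the single gap `("monitoring", "timeout")`: the specification does not say what happens if readings stop arriving while monitoring.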

● All specified states must be reachable from start state
● No paths to unplanned hazardous states
● Every hazardous state has a path to a safe state
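Two of the criteria above, reachability from the start state and a path from every hazardous state to a safe state, are plain graph searches. The sketch below uses breadth-first search over a hypothetical state graph; the state names and the `check_state_graph` helper are assumptions made for illustration, and the "unplanned hazardous states" criterion is not covered because it depends on hazard analysis outside the graph.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def reachable(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """All states reachable from `start` (including `start`) by BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in graph.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def check_state_graph(
    graph: Dict[str, List[str]], start: str, hazardous: Set[str], safe: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (unreachable states, hazardous states with no path to safety)."""
    states = set(graph) | {t for targets in graph.values() for t in targets}
    unreachable = states - reachable(graph, start)
    stuck = {h for h in hazardous if not (reachable(graph, h) & safe)}
    return unreachable, stuck


# Hypothetical graph: "orphan" cannot be reached, "runaway" cannot be left.
GRAPH = {
    "start": ["monitoring"],
    "monitoring": ["overpressure", "runaway"],
    "overpressure": ["monitoring"],
    "runaway": [],
    "orphan": ["monitoring"],
}
```

With `hazardous={"overpressure", "runaway"}` and `safe={"monitoring"}`, the check flags `orphan` as unreachable and `runaway` as a hazardous state with no way back to safety.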

Safety Design
● Hazard elimination (substitution, simplification, decoupling, human error, materials)
● Hazard reduction (controllability, safety factors, redundancy, recovery)
● Hazard control (limit exposure, isolation, protection system)
● Damage minimization

New Model

● STAMP: Systems-Theoretic Accident Model and Processes
● STPA: System-Theoretic Process Analysis

STPA Step 1
● Identify potential for inadequate control of the system
  ● A control action required for safety is not provided or not followed
  ● An unsafe control action is provided
  ● A potentially safe action is provided at the wrong time
  ● A control action required for safety is stopped too soon or applied too long
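In practice, step 1 amounts to asking the four questions above about every control action in the system. One way to make sure none is skipped is to generate the worksheet rows up front, as in this minimal sketch. The enum, the `worksheet` function, and the example control actions are hypothetical; the analysis itself (deciding whether each row is actually hazardous) remains a human task.

```python
from enum import Enum
from typing import List, Tuple


class UnsafeControlType(Enum):
    """The four kinds of inadequate control examined in STPA step 1."""
    NOT_PROVIDED = "required for safety but not provided or not followed"
    UNSAFE_PROVIDED = "provided, and providing it is unsafe"
    WRONG_TIMING = "potentially safe but provided at the wrong time"
    WRONG_DURATION = "stopped too soon or applied too long"


def worksheet(control_actions: List[str]) -> List[Tuple[str, UnsafeControlType]]:
    """Pair every control action with every kind of inadequate control."""
    return [(action, kind) for action in control_actions for kind in UnsafeControlType]


if __name__ == "__main__":
    # Hypothetical control actions for a pressure-relief controller.
    for action, kind in worksheet(["open relief valve", "raise alarm"]):
        print(f"{action}: {kind.value}")
```

Two control actions yield eight rows to examine, four per action.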

STPA Step 2
● Determine how each action in step 1 could occur
  ● Examine parts of control loop. Design controls and measures. For multiple controllers, identify conflicts and coordination problems
  ● Consider how controls could degrade over time and build in protection
    – Management of procedures
    – Audits
    – Accident and incident analysis

NASA 10 Rules of Development

Language choice: not one of the rules, but very important. Need mature compilers with extensive code checking.
1. Restrict all code to very simple control structures. No goto, setjmp/longjmp, or direct or indirect recursion.
2. All loops must have a fixed upper bound. It must be possible to prove that a loop terminates. The exception is loops that are not meant to terminate; for those the inverse must hold: it must be provable that the loop cannot terminate.

NASA 10 rules
3. No dynamic memory allocation after initialization. Garbage collectors and malloc can cause unpredictable issues.
4. No function longer than can be printed on a single sheet of paper (roughly 60 lines).
5. Code should average 2 assertions per function. Assertions must be side-effect free.
6. Data items must be declared at the smallest possible scope.
7. Return value of non-void functions must be checked.

NASA 10 rules
8. Use of the preprocessor must be limited to header inclusion and simple macros.
9. Use of pointers should be limited. No more than one level of dereferencing is allowed. Function pointers are not permitted.
10. Code must be compiled with all compiler warnings enabled at the most pedantic setting. Code must not have any warnings. Code must be checked by 2 different code analyzers and pass with no warnings.
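The rules above are written for C, but the idea behind rule 2 (a fixed upper bound on every loop) carries over to any language. The following Python sketch is only an illustration of that idea, not part of the rules: `Node`, `list_length`, and `MAX_NODES` are hypothetical names. Walking a linked list with an explicit bound means a corrupted (cyclic) list produces a detectable error instead of an endless loop.

```python
from dataclasses import dataclass
from typing import Optional

MAX_NODES = 1000  # hypothetical bound, chosen for illustration only


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None


def list_length(head: Optional[Node], max_nodes: int = MAX_NODES) -> int:
    """Count nodes in a linked list, giving up after `max_nodes` nodes."""
    assert max_nodes >= 0, "bound must be non-negative"
    count = 0
    node = head
    # The loop runs at most max_nodes + 1 times, so it always terminates,
    # even if the list has been corrupted into a cycle.
    for _ in range(max_nodes + 1):
        if node is None:
            return count
        node = node.next
        count += 1
    raise RuntimeError("list longer than max_nodes or cyclic")
```

A two-node list returns 2. If the second node is made to point back at the first, the call raises `RuntimeError` once the bound is exceeded rather than spinning forever.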